Adoption of Twitter's New Length Limit: Is 280 the New 140?
Kristina Gligorić
Ashton Anderson
University of [email protected]
Robert West
Abstract
In November 2017, Twitter doubled the maximum allowed tweet length from 140 to 280 characters, a drastic switch on one of the world’s most influential social media platforms. In the first long-term study of how the new length limit was adopted by Twitter users, we ask: Does the effect of the new length limit resemble that of the old one? Or did the doubling of the limit fundamentally change how Twitter is shaped by the limited length of posted content? By analyzing Twitter’s publicly available 1% sample over a period of around 3 years, we find that, when the length limit was raised from 140 to 280 characters, the prevalence of tweets around 140 characters dropped immediately, while the prevalence of tweets around 280 characters rose steadily for about 6 months. Despite this rise, tweets approaching the length limit have been far less frequent after than before the switch. We find widely different adoption rates across languages and client-device types. The prevalence of tweets around 140 characters before the switch in a given language is strongly correlated with the prevalence of tweets around 280 characters after the switch in the same language, and very long tweets are vastly more popular on Web clients than on mobile clients. Moreover, tweets of around 280 characters after the switch are syntactically and semantically similar to tweets of around 140 characters before the switch, manifesting patterns of message squeezing in both cases. Taken together, these findings suggest that the new 280-character limit constitutes a new, less intrusive version of the old 140-character limit. The length limit remains an important factor that should be considered in all studies using Twitter data.
On 7 November 2017, Twitter suddenly and unexpectedly increased the maximum allowed tweet length from 140 to 280 characters, thus altering its signature feature. According to Twitter, this change, which we henceforth refer to as “the switch”, was introduced to give users more space to express their thoughts, as a disproportionately large fraction of tweets had been exactly 140 characters long (Rosen 2017a; Gligorić, Anderson, and West 2018). Understanding the consequences of the switch is of paramount importance for social media studies, for two reasons: first, because Twitter is one of the leading social media platforms (Perrin and Anderson 2019), with content posted there reaching and affecting billions of people across the globe; and second, because the constraints that a medium imposes affect the audience not only through the content delivered over the medium, but also through the characteristics of the medium itself, or, in a mantra coined by Marshall McLuhan (1964), “the medium is the message”. Thus, anybody in whose research Twitter plays a role cannot ignore the switch, and must consider how the new length limit has impacted Twitter as a platform.

Early work that studied Twitter users’ attitudes toward the new 280-character limit (Rimjhim and Chakraborty 2018) discovered varying initial reactions ranging from anticipation, surprise, and joy to anger, disappointment, and sadness. Early studies also revealed a low initial prevalence of long tweets immediately following the switch (Perez 2017) and studied the short-term impact that the length limit had on linguistic features and engagement (Gligorić, Anderson, and West 2018; Boot et al. 2019), also in the specific context of political tweets (Jaidka, Zhou, and Lelkes 2019).

Whereas the above-cited early studies necessarily had to consider short-term effects, much less is known about the long-term effects of the switch. Now, nearly three years after the switch, the present paper constitutes the first attempt to bridge this gap with a long-term study spanning several years. Broadly, our research was guided by the question, “Is 280 the new 140?” In other words, does the effect of the new length limit resemble that of the old one, just over a broader range of potential tweet lengths? Or has the doubling of the limit fundamentally changed how Twitter is shaped by the limited length of posted content?
Research questions. Concretely, we address the following research questions:
RQ1: How did tweet length change in the two years following the switch?
RQ2: How did tweet length change across languages?
RQ3: How did tweet length change across devices (Web vs. mobile clients vs. automated sources)?
RQ4: What are the syntactic and semantic characteristics of long vs. short tweets?
Summary of main findings. Using a 1% sample of all tweets spanning the period from 1 January 2017 to 31 October 2019, we conduct an observational study of tweet length over time (RQ1). Once the length limit was doubled and the old 140-character limit became obsolete, we find a decline in the prevalence of tweets of exactly or just under 140 characters, and a rise in the prevalence of tweets of exactly or just under 280 characters, with a smooth adaptation phase of around 6 months.

Figure 1: Monthly tweet length histograms, normalized. (x-axis: number of characters, from 70 to 280; one histogram per month from 2017-1 to 2019-10, with the switch marked between 2017-11 and 2017-12.)

Comparing languages (RQ2), we find vastly different adoption patterns. The prevalence of 140-character tweets before the switch is strongly correlated with the prevalence of 280-character tweets after the switch, indicating that some languages have an inherent affinity to longer messages. Comparing device types (RQ3), tweets above 140 characters are used more on Web clients than on mobile clients. Automated sources and third-party applications were the slowest to adapt. They continue to tweet around 140 characters disproportionately more often than regular users, and in general tend to publish longer tweets.

Finally, we observe that 280-character tweets are syntactically and semantically similar to 140-character tweets posted before the switch (RQ4). The 280-character tweets show linguistic fingerprints indicative of “message squeezing”, with inessential parts of speech (e.g., fillers, adverbs, conjunctions) being relatively less frequent, and essential parts of speech (e.g., verbs, negations) being relatively more frequent, compared to shorter tweets. Their usage is additionally associated with specific topics, such as Money, Death, Work, and Religion.

In a nutshell, although the doubling of the length limit eliminated the drastic disproportion of tweets reaching the maximum length (e.g., 9% of English tweets used to be exactly 140 characters long before the switch), our results demonstrate the emergence of a similar, though considerably weaker, effect around 280 characters after the switch. We hence answer the guiding question, “Is 280 the new 140?”, in a nuanced way: 280 is a less intrusive 140. These findings have important implications for Twitter-based research, as they show that, although the new limit is “felt less” by users than the old limit, 280 characters still constitutes an impactful length constraint that shapes the nature of Twitter.
Twitter communication and supporting features: studies of use and emerging conventions.
Previous work has extensively studied communication taking place on Twitter and the specific features that support it, most importantly: retweets (Boyd, Golder, and Lotan 2010), hashtags (Wikström 2014; Page 2012), quotes (Garimella, Weber, and De Choudhury 2016), and emojis (Pavalanathan and Eisenstein 2016). Previous work has also investigated linguistic conventions on Twitter, the patterns of their emergence (Kooti et al. 2012b; Kooti et al. 2012a), how users align to them in conversations (Doyle, Yurovsky, and Frank 2016), how they diffuse (Centola et al. 2018; Chang 2010), and how they continuously evolve (Cunha et al. 2011).
Variations in usage and adoption of conventions and linguistic style.
Additionally, previous work has studied how patterns of adoption of these features, as well as the linguistic style used on the platform more broadly, vary across numerous dimensions (Shapp 2014), including gender (Ciot, Sonderegger, and Ruths 2013), political leaning (Sylwester and Purver 2015), and age and income (Flekova, Preoţiuc-Pietro, and Ungar 2016), but also within accounts (Clarke and Grieve 2019).
Length limit and the impact of message length on success and linguistic characteristics.
Previous work has studied how the length constraint imposed on Twitter and other microblogging platforms affects dialogues and linguistic style (Zhou and Xu 2019; Jin and Liu 2017), as well as the success of a message, measured through the engagement it receives (Tan, Lee, and Pang 2014; Gligorić, Anderson, and West 2018; Gligorić, Anderson, and West 2019; Wang and Greenwood 2020; Wasike 2013). More broadly, philology, communication, education, and psychology scholars have investigated conciseness and its benefits in many different contexts (Laib 1990; Vardi 2000; Sloane 2003).
Message framing on Twitter.
Many previous studies have investigated the question of what wording makes messages successful in online social media, often formulated as the task of predicting what makes textual content become popular (Berger and Milkman 2012; Guerini, Strapparava, and Özbal 2011; Lamprinidis, Hardt, and Hovy 2018). In the specific case of Twitter, in addition to characterizing how language is used on the platform in general (Murthy 2012; Levinson 2011; Hu, Talamadupula, and Kambhampati 2013; Eisenstein 2013), researchers have investigated the correlation of linguistic signals with the propagation of tweets (Artzi, Pantel, and Gamon 2012; Bakshy et al. 2011; Tan, Lee, and Pang 2014; Pancer and Poole 2016).

Figure 2: Distribution of tweets across languages. Number of original tweets in the 1% sample posted between January 1st 2019 and October 31st 2019, across the 23 studied languages where the switch happened (in orange) and did not happen (in blue).

Figure 3: Daily fraction (indicated with a circle) and 10-day rolling average (solid line) of the fraction of tweets that have between 136 and 140 characters (inclusive). (x-axis: date; y-axis: fraction, 0% to 10%; the switch is marked.)

Figure 4: Daily fraction (indicated with a circle) and 10-day rolling average (solid line) of the fraction of tweets that have between 276 and 280 characters (inclusive). Note the different y-axis scales. (x-axis: date; y-axis: fraction, 0% to 2%; the switch is marked.)
We use the publicly available 1% sample of tweets, spanning the period between 1 January 2017 and 31 October 2019, available on the Internet Archive (https://archive.org/details/twitterstream). We consider original tweets (i.e., we discard retweets). There are between 1 and 1.5 million original tweets per day. Figure 2 shows the total number of tweets per language.

We study the 23 biggest languages: three languages where the switch did not happen (Japanese, Korean, and Chinese) and twenty where it happened, each language with more than 2M tweets in total. The switch did not happen in Japanese, Korean, and Chinese because the 140-character limit was not as restrictive as it was in other languages, since more information can be conveyed with the same number of characters (Rosen and Ihara 2017; Rosen 2017b). We note that the data is sampled at the community level. We restrict ourselves to describing community-level change, as opposed to user-level behaviors, since user-level information is incomplete (in expectation, we have 1% of the tweets posted by a fixed user).

Character counting. We carefully count the number of characters based on the official documentation. Tweet length is counted using the Unicode normalization of the tweet text. The tweet text is selected from the tweet object using the displayed text range information, discarding any retweet tags and leading @user mentions, which are not counted towards the length limit. The text content of a tweet can contain up to 140 characters (or Unicode glyphs) before the switch, or 280 after the switch. An emoji sequence using multiple combining glyphs counts as multiple characters. Glyphs used in Chinese, Japanese, or Korean are counted as one character before, and as two characters after, the introduction of the 280 limit. Therefore, a tweet composed of only CJK text can have a maximum of 140 of these glyphs after the switch.

Figure 5: Histogram of tweet lengths. Tweet length histograms in the 23 studied languages, for the period before the 280 limit was introduced (Jan-Oct 2017, in red), and after it was introduced (Jan-Oct 2019, in blue). The languages are sorted by the prevalence of 140 before the switch. (x-axis: number of characters; y-axis: probability. Panels, in order: Hindi, Urdu, Persian, Turkish, German, Swedish, Dutch, English, Italian, Spanish, Polish, French, Arabic, Russian, Chinese, Indonesian, Estonian, Thai, Portuguese, Korean, Tagalog, Japanese, Haitian Creole.)
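The counting rules described in the character counting paragraph can be sketched in code. The following is a minimal illustration, not the procedure used in this study: it assumes NFC as the normalization form, it uses a small hypothetical set of code point ranges (CJK_RANGES) to decide which glyphs count double after the switch, and it ignores the displayed text range and leading @user mentions. The names tweet_length and is_cjk are introduced here for illustration only.

```python
import unicodedata

# Hypothetical, simplified code point ranges used to flag CJK glyphs.
# This is an assumption for illustration; the official counting rules
# define their own weighted ranges.
CJK_RANGES = (
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul syllables
)


def is_cjk(ch: str) -> bool:
    """Return True if the character falls in one of the assumed CJK ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def tweet_length(text: str, after_switch: bool) -> int:
    """Count characters of the normalized text.

    Before the switch every code point counts as one character; after the
    switch, glyphs in the assumed CJK ranges count as two.
    """
    normalized = unicodedata.normalize("NFC", text)  # NFC is an assumption
    total = 0
    for ch in normalized:
        total += 2 if (after_switch and is_cjk(ch)) else 1
    return total
```

Under these assumptions, a tweet of three CJK ideographs has length 3 before the switch and length 6 after it, which is why a CJK-only tweet still fits at most 140 such glyphs under the 280 limit.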
Official character-counting documentation: https://developer.twitter.com/en/docs/counting-characters

Figure 6: Fraction of tweets that are exactly 280 characters long after the switch, on the x-axis, and fraction of tweets that are exactly 140 characters long before the switch, on the y-axis (Spearman rank correlation 0 . p = . ∗ − ). Each point is one of the 23 studied languages.

First, we measure the prevalence of different tweet lengths over time. We start by inspecting monthly histograms of tweet lengths, shown in Figure 1, where across the 20 studied languages we visualize the distribution in red for the months before the switch, and in blue for the months after.

We observe a sharp decline of 140-character tweets and an increase in 280-character tweets after the switch. Otherwise, the first peak, consistently between 25 and 30 characters, remains unchanged (i.e., the mode of the distribution is stable).

Next, we focus on the two interesting ranges, close to 140 and close to 280, and perform a more granular daily analysis. We monitor the daily fraction of tweets that either hit the character limit or are within a five-character margin of it. The daily fraction of tweets that have between 136 and 140 characters is shown in Figure 3, and the daily fraction of tweets between 276 and 280 characters in Figure 4. Here we again consider original tweets across the 20 studied languages where the 280-character limit was introduced. Gray stripes indicate omitted days with missing data.

We note that while 140-character tweets became less prevalent after the switch, dropping from 7.0% to 1.2% over the studied period, the prevalence of 280-character tweets steadily increased for 6 months after its introduction, reaching 1.5% at the end of the studied period.
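The daily fractions and the 10-day rolling averages plotted in Figures 3 and 4 can be computed along the following lines. This is a sketch under assumptions: the input format (pairs of a date string and a character count) and the function names are hypothetical, and the rolling window is taken to be trailing, which the text does not specify.

```python
from collections import defaultdict


def daily_fraction(tweets, lo, hi):
    """Fraction of tweets per day whose length lies in [lo, hi] (inclusive).

    tweets: iterable of (date_string, n_chars) pairs, with ISO dates so that
    sorting the strings sorts the days chronologically.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, n_chars in tweets:
        totals[day] += 1
        if lo <= n_chars <= hi:
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in sorted(totals)}


def rolling_mean(values, window=10):
    """Trailing rolling average; the first few entries use a shorter window."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Toy example with made-up data, tracking the [136, 140] range.
sample = [
    ("2017-11-01", 140), ("2017-11-01", 50),
    ("2017-11-02", 137), ("2017-11-02", 280),
    ("2017-11-02", 10), ("2017-11-02", 135),
]
fractions = daily_fraction(sample, 136, 140)
smoothed = rolling_mean(list(fractions.values()), window=10)
```

Days with missing data would simply be absent from the dictionary, so a production version would need to decide how to treat such gaps before smoothing.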
Next, we seek to characterize the adoption across the studied languages. We start by examining, in Figure 5, histograms of tweet lengths for Jan-Oct 2017 (in red) as the pre-switch period and for the same months two years later, Jan-Oct 2019 (in blue), when usage had settled, as the post-switch period. Here we restrict ourselves to tweets posted from regular sources (web and mobile interface, rather than third-party applications and automated sources). Similar to the overall view, across languages the first peak and the mode of the distribution are constant, and the interesting character length ranges are near 140 and near 280 characters.

In Figure 6 we show the fraction of tweets that are exactly 280 characters long after the switch, on the x-axis, and the fraction of tweets that are exactly 140 characters long before the switch, on the y-axis.

We observe that the prevalence of 140 before the switch in a language is correlated with the prevalence of 280 after. The more 140 was used in a language before the switch, the more 280 is used after (Spearman rank correlation 0 . p = . ∗ − ). In Hindi and Urdu, 280 is very prevalent, and it is the mode of the distribution, i.e., the most frequent character length after the switch.

In Figure 7, we further monitor the evolution of adoption patterns of the new limit across languages, for tweets posted from web and mobile sources. In most of the languages, the prevalence seems to have settled, and is even decreasing again, i.e., the peak of usage is past. Urdu is a notable exception, where the prevalence is still growing, and the adoption rate is still not in a stable state.

Figure 7: Daily evolution of 280 limit adoption across languages. Daily fraction (indicated with a circle) and 10-day rolling average (solid line) of the fraction of tweets that have between 276 and 280 characters (inclusive). The languages are sorted by prevalence of 140 before the switch. (x-axis: date; y-axis: fraction; one panel per studied language, with the switch marked.)

Differences in differences estimation of the effect of the switch.
Additionally, we take advantage of the fact that the new limit was not introduced in all languages to perform a differences in differences estimation of the effect of the switch on tweet lengths.

To go beyond visual inspection and to account for possible global platform-wide changes that are not associated with the switch, we use a differences in differences regression, where the tweet lengths in Japanese, Korean, and Chinese, the languages where the 280-character limit was not introduced, are the control time series, and the tweet lengths in the other 20 studied languages (Figure 2) are the treated time series. Both are observed in the pre-switch (Jan-Oct 2017) and post-switch (Jan-Oct 2019) periods, as illustrated in Figure 8. We fit a model

$y \sim \mathit{treated} * \mathit{period}$, (1)

where the dependent variable $y$ is the logarithm of the average tweet length for each studied calendar day, and the independent variables are the following two factors: treated (indicates whether or not the switch was introduced in those languages) and period (indicates whether a calendar day is in the pre-switch or post-switch period). Here $\mathit{treated} * \mathit{period}$ is shorthand notation for $\alpha + \beta\,\mathit{treated} + \gamma\,\mathit{period} + \delta\,\mathit{treated}{:}\mathit{period} + \epsilon$, where in turn $\mathit{treated}{:}\mathit{period}$ stands for the interaction of treated and period.

The coefficient $\delta$ of the interaction term $\mathit{treated}{:}\mathit{period}$ is then the effect of the switch on the logarithm of the average tweet length. Each studied pre- or post-switch period spans 277 days per condition, amounting to a total of $4 \times 277 = 1108$ daily observations. Fitting the model from Equation 1, we measure an $e^{\delta} - 1 = 6.16\%$ (95% CI [5.68%, 6.64%]) increase in tweet lengths in the languages where the switch happened, over the control baseline.

To conclude, we estimate a significant increase in tweet lengths in languages where the switch happened, compared to the control languages.
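To make the role of the interaction coefficient concrete, the sketch below computes a differences in differences estimate directly from the four cell means. It relies on a standard property of the saturated two-by-two design: the least-squares estimate of the interaction coefficient equals the difference between the treated and control pre-to-post changes in the mean of the (log) outcome. The function name, the input layout, and the toy numbers are assumptions made for illustration; the sketch does not compute confidence intervals and is not the code used to obtain the reported estimate.

```python
import math
from statistics import fmean


def did_effect(daily_avg_length):
    """Differences in differences on log daily average tweet length.

    daily_avg_length maps (treated, post) -> list of daily average lengths,
    where treated and post are booleans. Returns (delta, relative_effect),
    with relative_effect = exp(delta) - 1.
    """
    # Mean of the log outcome in each of the four cells.
    cell = {
        key: fmean([math.log(v) for v in values])
        for key, values in daily_avg_length.items()
    }
    treated_change = cell[(True, True)] - cell[(True, False)]
    control_change = cell[(False, True)] - cell[(False, False)]
    delta = treated_change - control_change
    return delta, math.exp(delta) - 1.0


# Toy data (made up): treated languages grow by 10%, controls stay flat.
toy = {
    (True, False): [100.0, 100.0],
    (True, True): [110.0, 110.0],
    (False, False): [50.0, 50.0],
    (False, True): [50.0, 50.0],
}
delta, effect = did_effect(toy)
```

Because the outcome is on a log scale, the estimate is read as a relative change, which is why the effect in the text is reported as a percentage via $e^{\delta} - 1$.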
To understand the adoption of the new limit across different devices, we monitor, in Figure 9, the evolution of the daily fraction of tweets in the interesting length ranges separately for web, mobile, and automated sources and third-party applications. The tweets are posted in the 20 languages where the switch happened.

We observe different adoption patterns between web and mobile. Longer tweets are used more on the web. In the web interface, where 140 was most prevalent (around 12%), 140 was quickly surpassed by 280, which reached around 4% at the end of the studied period. Automated sources and third-party applications were the slowest to adapt.

Tweets of 280 characters and nearby lengths surpassed those around 140 characters around June 2018 (7 months after the switch) on mobile, but immediately after the switch on the web.

Figure 8: Differences in differences setup for estimation of the effect of the introduction of the 280-character limit on tweet lengths. For the pre-switch (left) and post-switch (right) periods, we monitor the daily average tweet length in the languages where the 280 limit is introduced (20 languages, in orange) and in the languages where it is not introduced (Japanese, Korean, and Chinese, in blue). Dashed lines mark the averages of the daily average tweet length in the four conditions. (x-axis: Jan to Oct in each period; y-axis: average tweet length.)
Differences in differences estimation of the effect of the switch on tweets posted from different devices.
To understand the impact of the switch on tweets posted from different devices, we fit a slightly different model

$y \sim \mathit{treated} * \mathit{period} * \mathit{source}$, (2)

where source is a categorical variable representing mobile devices, web devices, or automated sources and third-party applications. By analogy to Equation 1, we then isolate the total effect of the switch on the length of tweets posted from a specific source as the sum of the baseline effect of the switch and the source-specific switch effect, $\mathit{treated}{:}\mathit{period} + \mathit{treated}{:}\mathit{period}{:}\mathit{source}$. Fitting the model from Equation 2, we measure the largest increase in tweet length, 17.46% (95% CI [16.54%, 18.39%]), for web, followed by 9.76% (95% CI [8.88%, 10.66%]) for automated sources and third-party applications, and the smallest, 5.64% (95% CI [5.19%, 6.08%]), for mobile.

Figure 9: Daily fraction (indicated with a circle) and 10-day rolling average (solid line) of the fraction of tweets that have between 276 and 280 characters (blue), and between 136 and 140 characters (red). The evolution is shown separately for Mobile applications, Web interface, and Automated sources and third-party applications. (x-axis: date; y-axis: fraction, 0% to 12%; the switch is marked.)

Next, we further investigate tweets posted by automated sources and third-party applications. This content is likely generated by bots or otherwise automated applications, as opposed to the content generated by regular users. In Figure 10 we show, for the five biggest languages where 280 was introduced (English, Portuguese, Spanish, Arabic, and Indonesian), tweet length histograms of tweets posted by third-party applications and automated sources. The probability is normalized relative to the baseline, the probability of the same character length in the same language for regular sources (web and mobile). For example, +200% means that a tweet of the given length is two times more likely to be observed among the tweets posted by automated sources, compared to regular users. Again, we compare Jan-Oct 2017 and Jan-Oct 2019 as the pre- and post-switch periods. The patterns look similar across languages. Automated sources and third-party applications tweet longer tweets before the switch, and post tweets in longer length ranges after the switch. The inflection point at around 70 characters remains. Automated sources and third-party applications have a lingering peak at 140 after the switch in English, Arabic, and Indonesian.
Lastly, we aim to provide deeper insights into the nature of long tweets posted after the switch. Why do users tweet long tweets? What are their signature characteristics? To answer those questions, we examine the content of the tweets. We study tweets in English that were posted from mobile and web devices during the two previously introduced pre-switch (75.56M tweets) and post-switch (65.29M tweets) periods. We annotated the tweets with LIWC (Pennebaker, Booth, and Francis 2007) syntactic features (linguistic categories) and semantic features (psychological, biological, and social categories).

First, to characterize syntactic features of tweets, for all tweets with a given number of characters, we measure the occurrence frequency of different syntactic features, i.e., part-of-speech (POS) tags, among tweets of that length. In Figure 11, across all possible tweet lengths in the period before the switch (up to 140 characters), we observe the fraction of tweets of that length that contain at least one instance of a given POS tag. The dashed black line represents this quantity across tweets posted in the period before the switch (under the 140-character limit), and the solid colored line represents this quantity among tweets posted in the period after the switch (under the 280-character limit).
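The occurrence-frequency measure described above can be sketched as follows. Since LIWC is a proprietary dictionary, the example uses a tiny hypothetical word list (LEXICON) in its place, tokenizes naively on whitespace, and uses Python's len as a stand-in for the tweet length; all names are illustrative assumptions, not the annotation pipeline of this study.

```python
from collections import defaultdict

# Hypothetical mini-lexicon standing in for LIWC categories.
LEXICON = {
    "Negate": {"not", "no", "never"},
    "Conj": {"and", "but", "or"},
}


def category_frequency_by_length(tweets, lexicon=LEXICON):
    """For each category and tweet length, compute the fraction of tweets of
    that length containing at least one word from the category."""
    totals = defaultdict(int)  # number of tweets per length
    hits = {cat: defaultdict(int) for cat in lexicon}
    for text in tweets:
        n = len(text)  # naive length; see the character counting rules
        totals[n] += 1
        words = set(text.lower().split())
        for cat, vocab in lexicon.items():
            if words & vocab:  # at least one category word present
                hits[cat][n] += 1
    return {
        cat: {n: hits[cat][n] / totals[n] for n in totals}
        for cat in lexicon
    }


# Toy example with made-up tweets of lengths 6 and 7.
freq = category_frequency_by_length(["no way", "ok yes", "a and b", "never!!"])
```

Plotting each category's fraction against tweet length, once for the pre-switch and once for the post-switch period, would give curves of the kind compared in Figures 11 and 12.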
Figure 10: Tweet length probability for tweets posted by automated sources and third-party applications, before the 280 limit was introduced (Jan-Oct 2017, on the left), and after (Jan-Oct 2019, on the right), for English, Portuguese, Spanish, Arabic, and Indonesian. The probability is normalized relative to the baseline, the probability of the same length in the same language for regular sources: web and mobile. (x-axis: number of characters, from 70 to 280.)

Comparing the solid colored and dashed black lines across different POS tags lets us isolate the effect of the length limit on the content of the tweets. The largest gap is observed for swear words and spoken categories (non-fluencies, fillers, and assent), adverbs, and conjunctions: nonessential parts of speech that are deleted in the process of “squeezing in” a message to fit a length limit. No gap is observed for verbs and negations, essential parts of speech that are known to be disproportionately preserved in the shortening process (Gligorić, Anderson, and West 2019).

Figure 12 represents the same quantities after the switch (under the 280-character limit) across all possible tweet lengths in this period (up to 280 characters). We note that there is no counterfactual observation, i.e., we do not know what the probability of observing a POS tag among 280-character-long tweets would be if the 280-character length limit were lifted. Nonetheless, among the 280-character-long tweets after the switch, we do observe patterns similar to those associated with the 140-character-long tweets before the switch.

Figure 11: Occurrence frequency of POS tags across tweets of the different character lengths possible in the period before the switch (up to 140 characters). The dashed black line represents this quantity across tweets posted in the period before the switch (under the 140-character limit), and the solid colored line in the period after the switch (under the 280-character limit). Panels: Adverbs, Article, Assent, AuxVb, Conj, Filler, Negate, Nonflu, Numbers, Prep, Pronoun, Quant, Swear, Verbs.
Figure 12: Occurrence frequency of POS tags across tweets posted in the period after the switch (under the 280-character limit) at the different allowed character lengths (up to 280 characters). POS tags are sorted alphabetically. Note the different y-axis scales. Panels: Adverbs, Article, Assent, AuxVb, Conj, Filler, Negate, Nonflu, Numbers, Prep, Pronoun, Quant, Swear, Verbs.

We observe a “dip” in the frequency of the spoken categories, conjunctions, and numbers among the 280-character-long tweets, traces typical of “optimizing” a message to fit a length limit.

In Figures 13 and 14 we measure the same quantities for fine-grained subtypes of personal pronouns. The results suggest that the personal pronouns I and you were most affected by the 140-character limit (i.e., they were most likely to be omitted). We observe a similar non-monotonic distribution around 280 characters in Figure 14.

To summarize, 280-character tweets are syntactically similar to 140-character tweets before the switch: their usage is associated with patterns that are indicative of “squeezing in” a message. This is evidence that they are generated by writing processes similar to those behind 140-character tweets.

Next, in Figure 15 we study topics of tweets, measured by LIWC categories describing psychological, biological, and social categories. Across the studied topics, for each allowed character length in the pre-switch and post-switch periods, we measure the factor by which the topic is more frequent at a given character length, compared to the overall topic frequency. Categories are sorted by the value of this factor at 140 characters before the switch, and at 280 characters after the switch.

Figure 13: Occurrence frequency of personal pronouns across tweets of the different character lengths possible in the period before the switch (up to 140 characters). The dashed black line represents this quantity across tweets posted in the period before the switch (under the 140-character limit), and the solid colored line in the period after the switch (under the 280-character limit). Panels: I, SheHe, They, We, You.

Figure 14: Occurrence frequency of personal pronouns across tweets posted in the period after the switch (under the 280-character limit) at the different allowed character lengths (up to 280 characters). Personal pronouns are sorted alphabetically. Note the different y-axis scales. Panels: I, SheHe, They, We, You.

We note the personal concerns categories that were relatively most frequent at 140 characters before the switch: Work, Money, Achievement, Death, and Religion. The least frequent topics at 140 characters, on the other hand, were Biological categories (Sexual, Body, Ingestion), followed by Family, Friends, and Leisure. An apparent association with the importance of the message emerges: while tweets about topics related to ordinary, overall more prevalent everyday experiences use the longer tweets the least frequently, topics related to more serious personal concerns use them the most. As with the pattern within languages, there is a within-topic correlation between usage of the 140-character length before the switch and subsequent usage of the 280-character length after the switch, with the ranking of the topics' usage at the boundary length only slightly changed (Spearman rank correlation between topics 0 . p = . ∗ − ), implying that 280-character tweets are also semantically similar to 140-character tweets before the switch.

To summarize, immediately after the switch we observe a sharp decline in the frequency of 140-character tweets, and an increase in the frequency of tweets 280 characters long (Figure 1).
As 140 characters were becoming less prevalent after the switch, the prevalence of 280 was increasing over a period of around 6 months after its introduction (Figures 3 and 4). Taking advantage of the differing length restrictions among languages, we estimate a significant increase in tweet lengths in languages where the switch happened, compared to the control languages (Figure 8). We note, however, that tweet lengths increased slightly in the control languages as well. This is likely due to how tweet length is counted at the character level (as described in the Data section), which allows mixed-character tweets to be longer than 140 characters.

Figure 15: For each allowed number of characters before the switch (left) and after the switch (right), across 14 topics measured with LIWC categories, we monitor the factor by which the topic is more frequent among tweets with that number of characters, compared to the overall frequency. Categories are sorted by the value of this factor at 140 characters (left: Work, Money, Achiev, Death, Relig, Home, Humans, Health, Leisure, Friends, Family, Ingest, Body, Sexual) and at 280 characters (right: Money, Death, Work, Relig, Home, Health, Achiev, Humans, Family, Friends, Leisure, Body, Ingest, Sexual).

We observe that the more 140 was used in a language before the switch, the more 280 is used after (Figure 6). Disaggregation across languages reveals interesting temporal patterns (Figure 7): in most languages, the prevalence of long tweets seems to have settled, and is even decreasing again, i.e., the peak of usage, or the “honeymoon phase”, is over. This indicates a period of high usage rates of the new feature, followed by a drop and saturation at a constant level.

Different adoption patterns are observed between web and mobile devices (Figure 9). Longer tweets are particularly used on web clients. In the web interface, 140 was quickly surpassed by 280, which reached around 4% at the end of the studied period. While slower adoption rates on mobile devices could conceivably be linked to clients that did not update, the fact that tweets were longer on the web interface before the switch indicates that there is simply a tendency for shorter text on mobile phones. Automated sources and third-party applications are the slowest to adapt.
They posted longer tweets before the switch, and they post tweets in longer length ranges after the switch (Figure 10). Interestingly, automated sources and third-party applications write longer tweets than humans, but the effect is weaker at the boundary (just under 140 characters before and just under 280 characters after the switch), probably due to the “squeezing” of originally longer tweets that humans do.

Tweets 280 characters long are syntactically similar to 140-character tweets before the switch: their usage is associated with patterns that are indicative of “squeezing in” a message (Figures 11 and 12). This indicates that tweets close to the new boundary are generated by writing processes similar to those behind 140-character tweets before the switch. As with the within-language correlation, there is a within-topic correlation between the usage of the 140-character length before the switch and the usage of the 280-character length after the switch (Figure 15). Tweets 280 characters long are semantically similar to 140-character tweets before the switch: their usage is associated with the same topics. We note that this holds across the range of long character lengths, and not only for 280-character tweets specifically.

Given these findings, our guiding question, “Is 280 the new 140?”, calls for a nuanced answer. On the one hand, the emergence of the peak in the prevalence of 280 characters, the fact that the languages that came close to the 140-character limit also tend to come closer to the 280-character limit, and the traces of distinctive writing processes when “squeezing” a message, absent in automated sources and third-party applications (Figure 10), all resonate with a narrative stating that, yes, 280 is the new 140. On the other hand, the prevalence of 280 is much less drastic than that of 140 used to be (Perez 2017): only 4% for 280 characters after the switch, compared to over 12% for 140 characters before the switch (for tweets posted by Web clients). In this sense, while 280 may indeed be considered the new 140, it is at the same time less noticeable. In a nutshell, 280 is a less intrusive 140.
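The cross-language argument above (languages that came close to the 140-character limit also tend to come closer to the 280-character limit) can be illustrated with a minimal Python sketch. It computes, per language, the share of tweets exactly at the length limit before and after the switch, and then the Pearson correlation of the two shares across languages. The function names, the `(lang, n_chars, period)` record layout, and the toy records are assumptions made for illustration only; they are not the study's data or code.

```python
from collections import defaultdict
from math import sqrt


def prevalence_at_limit(records, limit_before=140, limit_after=280):
    """Return {lang: (share at limit before, share at limit after)}.

    `records` holds (lang, n_chars, period) tuples, with period equal to
    "before" or "after" the switch (a hypothetical input layout).
    """
    limits = {"before": limit_before, "after": limit_after}
    totals = defaultdict(lambda: {"before": 0, "after": 0})
    at_limit = defaultdict(lambda: {"before": 0, "after": 0})
    for lang, n_chars, period in records:
        totals[lang][period] += 1
        if n_chars == limits[period]:
            at_limit[lang][period] += 1
    shares = {}
    for lang, counts in totals.items():
        # Keep only languages observed in both periods.
        if counts["before"] and counts["after"]:
            shares[lang] = (
                at_limit[lang]["before"] / counts["before"],
                at_limit[lang]["after"] / counts["after"],
            )
    return shares


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def toy(lang, period, n_total, n_at_limit, limit):
    """Build made-up records: some tweets at the limit, the rest short."""
    at = [(lang, limit, period)] * n_at_limit
    short = [(lang, 50, period)] * (n_total - n_at_limit)
    return at + short


# Made-up counts, chosen only to exercise the functions.
records = (
    toy("en", "before", 100, 10, 140) + toy("en", "after", 100, 4, 280)
    + toy("es", "before", 100, 6, 140) + toy("es", "after", 100, 2, 280)
    + toy("pt", "before", 100, 8, 140) + toy("pt", "after", 100, 3, 280)
)
shares = prevalence_at_limit(records)
before, after = zip(*shares.values())
r = pearson(before, after)
```

With real data, the records would come from the sampled tweets, and a rank correlation could be substituted for Pearson's coefficient if the shares are heavily skewed across languages.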
Finally, this evidence suggests that, just like the old 140-character limit (Gligorić, Anderson, and West 2018), the new 280-character limit impacts the writing style and content of tweets (Sen et al. 2019). The length constraint and the resulting tweet-length distribution remain an important dimension to consider in studies using Twitter data: after the switch, the number of characters is still correlated with important properties of tweets, including topic, language, device, and the likelihood that a tweet comes from an automated source.
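The link between tweet length and topic can be made concrete with a small Python sketch of the quantity described in the caption of Figure 15: the factor by which a topic is more frequent among tweets of a given length than among all tweets. This is one plausible reading of that caption, not the authors' code. The function name, the input layout, and the toy data are hypothetical, and the topic labels are assumed to have been assigned beforehand (the study uses LIWC categories).

```python
from collections import Counter


def topic_factor_by_length(tweets, topic):
    """Return {n_chars: P(topic | n_chars) / P(topic)}.

    `tweets` is a list of (n_chars, topics) pairs, where `topics` is a
    set of category labels already assigned to the tweet (hypothetical).
    """
    n_total = len(tweets)
    n_topic = sum(1 for _, topics in tweets if topic in topics)
    if n_total == 0 or n_topic == 0:
        return {}  # The factor is undefined without any topic occurrence.
    overall = n_topic / n_total

    per_length_total = Counter()
    per_length_topic = Counter()
    for n_chars, topics in tweets:
        per_length_total[n_chars] += 1
        if topic in topics:
            per_length_topic[n_chars] += 1

    return {
        n_chars: (per_length_topic[n_chars] / count) / overall
        for n_chars, count in per_length_total.items()
    }


# Made-up tweets: "Work" appears in 2 of 4 tweets at 140 characters,
# but in only 1 of 4 tweets at 60 characters.
toy_tweets = [
    (140, {"Work"}), (140, {"Work"}), (140, {"Leisure"}), (140, set()),
    (60, {"Work"}), (60, {"Leisure"}), (60, {"Leisure"}), (60, set()),
]
factors = topic_factor_by_length(toy_tweets, "Work")
```

A factor above 1 at a given length means the topic is over-represented among tweets of that length; a factor below 1 means it is under-represented.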
Limitations.
This study suffers from limitations that the 1% stream is known to be susceptible to (Wu, Rizoiu, and Xie 2020), as certain accounts might be over-represented due to intentional or unintentional tampering with the Sample API (Pfeffer, Mayer, and Morstatter 2018; Morstatter et al. 2013).

In our study, we use automated sources and third-party applications as a proxy for bots. However, bot detection could be made more reliable with more sophisticated methods that also detect bots using regular applications (Chavoshi, Hamooni, and Mueen 2016; Kudugunta and Ferrara 2018; Yang et al. 2020). The syntactic and semantic analysis is limited to tweets in English, due to the lack of available tools to support annotation. Future studies should measure these characteristics in other languages, using language-specific tools or machine translation. Lastly, we note that we study Twitter as a platform (tweets are sampled at the community level), as opposed to individual users, whose timelines are incomplete in the sample.
Future work.
Future work should provide a better understanding of what user features are associated with adoption (e.g., users’ age, number of followers, levels of activity). Are the users who used 140 the same ones who are more likely to use 280? Answering this question requires collecting data beyond the 1% sample, containing complete records of users’ tweets. However, we caution against naive comparisons, as careful quasi-experimental designs are necessary to truly isolate the effect of age. User age is correlated with other factors: users who stay longer on the platform might be in other ways fundamentally different from younger users who joined more recently (i.e., there is “survivor bias” (Elton, Gruber, and Blake 1996)). Finally, our study should be replicated several years from today, when even more time has passed since the switch.
References
Artzi, Y.; Pantel, P.; and Gamon, M. 2012. Predicting responses to microblog posts. In Proc. Conference of the North American Chapter of the Association for Computational Linguistics.
Bakshy, E.; Hofman, J. M.; Mason, W. A.; and Watts, D. J. 2011. Everyone’s an influencer: Quantifying influence on twitter. In Proc. ACM International Conference on Web Search and Data Mining.
Berger, J., and Milkman, K. L. 2012. What makes online content viral? Journal of Marketing Research
Palgrave Communications, 1–10. IEEE.
Centola, D.; Becker, J.; Brackbill, D.; and Baronchelli, A. 2018. Experimental evidence for tipping points in social convention. Science
Proceedings of the American Society for Information Science and Technology
Icdm, 817–822.
Ciot, M.; Sonderegger, M.; and Ruths, D. 2013. Gender inference of twitter users in non-english contexts. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 1136–1145.
Clarke, I., and Grieve, J. 2019. Stylistic variation on the donald trump twitter account: A linguistic analysis of tweets posted between 2009 and 2018. PloS one
Proceedings of the workshop on language in social media (LSM 2011), 58–65.
Doyle, G.; Yurovsky, D.; and Frank, M. C. 2016. A robust framework for estimating linguistic alignment in twitter conversations. In Proceedings of the 25th international conference on world wide web, 637–648.
Eisenstein, J. 2013. What to do about bad language on the internet. In Proc. Conference of the North American Chapter of the Association for Computational Linguistics.
Elton, E. J.; Gruber, M. J.; and Blake, C. R. 1996. Survivor bias and mutual fund performance. The review of financial studies
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 313–319.
Garimella, K.; Weber, I.; and De Choudhury, M. 2016. Quote rts on twitter: usage of the new feature for political discourse. In Proceedings of the 8th ACM Conference on Web Science, 200–204.
Gligorić, K.; Anderson, A.; and West, R. 2018. How constraints affect content: The case of twitter’s switch from 140 to 280 characters. In Proc. International Conference on Web and Social Media.
Gligorić, K.; Anderson, A.; and West, R. 2019. Causal effects of brevity on style and success in social media. Proc. ACM Hum.-Comput. Interact.
Proc. International Conference on Web and Social Media.
Hu, Y.; Talamadupula, K.; and Kambhampati, S. 2013. Dude, srsly?: The surprisingly formal nature of twitter’s language. In Proc. International Conference on Web and Social Media.
Jaidka, K.; Zhou, A.; and Lelkes, Y. 2019. Brevity is the soul of twitter: The constraint affordance and political discussion. Journal of Communication
Poznan Studies in Contemporary Linguistics
Proceedings of the 21st ACM international conference on Information and knowledge management, 445–454.
Kooti, F.; Yang, H.; Cha, M.; Gummadi, P. K.; and Mason, W. A. 2012b. The emergence of conventions in online social networks. In ICWSM.
Kudugunta, S., and Ferrara, E. 2018. Deep neural networks for bot detection. Information Sciences
College Composition and Communication
Proc. Conference on Empirical Methods in Natural Language Processing.
Levinson, P. 2011. The long story about the short medium: Twitter as a communication medium in historical, present, and future context. Journal of Communication Research
New York.
Morstatter, F.; Pfeffer, J.; Liu, H.; and Carley, K. M. 2013. Is the sample good enough? comparing data from twitter’s streaming api with twitter’s firehose. arXiv preprint arXiv:1306.5204.
Murthy, D. 2012. Towards a sociological understanding of social media: Theorizing twitter. Sociology
Discourse & communication
Social Influence
First Monday.
Pennebaker, J. W.; Booth, R. J.; and Francis, M. E. 2007. Liwc2007: Linguistic inquiry and word count. Austin, Texas: liwc.net.
Perez, S. 2017. Twitter’s doubling of character count from 140 to 280 had little impact on length of tweets. https://cutt.ly/gfoIaY1.
Perrin, A., and Anderson, M. 2019. Share of us adults using social media, including facebook, is mostly unchanged since 2018. Pew Research Center
EPJ Data Science
Proceedings of the 10th Annual Meeting of the Forum for Information Retrieval Evaluation, FIRE’18, 48–51. ACM.
Rosen, A., and Ihara, I. 2017. Giving you more characters to express yourself. https://blog.twitter.com/official/en_us/topics/product/2017/Giving-you-more-characters-to-express-yourself.html.
Rosen, A. 2017a. Tweeting made easier. https://goo.gl/DYzBji.
Rosen, A. 2017b. Tweeting made easier. https://blog.twitter.com/official/en_us/topics/product/2017/tweetingmadeeasier.html.
Sen, I.; Floeck, F.; Weller, K.; Weiss, B.; and Wagner, C. 2019. A total error framework for digital traces of humans. arXiv preprint arXiv:1907.08228.
Shapp, A. 2014. Variation in the use of twitter hashtags. New York University
Teaching English in the Two Year College.
Sylwester, K., and Purver, M. 2015. Twitter language use reflects psychological differences between democrats and republicans. PloS one
Proc. Annual Meeting of the Association for Computational Linguistics.
Vardi, A. D. 2000. Brevity, conciseness, and compression in roman poetic criticism and the text of gellius’ noctes atticae 19.9.10. American Journal of Philology.
Wang, S. A., and Greenwood, B. N. 2020. Does length impact engagement? length limits of posts and microblogging behavior. Length Limits of Posts and Microblogging Behavior (February 12, 2020).
Wasike, B. S. 2013. Framing news in 140 characters: How social media editors frame the news and interact with audiences via twitter. Global Media Journal: Canadian Edition
SKY Journal of Linguistics
Proceedings of the International AAAI Conference on Web and Social Media, volume 14, 715–725.
Yang, K.-C.; Varol, O.; Hui, P.-M.; and Menczer, F. 2020. Scalable and generalizable social bot detection through data selection. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 1096–1103.
Zhou, A., and Xu, S. 2019. Remaking dialogic principles for the digital age: The role of affordances in dialogue and engagement.