Expecting to be HIP: Hawkes Intensity Processes for Social Media Popularity
Marian-Andrei Rizoiu, Lexing Xie, Scott Sanner, Manuel Cebrian, Honglin Yu, Pascal Van Hentenryck
Marian-Andrei Rizoiu †♯, Lexing Xie †♯, Scott Sanner ‡, Manuel Cebrian ♯, Honglin Yu †♯, Pascal Van Hentenryck ♭
† Australian National University, ♯ Data61 CSIRO, ‡ University of Toronto, ♭ University of Michigan
ABSTRACT
Modeling and predicting the popularity of online content is a significant problem for the practice of information dissemination, advertising, and consumption. Recent work analyzing massive datasets advances our understanding of popularity, but one major gap remains: to precisely quantify the relationship between the popularity of an online item and the external promotions it receives. This work supplies the missing link between exogenous inputs from public social media platforms, such as Twitter, and endogenous responses within the content platform, such as YouTube. We develop a novel mathematical model, the Hawkes intensity process, which can explain the complex popularity history of each video according to its type of content, network of diffusion, and sensitivity to promotion. Our model supplies a prototypical description of videos, called an endo-exo map. This map explains popularity as the result of an extrinsic factor (the amount of promotions from the outside world that the video receives) acting upon two intrinsic factors: sensitivity to promotion and inherent virality. We use this model to forecast future popularity given promotions on a large 5-month feed of the most-tweeted videos, and find that it lowers the average error by 28.6% compared with approaches based on popularity history. Finally, we can identify videos that have a high potential to become viral, as well as those for which promotions will have hardly any effect.
1. INTRODUCTION
The popularity of an online cultural item is described by the amount of attention it receives, and popularity dynamics refers to its evolution over time. Popularity is a critical measure of information dissemination for content producers, and a way to manage information overload for content consumers. Understanding and predicting popularity have been active topics in both research and practice, but many fundamental questions remain open, such as: What describes the most viral items? What do the popularity dynamics of news, music, and films look like, and what are their differences and similarities? Can we promote an item to increase its popularity, and how much promotion is needed?

© 2017 International World Wide Web Conference Committee (IW3C2), published under Creative Commons CC BY 4.0 License. WWW 2017, April 3–7, 2017, Perth, Australia. ACM 978-1-4503-4913-0/17/04. http://dx.doi.org/10.1145/3038912.3052650

Building upon recent research progress in understanding popularity, we identify three important questions that are still open. The first concerns modeling popularity. One set of approaches describes popularity dynamics as stylistic prototypes, such as power-law shapes arising from either an exogenous shock or endogenous relaxation [13], a combination of power-law and exponential decay [24], multiple power-law decays with periodicity [27], or a collection of recurrence peaks [10]. However, one question remains: How would popularity evolve under continuous external influence? In particular, how can one explain complex rise and fall patterns that do not follow the prescribed prototypes?

The second question concerns virality. Content and initial diffusion have both been identified as key factors that influence popularity. Here content factors include positive sentiment [2], emotional arousal [5], publishing venue [3], and visibility [6]; factors of diffusion history include [9] network structure, information about the original poster and re-sharers, and the timing of the early posts. However, describing viral content in the light of external promotions is still an open problem, and in particular: Can something go viral if promoted?

The third question involves predicting future popularity. It is known that approaches that use the popularity history [30, 34] produce competitive estimates of future popularity over time. Also, timing features have been shown to be more predictive than content, structure, and user features [9], and prediction without initial history is generally shown to be a hard problem [26]. However, these recent insights do not answer: How to forecast future popularity given planned promotions?
In this work, we answer all three questions above, using a large dataset that connects popularity in one social media platform (81.9 million YouTube videos) to discussions about each of these digital items in an external platform (1.06 billion tweets over a six-month period).

To describe complex popularity dynamics under continuous external influence, we propose a new mathematical model that reveals an analytical relationship between endogenous and exogenous demand factors, called the Hawkes Intensity Process (HIP). HIP extends the well-known Hawkes point process [19] by taking the expectation over stochastic event histories so as to describe expected event volumes, rather than a set of event times. Figure 1 illustrates the HIP model. On the top left is the volume of exogenous promotions over time, which drives the endogenous response determined by the HIP (middle); the output on the right is the popularity series. The popularity series modeled through the Hawkes intensity process matches closely with the observed view count series, even for videos with complex popularity lifecycles (Section 2).

Figure 1: Linking endogenous and exogenous factors of popularity using the Hawkes Intensity Process. Top row: The inputs are volumes of exogenous promotions or discussions $s(t)$, which engender endogenous reactions from the online social networks described by the impulse response function $\hat{\xi}(t)$ (middle box, defined in Sec 2.5), to generate the total popularity series $\xi(t)$. Bottom row: The endogenous reactions are self-exciting point processes, widely used in recent literature [4, 23, 28, 31, 33, 39]. Here each event triggers subsequent events with memory kernels $\phi(t)$. Such point process models can incorporate individual external stimuli (shown on the left), which in turn lead to a larger number of events in response (shown on the right). Middle arrow: The proposed HIP model is the result of taking the expectation over all stochastic event histories of the Hawkes process in the bottom row.

To answer the second question, on whether or not an item will go viral if promoted, we derive two new metrics based on HIP: the endogenous response and the exogenous sensitivity. These two metrics naturally lend themselves to a novel two-dimensional visualization tool, dubbed the endo-exo map (Section 4). On this map, one can identify online videos that have high potential but are not yet popular. In other words, videos with high sensitivity to external promotions and high endogenous response are expected to go viral if promoted. On the other hand, one can also identify videos for which promotion is unlikely to have an effect, such as those scoring very low in either the endo- or exo- dimension.

Finally, the HIP model can be used to help forecast future popularity given (known or planned) promotions. HIP model parameters are estimated on the first 90 days of each video's history, and forecasts are made for the next 30 days. We evaluate forecasting on a collection of 13K+ most actively discussed YouTube videos over a six-month period, and find that estimates made with the HIP lower the average percentile error by 28.6% compared with state-of-the-art methods based on popularity history (Section 5).

The main contributions of this work include:
• The HIP model, a volume-based version of the Hawkes point process. Its essential novelty is to regard popularity as externally driven, with exogenous events activating endogenous responses inside the social environment which may, or may not, amplify the exogenous signal.
• The exogenous sensitivity and the endogenous response, two new metrics to quantify two distinct aspects of a video's inherent tendency to be popular. They are combined in the endo-exo map, a tool used to comparatively explain popularity and identify potentially viral videos.
• A method to forecast popularity gain after promotion. Evaluated on a large set of YouTube videos, it significantly outperforms approaches using popularity history.
• A new dataset of tweeted videos that links online videos to their external discussions, available at https://github.com/andrei-rizoiu/hip-popularity.
2. THE MODEL
We introduce a model for the evolution of online attention under external influence. We start by discussing the problem setting of aggregated attention under external promotion (Sec. 2.1), then the key concepts of the Hawkes process and its use to link the ongoing effect of external stimuli to the word-of-mouth spread of attention (Sec. 2.2). Next, we propose HIP, a model to explain the observed popularity history from daily volumes when the underlying viewing events are unobserved (Sec. 2.3). Lastly, we introduce two key metrics derived from the HIP model, the endogenous response and the exogenous sensitivity, to quantify the viral potential of a video (Sec. 2.5).
This paper aims to model the popularity of videos under external promotion. Here popularity is measured as the total number of views after the video has been online for a certain number of days (e.g. up to 120 days). External promotion is harder to measure since, by definition, it requires capturing data from other platforms. In this paper, we have two different views of promotion, due to the data collection setting described in Sec 3. The first is shares, tracked by YouTube via the share button under each video, which allows a user to share a link to the video on a selection of popular social network sites (13 at the time of this writing). The second view is tweets, tracked with the Twitter streaming API using keyword filters that retrieve tweets linking to a video. Neither source is complete: given the distributed nature of the Internet, a complete capture of all discussions is practically impossible. The shares capture external promotions from a diverse set of sources, but are far from complete in any one source. The tweets capture an almost-complete feed of video promotions in one platform. In the rest of this paper, both of these sources are collectively referred to as external promotions about a video. In our evaluations, the results obtained using each source are presented separately.
We model online attention as an exogenously-driven self-exciting process: each viewing event is triggered either by a previous event or as a result of external influence. We assume that viewing events of a YouTube video follow a Hawkes point process [19], a type of non-homogeneous point process in which the arrival of an event increases the likelihood of future events. Although variants of point processes have recently been used to model events in social media, all existing work focuses on learning a point process model from one information source, such as retweeting [39, 23], arrival of citations [33], or endogenous response after an initial external shock [13]. To the best of our knowledge, this is the first work that models the continuous interaction of two sources: exogenous stimuli and endogenous response.

[Figure 2 panels: (a) Examples of observed and fitted popularity series, for videos bUORBT9iFKc and WKJoBeeSWhc (number of views and number of shares over time, with the observed series overlaid); (b) Sliced fitting graph. Fitted parameters shown in the legends: µ = 79.85, θ = 5.37, c = 0.46, C = 0.008, γ = 301.5, η = 4336, $A_{\hat{\xi}}$ = 1.88; and µ = 42.85, θ = 0.41, c = 3.26, C = 0.95, γ = 13.16 […].]

Figure 2: Explaining popularity dynamics using the Hawkes intensity model. (a) Number of shares (red), observed popularity history (black dashed) and popularity as explained by the HIP (blue) on two example videos: a music video bUORBT9iFKc and a News & Politics video WKJoBeeSWhc. The multi-phased popularity history cannot be explained by current models such as [13], while the HIP tracks the complex dynamics well. (b) A sliced fitting graph of a music video (YouTube ID […]), using the impulse response $\hat{\xi}(t)$ and exogenous stimuli $s(t)$ to explain observed popularity. Each alternating gray and white area under the fitted (blue) curve is a slice of endogenous reaction generated by the external influence on a given day. The left inset zooms in on one of the early months in the video's evolution, May 2014. The total event intensity (blue solid line) is a sum of temporally shifted and scaled versions of $\hat{\xi}(t)$, which tracks the long-term trends in observed popularity well (dashed line). The period around the first larger exogenous peak is shown magnified so that its corresponding endogenous response is clearly visible. (right inset) Example of the impulse response $\hat{\xi}(t)$ to one unit of external excitation. The area under this function, $A_{\hat{\xi}}$, quantifies the endogenous reaction of a video: it is the total number of views after each unit of exogenous excitation.

In particular, the arrival rate of viewing events $\lambda(t)$, a measure of how likely a viewing event is to occur in an infinitesimal interval around time $t$, is determined by two additive components in Eq (1). The first component is proportional to a measure of external influence $s(t)$, scaled by a constant $\mu$. Here $s(t)$ represents the volume of external discussion (or promotion) over time. The second component represents the rate of views triggered by a previous event $i$, which occurred at time $t_i$ with magnitude $m_i$, and is given by the kernel $\phi_{m_i}(t - t_i)$. Furthermore, each event $t_i < t$ adds to $\lambda(t)$ independently. The following equations describe the event rate of such a marked Hawkes process:

$$\lambda(t) = \mu s(t) + \sum_{t_i < t} \phi_{m_i}(t - t_i) \quad (1)$$
[…] ($\theta > 0$) is the power-law exponent for social memory: the larger $\theta$ is, the sooner the reaction to an event will stop. We use a power-law kernel for $\phi_m(\tau)$, as recent work [28] observed it to perform better on social media data than popular variants like the exponential kernel.

This model is an instance of a marked Hawkes process [19]. An illustration of the Hawkes process with external excitation is in the bottom row of Figure 1. A set of input events of different magnitudes trigger new events through the kernel $\phi(t)$, which then trigger offspring events themselves, resulting in the observed event sequence.

The Hawkes point process faces a few modeling challenges in large-scale applications. In terms of data source, what we often observe is the volume of total attention in a given interval (e.g. daily views on YouTube), rather than the times and properties of individual actions, due to constraints in user privacy and data volume. In terms of computation, full estimation of the Hawkes process is quadratic in the number of events. Therefore, the full estimation quickly becomes expensive when the number of events is in the hundreds of thousands or millions, which is where the most popular videos are (see Sec 3.2). It is very desirable to be able to estimate video popularity from daily data, which typically amounts to a few dozen to a few hundred data points.

To this end, we introduce the Hawkes intensity $\xi(t)$, the expectation of the event rate $\lambda(t)$ over the event history $H_t$, consisting of the set of (random) event times and magnitudes up to time $t$.

Theorem (Hawkes Intensity Process (HIP)). Given a marked Hawkes process described in Equations (1) and (2). Its event history $H_t = \{(t_1, m_1), \ldots, (t_n, m_n)\}_{t_n < t}$ […]

$$\xi(t) = \mu s(t) + C \int_0^t \xi(t - \tau)(\tau + c)^{-(1+\theta)} \, d\tau \quad (3)$$

[…]016 from a large Twitter sample using standard fitting procedures [11]. The two power-law exponents $\alpha$ and $\theta$ in HIP are distinct in meaning and function: $\theta$ defines memory decay over time, while $\alpha$ is determined by the user distribution at large.

Compared to existing models of data volume, HIP captures the ongoing interactions of exogenous and endogenous effects. Hence it is able to explain complex popularity series with multiple rises and falls (as shown in Figure 2). Helmstetter and Sornette [20] fit the observed event rate after an initial shock, and Crane and Sornette [13] produce a curve fit on the long-term approximation of the endogenous decay with no exogenous input. SpikeM [27] models volumes of events both before and after a single considered shock, without accounting for external influences. The work most related to ours on computing expectations over stochastic event histories is that of Farajtabar et al. [16], who modeled co-excitation on Twitter and computed the equivalent of $\xi(t)$ on a multivariate Hawkes process with exponential kernels, which admits a closed-form solution. In contrast, our work uses a univariate Hawkes process with a power-law kernel, focused on modeling the impact of Twitter on individual YouTube videos. De et al. [14] further develop the work in [16] by combining a Markov process with a multivariate Hawkes process for modeling opinion dynamics.

We discuss key steps for estimating HIP from observed series of views and external promotions over time.
Discretizing over time. We observe that behavioral statistics are aggregated over fixed and discrete intervals: for YouTube, the public API provides the daily history of the number of views $\bar{\xi}[t]$ and number of shares $\bar{s}[t]$ for $t = 1, \ldots, T$. Expressing HIP (Eq (3)) over discrete time gives:

$$\xi[t] = \mu s[t] + C \sum_{\tau=1}^{t} \xi[t - \tau](\tau + c)^{-(1+\theta)}. \quad (4)$$

Here we use square brackets to denote discrete time, e.g. $\xi[t]$, and round brackets to denote continuous time, e.g. $\xi(t)$.

Accounting for unobserved external influence. In addition to the observed external promotions $\bar{s}[t]$ in tweets or shares, we model the unobserved external excitation as an initial shock (at $t = 0$) and a constant background excitation (for $t > 0$):

$$s[t] = \frac{\gamma}{\mu} \mathbb{1}[t = 0] + \frac{\eta}{\mu} \mathbb{1}[t > 0] + \bar{s}[t], \quad (5)$$

where $\mathbb{1}(\mathrm{arg})$ is the standard impulse function, taking the value 1 when arg is true and 0 otherwise. In the absence of a parametric model of generic external influence, the initial impulse and the constant component require the fewest assumptions about how unobserved influence evolves. Here $\gamma$ and $\eta$ are additional parameters estimated from data. In our experiments, adding estimates for such unobserved influence components improves the fitting for a large number of videos.

The loss function. For each video, we find an optimal set of model parameters $(\mu, \theta, C, c)$ and of unobserved external influence ($\gamma$ and $\eta$). This is done by minimizing the square error between the observed viewcount series $\bar{\xi}[t]$ and the model $\xi[t]$, $t = 1 : T$. The corresponding optimization problem is as follows:

$$\min_{\mu, \theta, C, c, \gamma, \eta} J = \frac{1}{2} \sum_{t=0}^{T} \left( \xi[t] - \bar{\xi}[t] \right)^2 \quad (6)$$

We use L-BFGS [25] with analytical gradients and random restarts to minimize this non-linear loss function. Gradient computation is detailed in the appendix [1].

Three example fits are shown in Figure 2. Visibly, the event intensity model in Equation 3 links the exogenous and the endogenous effects of the social system, resulting in a tight fit between the model and the observed popularity history. For the Brazilian music video bUORBT9iFKc the memory kernel decays fast ($\theta = 5.37$) […] WKJoBeeSWhc, the memory kernel decays slowly ($\theta = 0.41$) […]

In this section, we examine the key property of HIP of being a linear time-invariant system, which leads to two important metrics for measuring two distinct aspects of a video's viral potential: the exogenous sensitivity and the endogenous response.
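As a brief aside on the estimation steps above, the forward model of Eq. (4), the exogenous term of Eq. (5) and the loss of Eq. (6) can be sketched in a few lines of plain Python. This is a minimal sketch, not the authors' implementation: the function names `hip_series` and `squared_error` are hypothetical, the toy input values are made up, and the code assumes that Eq. (5) amounts to $\mu s[t] = \gamma$ at $t = 0$ and $\eta$ for $t > 0$, plus $\mu \bar{s}[t]$. The optimizer itself (L-BFGS with random restarts) is not shown.

```python
def hip_series(s_bar, mu, theta, C, c, gamma, eta):
    """Discrete HIP recursion (Eq. 4) driven by the exogenous term of Eq. (5).

    s_bar is the observed daily promotion series (shares or tweets).
    Assumption: mu * s[t] = gamma at t = 0, eta for t > 0, plus mu * s_bar[t].
    """
    xi = []
    for t in range(len(s_bar)):
        # Exogenous part: unobserved shock/background plus scaled promotions.
        exo = (gamma if t == 0 else eta) + mu * s_bar[t]
        # Endogenous part: past intensity weighted by the power-law kernel.
        endo = C * sum(xi[t - tau] * (tau + c) ** (-(1.0 + theta))
                       for tau in range(1, t + 1))
        xi.append(exo + endo)
    return xi


def squared_error(xi_model, xi_obs):
    """Loss J of Eq. (6): half the sum of squared residuals."""
    return 0.5 * sum((m - o) ** 2 for m, o in zip(xi_model, xi_obs))


# Toy example with made-up numbers (not taken from the dataset).
shares = [1.0, 2.0, 0.0]
fitted = hip_series(shares, mu=2.0, theta=0.0, C=1.0, c=1.0, gamma=5.0, eta=1.0)
loss = squared_error(fitted, [7.0, 8.0, 8.0])
```

In a full fit, `squared_error(hip_series(...), observed_views)` would be the objective handed to a numerical optimizer over the six parameters; each evaluation costs time quadratic in the number of days, which stays cheap for series of a few hundred points.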
Exogenous sensitivity $\mu$. As shown in Eq 3, the total attention that a video receives consists of two parts: the input from the exogenous stimuli, and the endogenous response corresponding to non-linear effects accumulated through the integral equation. The scaling parameter $\mu$ quantifies a video's sensitivity to external stimuli $s(t)$. When $\mu \to 0$, […]; when $\mu$ is large, each unit of external promotion leads to a large number of new views.

HIP as an LTI system. We observe an important property of the HIP model.

Corollary. The HIP model, as defined in Eq (4) and (3), is a linear time-invariant (LTI) system for $t > 0$.

Being an LTI system [29] is to say that if $\xi[t]$ is the event intensity function for input $s[t]$, then (for the same video) the event intensity function for a shifted and scaled version of the input $a s[t - t_0]$ is $a \xi[t - t_0]$ for $a > 0$, $t_0 \geq 0$, i.e., scaled and shifted by the same amount.

It is easy to see that linearity holds by multiplying both sides of Eq 3 by the same constant. Time invariance follows from a change of variable, together with the fact that $\xi[t] = 0$ when $t < 0$.

Impulse response function $\hat{\xi}[t]$. One important descriptor of an LTI system is the impulse response function, the response to the unit impulse function $\delta[t]$, which takes the value 1 when $t = 0$, and 0 otherwise. We define $\hat{\xi}[t]$ as the impulse response of the HIP model. It follows from Eq. (4) that $\hat{\xi}[t]$ is the solution to the following self-consistent equation:

$$\hat{\xi}[t] = \delta[t] + C \sum_{\tau=1}^{t} \hat{\xi}[t - \tau](\tau + c)^{-(1+\theta)} \quad (7)$$

For each video, $\hat{\xi}[t]$ completely characterizes the endogenous response of the HIP model:

Lemma (Sliced Responses). The intensity function $\xi[t]$ of HIP can be written as the sum of impulse responses, scaled and shifted by the corresponding external input.

$$\xi[t] = \sum_{\tau=0}^{T} s[\tau] \hat{\xi}[t - \tau] \quad (8)$$

To see that this is true, first notice that the external input $s[t]$ can be expressed as a sum of shifted and scaled impulses.

$$s[t] = \sum_{\tau=0}^{T} s[\tau] \delta[t - \tau] \quad (9)$$

Combining Eq (7) and (9) leads to Eq (8). In other words, the total popularity at time $T$ can be obtained as the sum of the unfolding, through the endogenous reaction, of the external stimuli that occurred at times $1, 2, \ldots, T - 1$. Fig 2(b) illustrates this property using a sliced and stacked popularity graph. The alternating white and gray slices are scaled (and shifted) versions of the impulse response represented in the right inset. To each discrete time point $t$ corresponds a slice, scaled by the external stimuli $s(t)$, which adds to the slices constructed at previous times $t' < t$. Adding all these slices together recovers the overall intensity $\xi(t)$ as in Eq 3 (blue line), which closely tracks the long-term dynamics of the observed popularity (dashed line). The LTI property and its related quantities provide the mathematical ground to define our second important measure.

Endogenous response $A_{\hat{\xi}}$. We define the total endogenous response generated from a single unit of exogenous excitation, computed as $A_{\hat{\xi}} = \sum_{t=0}^{\infty} \hat{\xi}[t]$. In this work, we compute $A_{\hat{\xi}}$ by taking the sum over 10,000 time steps. $A_{\hat{\xi}}$ is finite when the underlying HIP is so-called sub-critical. Other HIP-derived quantities, such as the scaling parameter $C$ or the memory exponent $\theta$, could potentially serve to describe video virality. We find, however, that despite being related, the non-linear interactions among HIP parameters render them inaccurate in explaining popularity compared to $A_{\hat{\xi}}$. Detailed discussions on the convergence criteria for $A_{\hat{\xi}}$, and visualizations of other parameters, are in the appendix [1]. Together with the exogenous sensitivity $\mu$, this is the second key quantity for measuring video virality. They will be used to compare individual videos and collections of videos in Sec. 4.
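The impulse response, the sliced-response property and the truncated sum defining the endogenous response can be checked numerically. The sketch below is illustrative only: all function names are hypothetical, the parameter values and the input series are made up, and the input passed to the functions is assumed to be the already-scaled exogenous term $\mu s[t]$, so that the impulse response is the response to one unit of that term. The truncation length is kept small here because the plain-Python recursion is quadratic in the number of steps; the text above uses 10,000 steps.

```python
def impulse_response(theta, C, c, steps):
    """Solve Eq. (7) forward in time: HIP output for a unit impulse at t = 0."""
    xi_hat = []
    for t in range(steps):
        unit = 1.0 if t == 0 else 0.0
        # Past responses weighted by the power-law memory kernel.
        memory = C * sum(xi_hat[t - tau] * (tau + c) ** (-(1.0 + theta))
                         for tau in range(1, t + 1))
        xi_hat.append(unit + memory)
    return xi_hat


def endogenous_response(theta, C, c, steps):
    """Truncated endogenous response: sum of the impulse response over `steps`."""
    return sum(impulse_response(theta, C, c, steps))


def sliced_sum(exo, xi_hat):
    """Eq. (8): superpose impulse responses, scaled and shifted by the input."""
    return [sum(exo[tau] * xi_hat[t - tau] for tau in range(t + 1))
            for t in range(len(exo))]


def hip_recursion(exo, theta, C, c):
    """Eq. (4) run directly, with exo[t] standing for the term mu * s[t]."""
    xi = []
    for t in range(len(exo)):
        xi.append(exo[t] + C * sum(xi[t - tau] * (tau + c) ** (-(1.0 + theta))
                                   for tau in range(1, t + 1)))
    return xi


# Made-up parameters and exogenous input, for illustration only.
theta, C, c = 1.0, 0.5, 1.0
exo = [3.0, 0.0, 2.0, 5.0, 0.0, 1.0]
xi_hat = impulse_response(theta, C, c, len(exo))
direct = hip_recursion(exo, theta, C, c)    # recursion of Eq. (4)
stacked = sliced_sum(exo, xi_hat)           # sum of slices, Eq. (8)
A_hat = endogenous_response(theta, C, c, steps=200)
```

Here `direct` and `stacked` agree up to floating-point error, which is the content of the sliced-responses lemma, and `A_hat` approximates the total number of events generated per unit of exogenous excitation for these toy parameters.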
3. THE TWEETED VIDEOS DATASET
A key component in linking the exogenous influence and the endogenous response is to obtain data for the exogenous component, preferably both inside and outside the studied social network. We describe a new dataset across the Twitter and YouTube networks, linked via the unique video ids, in which the volumes of tweets and YouTube shares serve as exogenous signals. We then introduce the popularity scale, a mapping between the number of views (or shares, or tweets) and the percentile ranking of a video, which will be used for visualizing popularity and for evaluating popularity forecasts.
We collect a dataset of tweeted videos by streaming tweets (via the Twitter API) published between 2014-05-29 and 2014-12-26 that mention YouTube videos. This yields a large and diverse set of over 81.9 million videos mentioned in 1.06 billion tweets. We obtain from YouTube their video metadata, including upload date, author and video category, as well as the time series consisting of the daily number of views and shares. The video categories are a one-level YouTube classification of videos; examples of such categories are Music, Gaming or Film & Animation. Along with the daily number of tweets, we have three attention-related time series for each video: (views[t], shares[t] and tweets[t]), where t indexes time with the unit of a day.

In order to study videos with non-trivial popularity and promotion activities, we construct a subset, denoted as the Active dataset, by restricting to videos that are still online and whose popularity and sharing series are at least 120 days long, counted from the upload until the crawling date. Furthermore, we restrict the set of videos to those that received at least 100 tweets and 100 shares by the 120th day, in order to obtain videos tweeted and shared enough to estimate the external influence on popularity. We also remove 6 rare categories, each containing fewer than 1% of the videos (and their corresponding videos). The Active dataset contains 13,738 videos across 14 categories and it is used in both explaining and forecasting popularity in Sec. 5. Reasons for the drastic dataset reduction from 81M to Active include: videos uploaded earlier than 2014-05-29 (and hence without a complete tweet history), videos that are no longer online, those that do not make their viewcount history public, and the long-tailed distribution of tweets and shares (more than half of the videos are tweeted only once). Note that when they exist, the popularity and the sharing series do not contain missing data. A profile of the tweeted videos dataset and more details about its construction are given in the appendix [1]. We use the first 90 days of each video's viewing and sharing/tweeting history to estimate the HIP parameters.
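The selection criteria for the Active subset can be summarized as a small filter. The sketch below is a hypothetical illustration rather than the authors' pipeline: the record layout (keys `online`, `upload_date`, `category`, `views`, `shares`, `tweets`) and the function names are assumptions, and so is the choice to measure the 1% category share over the videos that already passed the per-video criteria.

```python
from collections import Counter
from datetime import date


def is_candidate(video, min_days=120, min_promotions=100,
                 collection_start=date(2014, 5, 29)):
    """Per-video criteria for the Active subset (field names are hypothetical)."""
    # Must still be online and uploaded within the tweet collection period.
    if not video["online"] or video["upload_date"] < collection_start:
        return False
    # Daily series must cover at least the first `min_days` days.
    series = (video["views"], video["shares"], video["tweets"])
    if any(len(s) < min_days for s in series):
        return False
    # At least `min_promotions` tweets and shares by day `min_days`.
    return (sum(video["tweets"][:min_days]) >= min_promotions
            and sum(video["shares"][:min_days]) >= min_promotions)


def active_subset(videos, min_category_share=0.01):
    """Apply the per-video criteria, then drop rare categories.

    Assumption: a category is rare when it holds fewer than
    `min_category_share` of the videos that passed the per-video criteria.
    """
    candidates = [v for v in videos if is_candidate(v)]
    counts = Counter(v["category"] for v in candidates)
    total = len(candidates)
    return [v for v in candidates
            if counts[v["category"]] >= min_category_share * total]
```

Calling `active_subset(videos)` on a list of such records returns the retained videos; the thresholds are exposed as parameters so that the effect of each criterion on the dataset size can be inspected separately.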
It is well-known that network measurements such as the number of views and shares follow a long-tailed distribution. We quantify video popularity on an explicit popularity percentile scale, with 0.0% being the least popular, and 100% being the most popular. Fig. 3(a) and (b) show the popularity scale as boxplots (in log-scale) over the Active dataset, after 60 days of video life for shares and views, respectively. The shape of the scale is similar for both shares and views, and it reflects their long-tailed distribution. The only notable difference is the scale of the y-axis, as videos tend to accumulate fewer shares than views. The popularity scale for tweets is very similar to the one for shares, and is shown in the appendix [1]. Based on the shares and views popularity scales, we define two mapping functions $S_t(x), P_t(x) : \mathbb{R}^+ \to [0,$ […] $S_t(x)$ or the number of views for $P_t(x)$, and outputs the percentile value on the corresponding popularity scale constructed at time $t$.

[Figure 3 panels: (a) shares: Popularity scale at 60 days (x-axis: popularity percentile, y-axis: shares); (b) views: Popularity scale at 60 days (x-axis: popularity percentile, y-axis: views); (c) Evolution of popularity percentiles (x-axis: popularity percentile at 60 days, y-axis: popularity percentile at 30 days).]

Figure 3: The popularity scale of YouTube videos, computed on the Active dataset. The total numbers of shares (a) and views (b) obtained by each video in the first 60 days after upload are divided into 40 equally spaced bins (i.e. each with 2.5% […]) […]5% most popular videos span more than one order of magnitude for both views and shares. Note that outliers in this bin are not represented, as the most popular videos in the collection have ∼[…] views and ∼[…] shares. (c) Evolution of the views popularity between 30 (y-axis) and 60 (x-axis) days. Boxplots show where each 2.5% of videos at 60 days came from (in terms of percentile position at 30 days). The outliers are videos that have improved their popularity significantly.

In Fig. 3(c) we explore the change of views popularity of each video from 30 days (y-axis) to 60 days (x-axis). Formally, we plot the relation between $P_{30}\left(\sum_{t \leq 30} \bar{\xi}[t]\right)$ and $P_{60}\left(\sum_{t \leq 60} \bar{\xi}[t]\right)$, where $\bar{\xi}[t]$ is the number of views at time $t$ (here the $t$-th day). Note that most videos retain a similar rank (in the boxes along the 45 degree diagonal line), or have a slight rank decrease as they are overtaken by other videos (slightly above the diagonal in the plot). No outliers exist in the upper-left part of the graph, since a video cannot lose viewcount that it already gained. Most notably, we can see that videos from any bin can jump to the top popularity bins between 30 and 60 days of age, such as the outliers for the few boxes on the far right. This phenomenon elicits two important questions: how did these videos go viral, and is this phenomenon related to external promotions?
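A percentile mapping of this kind is straightforward to build from the cumulative counts of a collection. The following is a minimal sketch under stated assumptions: the names `make_popularity_scale` and `bin_index` are hypothetical, the percentile of a count is taken to be the share of videos in the collection with at most that count (other tie and boundary conventions are possible), and the example counts are made up.

```python
from bisect import bisect_right


def make_popularity_scale(counts):
    """Build a mapping from a raw count to a percentile in [0, 100].

    `counts` holds the cumulative views (or shares) of every video at a
    fixed age, e.g. 60 days. Assumed convention: the percentile of x is the
    percentage of videos whose count is at most x.
    """
    ordered = sorted(counts)
    n = len(ordered)

    def percentile(x):
        return 100.0 * bisect_right(ordered, x) / n

    return percentile


def bin_index(percentile, n_bins=40):
    """Index of the equally spaced percentile bin (2.5% wide for 40 bins)."""
    width = 100.0 / n_bins
    return min(int(percentile / width), n_bins - 1)


# Made-up view counts at a fixed age, for illustration only.
views_at_60_days = [10, 100, 1000, 10000]
P_60 = make_popularity_scale(views_at_60_days)
middle = P_60(100)            # half of the videos have at most 100 views
top_bin = bin_index(P_60(10000))
```

Building one scale at 30 days and another at 60 days, and applying each to the same video's cumulative views, gives the pair of percentiles plotted against each other in Fig. 3(c).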
4. THE ENDO-EXO MAP
Using two quantities defined in Sec 2.5, we construct a2-dimensional map with endogenous response A ˆ ξ as the x-axis and exogenous sensitivity µ as the y-axis. We call thisplot the endo-exo map . This section presents example usesof this map for explaining video popularity, and identifyingvideos that are not promotable. Explaining popularity.
Intuitively, a video with a large endogenous response $A_{\hat{\xi}}$ and a high exogenous sensitivity $\mu$ has high potential to become viral. Specifically, each unit of exogenous excitation will generate $\mu A_{\hat{\xi}}$ events through the Hawkes intensity process. On the endo-exo map, videos in close proximity have similar potentials to become popular, and the differences in their popularity would be due solely to the difference in exogenous attention. Fig. 4(a) illustrates this phenomenon using four videos. Videos v and v are very similar in both $A_{\hat{\xi}}$ and $\mu$; the fact that v has 4.61x more views is explained by it receiving 3.22x more exogenous promotions. On the same map, v received a similar amount of promotion as v, and their differences in popularity are explained by v being less endogenously responsive (smaller $A_{\hat{\xi}}$) than v. Moreover, v has a similar endogenous response and sees similar amounts of promotion as v; the difference between their popularities is explained by v being less exogenously sensitive, with a lower $\mu$. The endo-exo map provides two distinct aspects from which a video's popularity can be analyzed, which are detailed next.

What describes the most popular videos? One may wonder whether higher popularity can be attributed to higher exogenous sensitivity, higher endogenous response, or a combination of both. We examine a collection containing diverse video categories and find that the explanation varies. We draw on the endo-exo map all the videos that belong to the same category in the Active dataset and visualize them as two-dimensional density plots. Fig. 4(c) and (d) compare the plots for the videos in Gaming and Film & Animation to those of the top 5% most popular videos in these two categories, respectively. Visibly, while the most popular videos in Film & Animation are described by higher exogenous sensitivity (shifting upwards), the most popular Gaming videos have a higher endogenous response: their density mass is shifted to the right of the endo-exo map. Other categories such as Comedy or News & Politics (shown in the appendix [1]) present two dense regions, one for higher $A_{\hat{\xi}}$ and one for higher $\mu$. These observations show that the most popular videos in different categories differ in terms of the two main factors that drive popularity.

Identifying unpromotable videos. The endo-exo map can be used to readily identify an interesting class of videos: the ones which are very difficult to promote. The quantity $\mu A_{\hat{\xi}}$ describes the number of views that one unit of external promotion (via sharing or tweeting) will generate under the joint influence of endo- and exo- factors, so a very small $\mu A_{\hat{\xi}}$ (e.g., $\mu A_{\hat{\xi}}$ < e−3) is a hallmark of a video being unpromotable. Fig. 4(b) contains a zoomed-out view
[Figure 4 panels: (a) The endo-exo map; (b) Unpromotable videos on the endo-exo map; (c) Film and animation: all, top 5%; (d) Gaming: all, top 5%. x-axis: endogenous response $A_{\hat{\xi}}$; y-axis: exogenous sensitivity $\mu$.]

Figure 4: Visualizing video virality and video popularity using the endo-exo map. (a) Four example videos on the endo-exo map. X-axis $A_{\hat{\xi}}$: the magnitude of endogenous reaction; Y-axis $\mu$: sensitivity to exogenous stimuli. The radius of each circle is proportional to the popularity percentile $P_t(\cdot)$ of each video after t = 120 days, with values between 0.0 (least popular) and 1.0 (most popular). The color represents the amount (percentile) of total shares received, denoted as $S_t(\cdot)$, with values between 0.0 (no promotion) and 1.0 (receiving the most promotions). v and v present similar endogenous reaction and exogenous sensitivity, being at the same position on the endo-exo map. The difference in their popularity (size) is explained by the fact that v received 3.22 times more promotions than v. Both v and v receive similar amounts of promotion (color) as v, but they achieve lower popularity (smaller size) due to their less privileged position on the endo-exo map: v is less sensitive to external stimuli than v and v, while v has a smaller endogenous reaction than v and v. Information about the four example videos is as follows, with their popularity percentile P and shares percentile S: v is a short Gaming video, YoutubeID , P(634,370 views) = 85%, S(351 shares) = 65%; v is a collection of "ALS ice bucket challenge" videos, YoutubeID , P(137, , S(109) = 10%; v is a funny science video, explaining types of infinity in math, YoutubeID , P(193, , S(356) = 65%; v is from a Portuguese youtuber, YoutubeID , P(93, , S(311) = 60%. (b) A zoomed-out scatter plot of the endo-exo map of the videos in the People & Blogs category. The shaded portion of this map consists of videos with low values of total response $\mu A_{\hat{\xi}}$ < − and hence dubbed unpromotable videos. The thumbnail of an example video is included, with µ = 2. × − and $A_{\hat{\xi}}$ = 1. (c) Density plot for all (left) vs the most popular 5% (right) Film & Animation videos. (d) Density plot for all (left) vs the most popular 5% (right) Gaming videos. Popular Film & Animation videos tend to have a higher exogenous sensitivity, while those for Gaming have mainly a higher endogenous response.

of the endo-exo map associated with the category People & Blogs. We found 63 videos ( ∼ . ∼ . Active set are deemed unpromotable. The thumbnail of one example video (a teenager video blog) is shown. It has µ = 2. × − and $A_{\hat{\xi}}$ = 1, hence each online promotion is expected to generate 0 views. In contrast, for video v in Fig. 4(a), each promotion is expected to generate 598 views.
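As a rough illustration of how a position on the endo-exo map translates into promotability, the minimal sketch below computes the product $\mu A_{\hat{\xi}}$ (expected views per unit of promotion) and flags videos that fall under a cutoff. The class and function names, the cutoff of 1e-3, and the example parameter values are illustrative assumptions, not code or values from the paper.

```python
from dataclasses import dataclass


@dataclass
class Video:
    name: str
    mu: float    # exogenous sensitivity (hypothetical fitted value)
    a_xi: float  # endogenous response (hypothetical fitted value)


def views_per_promotion(video: Video) -> float:
    """Expected number of views generated by one unit of promotion."""
    return video.mu * video.a_xi


def is_unpromotable(video: Video, cutoff: float = 1e-3) -> bool:
    """Flag a video whose total response falls below an assumed cutoff."""
    return views_per_promotion(video) < cutoff


# Two made-up videos: one barely reacts to promotion, one reacts strongly.
quiet_blog = Video("quiet_blog", mu=2.5e-5, a_xi=1.0)
responsive = Video("responsive", mu=100.0, a_xi=5.0)

for v in (quiet_blog, responsive):
    print(v.name, views_per_promotion(v), is_unpromotable(v))
```

Keeping the cutoff as a parameter makes it easy to see how the size of the unpromotable region changes with the chosen threshold.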
5. FORECASTING POPULARITY GROWTH
Via the endo-exo map, the Hawkes intensity process prescribes a video's expected popularity dynamics under external promotions. This section explores the predictive power of such a model. We first illustrate the setting for popularity forecasts using video examples, and then present a quantitative evaluation.
We use HIP to identify videos that are not already popular but have a high potential to become so. This is similar to the phenomenon of delayed recognition in science [21]. Note that this approach is predictive in that we aim to find such potentially viral items before they become popular, rather than a measurement-driven approach that analyzes viral items in past history. The video in Fig. 5(a) is such an example: it received 15,687 views after being online for 90 days. The HIP model deems it to have a high endogenous response ($A_{\hat{\xi}}$ = 6.94 × ) and a high exogenous sensitivity ($\mu$ = 119.02

[Figure 5 panels: (a) An example video with high virality ($\mu$ = 119.02, $A_{\hat{\xi}}$ = 6.94 x ); (b) Forecasting popularity for video 0qMJ_zhat_E ($\mu$ = 78.352, $\theta$ = 0.066, c = 0.174, C = 0.167); (c) Comparison of popularity forecasts (y-axis: mean absolute percentile error). Panels (a) and (b) plot observed views and shares over time.]
Figure 5: Popularity forecasting using the Hawkes intensity process. (a) Popularity series for a video explaining a brain disorder. The video receives a total of 36 shares and 15,687 views in the first 90 days (see inset); it is estimated to have a high exogenous sensitivity and a high endogenous response ($\mu$ = 119.02, $A_{\hat{\xi}}$ = 6.94 × ). Between day 91 and day 120, this video jumped from a popularity percentile of 5.85% to 94.9%, receiving 229 shares and gaining 2.42 million views. (b) Forecasting popularity for video 0qMJ_zhat_E. Black dotted line: viewcount series from day 1 to 120 after video upload. Red line: exogenous stimuli s(t), also used in parameter estimation. Left of the gray dashed vertical line at T ≤ 90 days: time period used for parameter estimation. Blue line: fitted viewcounts for T ≤ 90 days, generated using Eq. 3. Magenta line: viewcount forecast for day 91 to 120. (c) Comparison of average forecasting errors on the Active set. y-axis: forecasting errors, calculated as the absolute difference between the popularity percentile at day 120 and that forecasted by each approach. x-axis, left to right: Hawkes intensity model, using either or as s(t); multivariate linear regression (MLR), using only popularity history, or and , respectively.

of shares during its first 90 days. Consequently, the video gained 2.42 million views, drastically improving its ranking on the popularity percentile scale from 5.85% to 94.9%.

The HIP model takes as input the exogenous promotion s[t] to produce estimates of the viewcount ξ[t]. To construct ξ[t] in the future, s[t] needs to be either estimable or known. We call this forecasting popularity, as opposed to predicting popularity, where no information about future exogenous stimuli is assumed. Forecasting popularity has broad applications, such as estimating the effect of intended (promotional) interventions and making decisions about when to promote.

Evaluating popularity forecast on temporal hold-out data.
We design a protocol to quantitatively evaluate the predictive power of HIP. We use historical data held out over time, thus avoiding the practical difficulty of generating realistic promotions and responses in a large-scale social network. Using the (known) exogenous promotion s[t], we forecast the popularity $\sum_{t=91}^{120} \xi[t]$ during the evaluation period (in purple) using Eq. (4). Fig. 5(b) illustrates this setting with an example music video. A vertical line divides the observation period, day 1 to 90, from the evaluation period, day 91 to 120. The viewcount and the sharing history in the observation period are used to fit model parameters and explain observed popularity (in blue). For this example, the forecast and the actual views are fairly similar.

Percentile-error metric. We obtain a predicted total viewcount over the evaluation period, i.e., $\sum_{t=91}^{120} \xi[t]$, and we evaluate the performance by comparing it to the actual total viewcount $\sum_{t=91}^{120} \bar{\xi}[t]$. Commonly used error metrics, such as root-mean-square error (RMSE) or the normalized RMSE, are skewed by the large number of outliers in a long-tailed viewcount distribution, and we chose not to use them. Instead, we map the forecasted number of views to a popularity scale constructed as shown in Sec. 3.2, on the period 91-120 days of video life. We normalize the number of views into a metric between 0 and 1 and compute the absolute error of the predicted percentile. When compared to error metrics based on the difference in views (like RMSE), this metric focuses on ranking videos correctly with respect to a large collection and is as useful as the broad class of learning to rank applications.

Baseline algorithms. The state-of-the-art approach for popularity prediction uses multivariate linear regression (MLR), based on the observation that historic viewcounts are predictive of future viewcounts [30, 34]. We train linear regressors to predict daily viewcounts for each day between 91 and 120, using a 90-dimensional feature corresponding to the number of views in days 1 to 90. To give the MLR forecast the same amount of information as the HIP model, we build two enhanced baselines, denoted by MLR ( Active and we obtain predictions for each video using cross-validation.
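The percentile-error metric described in this section can be sketched as follows. This is a minimal illustration that assumes the popularity scale is the empirical distribution of total viewcounts over a reference collection; the function names and the toy numbers are hypothetical, and the exact construction of the scale in Sec. 3.2 may differ.

```python
import numpy as np


def popularity_percentile(views, reference_views):
    """Fraction of reference videos with at most `views` views (value in [0, 1])."""
    ref = np.sort(np.asarray(reference_views, dtype=float))
    return np.searchsorted(ref, views, side="right") / ref.size


def percentile_error(predicted_views, actual_views, reference_views):
    """Absolute difference between predicted and actual popularity percentiles."""
    predicted = popularity_percentile(predicted_views, reference_views)
    actual = popularity_percentile(actual_views, reference_views)
    return np.abs(predicted - actual)


# Toy long-tailed reference collection of total views over days 91-120.
reference = [10, 100, 1_000, 10_000, 1_000_000]

# A miss in the middle of the scale moves the video across one reference item.
mid_error = percentile_error(500, 5_000, reference)

# A miss of a million views at the top of the scale leaves the rank unchanged.
tail_error = percentile_error(2_000_000, 1_000_000, reference)
print(mid_error, tail_error)
```

The second example shows why such a metric is less sensitive to outliers than RMSE: a very large absolute error in views counts for nothing if it does not change the ranking of the video within the collection.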
Fig. 5(c) summarizes forecasting performance for HIP and the MLR baselines. The forecasts made using HIP have lower average error compared to the linear regression with or without exogenous stimuli ( . 96% (median 3%) for HIP ( . 94% (median 3 . p < . d [12]. Within the HIP variants, we found that using the number of shares generates slightly better forecasts than the number of tweets, but the differences are not statistically significant at p = 0.001 (more details about effect sizes and statistical tests can be found in the online appendix [1]). We speculate that the difference in forecasting performance is due to the nature of the sources of exogenous excitation: shares capture the promotion behavior via a multitude of environments, whereas tweets count the volume of promotion in Twitter only.

We also observe that the performance gap doubles when forecasting popularity on more difficult videos, that is, videos with a large exogenous shock in the forecasting period, defined as the mean number plus 100 times the standard deviation of the number of shares during the observed period. Fig. 5(a) shows an example of such a video. There are 4006 such videos in the Active dataset, for which HIP ( . 11% (median 3 . . 24% (median 6 . causal in a linear system sense [29] in that future tweets cannot change past views, but does not directly correspond to the causal inference paradigm about whether a control variable will change a response variable in the presence of other confounding factors. Nonetheless, we conducted statistical tests using the well-known Granger Causality [18] on the shares and view series (details in the appendix [1]); they do not show consistent results for either shares influencing views or vice versa.
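The forecasting protocol of this section can be sketched in a few lines. The sketch below assumes a discrete-time version of the expected event rate, in which the view rate on day t is the exogenous term µ·s[t] plus C times the past view rates weighted by a power-law kernel (τ + c)^(-(1+θ)). It also assumes the parameters have already been fitted on days 1 to 90 (fitting is not shown). All names and parameter values are hypothetical.

```python
import numpy as np


def hip_series(s, mu, C, c, theta):
    """Expected daily views under an assumed discrete HIP recursion."""
    s = np.asarray(s, dtype=float)
    T = s.size
    # kernel[tau - 1] holds the memory weight for a lag of tau days, tau = 1..T
    kernel = (np.arange(1, T + 1) + c) ** (-(1.0 + theta))
    xi = np.zeros(T)
    for t in range(T):
        # endogenous part: sum over lags tau = 1..t of xi[t - tau] * kernel(tau)
        endo = np.dot(xi[:t][::-1], kernel[:t])
        xi[t] = mu * s[t] + C * endo
    return xi


def forecast_total_views(s, params, first_day=91, last_day=120):
    """Total forecast views over the evaluation period, given known promotions."""
    xi = hip_series(s, **params)
    return xi[first_day - 1:last_day].sum()


# Hypothetical fitted parameters and two promotion scenarios over 120 days.
params = {"mu": 20.0, "C": 0.1, "c": 1.0, "theta": 0.5}
baseline = np.zeros(120)
baseline[0] = 10.0            # a few shares on the upload day
with_burst = baseline.copy()
with_burst[99] = 50.0         # an extra promotion burst on day 100

print(forecast_total_views(baseline, params))
print(forecast_total_views(with_burst, params))
```

Comparing the two scenarios gives a simple way to estimate the effect of an intended promotion before it happens.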
6. RELATED WORK
Popularity modeling and prediction.
Early measurement studies linked popularity with user influence in Twitter [7, 36] and with the speed and spread of information in social networks [8]. More recently, generative methods, usually based on point processes, were introduced for popularity modeling [13, 15, 38] and prediction [4, 28]. In their seminal work, Crane and Sornette [13] showed how a Hawkes point process can account for popularity bursts and decays. Subsequently, more sophisticated models have been proposed to model and simulate popularity in microblogs [38] and videos [15], by accounting for phenomena such as the "rich-get-richer" phenomenon and social contagion. Shen et al. [33] employ reinforced Poisson processes, modeling three phenomena: fitness of an item, a temporal relaxation function, and a reinforcement mechanism. Zhao et al. [39] propose SEISMIC, which employs a doubly stochastic process, one accounting for infectiousness and the other one for the arrival time of events. TiDeH [23] is an extension of SEISMIC, which aims at estimating the future number of views as a function of time, instead of just the final total cascade size.

HIP differs from the above applications in two fundamental ways. First, most of the models [4, 23, 28, 31, 39] deal with single diffusion cascades, that is, the reaction to single shocks. HIP models popularity as a continuous endogenous-exogenous intertwining, allowing it to closely fit complex evolutions. Second, typical point-process based methods require observing each individual event during the training period, whereas HIP models volumes of attention directly.
Modeling volumes of popularity.
A number of models have been proposed to describe the shape and evolution of the volume of social media activity over time. The seminal meme-tracker [24] system uses a curve with polynomial increase followed by exponential decay to describe the sawtooth-shaped volume of news mentions. The SpikeM [27] system uses a fixed memory component, modulated by a periodic component; however, it does not explicitly account for external influence. Most recently, Tsytsarau et al. [35] model the popularity volume as the convolution of two sequences, news event importance and media response, which are assumed to have predefined shapes. Yang et al. [37] propose a generative model to describe sequences that have multiple progression stages, along with algorithms to estimate model parameters and to segment existing sequences. Being based on a self-exciting Hawkes process, HIP simultaneously addresses a series of shortcomings of the above approaches: it is adapted to forecast total popularity, it can recover all parameters from data, and it explains additional, non-stationary variations from linked data sources of external activities.
Influence estimation and maximization are somewhat related research problems, but distinct from the one approached in this paper. Influence estimation [17] aims to learn probabilities of influence between pairs of users, starting from a social graph and a log of actions of its users. Influence maximization [16, 22, 32] finds the subset of users who, if convinced to promote a piece of content, would maximize its diffusion. The main difference between this line of work and HIP is that we measure the volume of promotion and use it to forecast popularity, rather than taking a graph-centric view based on network structure and user interactions.
7. SUMMARY AND DISCUSSION
This research establishes a novel mathematical model to systematically link the endogenous response to the exogenous stimuli of a social system. The model developed here provides a nuanced view of the continued interactions of endogenous and exogenous effects that generate complex and multi-phased popularity dynamics over time. We validate the model on the popularity and promotion history of a large set of YouTube videos. We quantify the endogenous virality and exogenous sensitivity for each video, and we use them to explain the properties of the most popular videos, as well as to identify videos that will respond well to promotions and those that will not. Such detailed analysis is possible because the aggregated attention and promotion data are available from YouTube or inferred from public sources such as Twitter. Note, however, that HIP does not make any platform-dependent assumption and that it can function with any popularity and promotion series generated by aggregated human behavior. We envision that the same kind of attention dynamics would hold for other content types, such as webpage views, podcasts, or blogs.

There are a number of simplifying assumptions and limitations of the proposed model, which can become fruitful directions of further investigation. The Hawkes intensity process captures popularity dynamics that are reflected only in the observed external promotion series, and does not capture other factors such as (daily or weekly) seasonality. The model also focuses on the expected influence over all users rather than individual influence. Both of these observations suggest extensions that could incorporate seasonality components as well as take into account individual influences. Lastly, media items are influenced by a variety of sources in the open world, and there are many sources of online or offline promotion that are unobserved or difficult to obtain data from. A well-known example is that gaming videos are known to be discussed intensively in topic-specific forums. Tracking and estimating diverse or even unknown sources of exogenous influence is another open research question.
Acknowledgments.
This material is based on research sponsored by the Air Force Research Laboratory, under agreement number FA2386-15-1-4018. We thank the National Computational Infrastructure (NCI) for providing computational resources, supported by the Australian Government. We thank Alban Grastien, Richard Nock and Christian Walder for insightful discussions.

8. REFERENCES
[1] Appendix: Expecting to be HIP: Hawkes intensity processes for social media popularity, 2017. https://arxiv.org/pdf/1602.06033.pdf
[2] WSDM '11, page 65, Feb. 2011.
[3] R. Bandari, S. Asur, and B. A. Huberman. The pulse of news in social media: Forecasting popularity. In Sixth International AAAI Conference on Weblogs and Social Media, 2012.
[4] P. Bao, H.-W. Shen, X. Jin, and X.-Q. Cheng. Modeling and Predicting Popularity Dynamics of Microblogs using Self-Excited Hawkes Processes. In WWW, pages 9–10, 2015.
[5] J. Berger and K. L. Milkman. What makes online content viral? Journal of Marketing Research, 49(2):192–205, 2012.
[6] J. Berger and E. M. Schwartz. What drives immediate and ongoing word of mouth? Journal of Marketing Research, 48(5):869–880, 2011.
[7] M. Cha, H. Haddadi, F. Benevenuto, and K. P. Gummadi. Measuring User Influence in Twitter: The Million Follower Fallacy. In ICWSM '10, volume 10, pages 10–17, 2010.
[8] M. Cha, A. Mislove, and K. P. Gummadi. A measurement-driven analysis of information propagation in the flickr social network. In WWW, pages 721–730, 2009.
[9] J. Cheng, L. Adamic, P. A. Dow, J. M. Kleinberg, and J. Leskovec. Can cascades be predicted? In WWW '14, pages 925–936. ACM, 2014.
[10] J. Cheng, L. A. Adamic, J. M. Kleinberg, and J. Leskovec. Do cascades recur? In WWW, pages 671–681, 2016.
[11] A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-Law Distributions in Empirical Data. SIAM Review, 51(4):661–703, Nov. 2009.
[12] J. Cohen. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ, 2nd edition, 1988.
[13] R. Crane and D. Sornette. Robust dynamic classes revealed by measuring the response function of a social system. PNAS '08, 105(41):15649–15653, Oct. 2008.
[14] A. De, I. Valera, N. Ganguly, S. Bhattacharya, and M. G. Rodriguez. Learning and forecasting opinion dynamics in social networks. In NIPS'16, pages 397–405, 2016.
[15] W. Ding, Y. Shang, L. Guo, X. Hu, R. Yan, and T. He. Video Popularity Prediction by Sentiment Propagation via Implicit Network. In CIKM '15, pages 1621–1630, Oct. 2015.
[16] M. Farajtabar, N. Du, M. G. Rodriguez, I. Valera, H. Zha, and L. Song. Shaping social activity by incentivizing users. In NIPS'14, pages 2474–2482, 2014.
[17] A. Goyal, F. Bonchi, and L. V. Lakshmanan. Learning influence probabilities in social networks. In WSDM '10, pages 241–250. ACM, 2010.
[18] C. W. Granger. Some recent development in a concept of causality. Journal of Econom., 39(1):199–211, 1988.
[19] A. G. Hawkes. Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1):83–90, 1971.
[20] A. Helmstetter and D. Sornette. Subcritical and supercritical regimes in epidemic models of earthquake aftershocks. Journal of Geophysical Research: Solid Earth, 107(B10):ESE 10–1–ESE 10–21, 2002.
[21] Q. Ke, E. Ferrara, F. Radicchi, and A. Flammini. Defining and identifying sleeping beauties in science. PNAS, 112(24):7426–7431, 2015.
[22] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In KDD '03, pages 137–146. ACM, 2003.
[23] R. Kobayashi and R. Lambiotte. TiDeH: Time-Dependent Hawkes Process for Predicting Retweet Dynamics. In ICWSM 2016, number ICWSM, 2016.
[24] J. Leskovec, L. Backstrom, and J. Kleinberg. Meme-tracking and the dynamics of the news cycle. In KDD '09, pages 497–506. ACM, 2009.
[25] D. C. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1-3):503–528, Aug. 1989.
[26] T. Martin, J. M. Hofman, A. Sharma, A. Anderson, and D. J. Watts. Exploring limits to prediction in complex social systems. In WWW '16, pages 683–694, 2016.
[27] Y. Matsubara, Y. Sakurai, B. A. Prakash, L. Li, and C. Faloutsos. Rise and fall patterns of information diffusion: Model and implications. KDD '12, 2012.
[28] S. Mishra, M.-A. Rizoiu, and L. Xie. Feature Driven and Point Process Approaches for Popularity Prediction. In CIKM '16, page 10, 2016.
[29] A. Oppenheim, A. Willsky, and S. Nawab. Signals and Systems. Prentice Hall, 1997.
[30] H. Pinto, J. M. Almeida, and M. A. Gonçalves. Using early view patterns to predict the popularity of youtube videos. In WSDM '13, pages 365–374. ACM, 2013.
[31] J. C. L. Pinto, T. Chahed, and E. Altman. Trend detection in social networks using hawkes processes. In ASONAM '15, pages 1441–1448, 2015.
[32] V. Raghavan, G. Ver Steeg, A. Galstyan, and A. G. Tartakovsky. Modeling Temporal Activity Patterns in Dynamic Social Networks. IEEE Transactions on Computational Social Systems, 1(1):89–107, Mar. 2014.
[33] H.-W. Shen, D. Wang, C. Song, and A.-L. Barabási. Modeling and Predicting Popularity Dynamics via Reinforced Poisson Processes. In AAAI, page 291, 2014.
[34] G. Szabo and B. A. Huberman. Predicting the popularity of online content. Com. of the ACM, 53(8):80–88, 2010.
[35] M. Tsytsarau, T. Palpanas, and M. Castellanos. Dynamics of news events and social media reaction. KDD '14, 2014.
[36] J. Weng, E.-P. Lim, J. Jiang, and Q. He. Twitterrank: finding topic-sensitive influential twitterers. In WSDM '10, pages 261–270. ACM, 2010.
[37] J. Yang, J. McAuley, J. Leskovec, P. LePendu, and N. Shah. Finding progression stages in time-evolving event sequences. In WWW '14, pages 783–794, 2014.
[38] L. Yu, P. Cui, F. Wang, C. Song, and S. Yang. Uncovering and predicting the dynamic process of information cascades with survival model. Know. and Inf. Syst., page 10, 2016.
[39] Q. Zhao, M. A. Erdogdu, H. Y. He, A. Rajaraman, and J. Leskovec. SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity. In KDD '15, 2015.

Appendix: Expecting to be HIP: Hawkes Intensity Processes for Social Media Popularity
Marian-Andrei Rizoiu, Lexing Xie, Scott Sanner, Manuel Cebrian, Honglin Yu, Pascal Van Hentenryck
DOI: 10.1145/3038912.3052650
Contents
ξ(t) for unobserved point processes
1.2.1 Preliminaries: event rate, counting process
1.2.2 Expected event rate for unmarked Hawkes
1.2.3 Expected event rate for marked Hawkes
1.3 Branching factor and endogenous response
1.4 HIP as an LTI system
2 Details about fitting HIP
2.1 The loss function
2.2 Computing gradients
2.3 Adding an L regularizer
2.4 Properties of the model estimates
3 Data
3.1 The and Active datasets
3.2 The popularity scale over time
4 Understanding popularity dynamics
4.1 Behavior across groups of videos: categories and channels
4.2 Categories of longer versus shorter memory
4.3 Model parameters and popularity
4.4 Potential causal connection between the views, tweets and shares series
5 Popularity forecasting and comparison to baseline
5.1 Additional results
5.2 Comparing performance
5.3 Forecasting performance on difficult videos
Given time $t \in [0, \infty)$, we denote by $\lambda(t)$ the event rate of an online resource at time t. The goal of this section is to derive the expected event rate, denoted as $\xi(t)$, as the average response rate from a large network.

There are two sources of events in the social system: exogenous events originating outside the system, and endogenous events spawned from within the system as the response to previous events (that are either exogenous or endogenous). For example, a public speech held by a famous politician can be an exogenous source for the number of views of relevant Youtube videos on politics; on the other hand, the views on trailers prior to the release of new movies exhibit a rich-get-richer effect for attention distribution that is characteristic of endogenous word-of-mouth diffusion.

The Hawkes process [5], as defined in main text Eq. (1), is a non-homogeneous Poisson process with self-excitation. Its event rate $\lambda(t)$, or instantaneous conditional intensity $r(t \mid \mathcal{H}_t)$, is:

$$\lambda(t) := r(t \mid \mathcal{H}_t) = \mu s(t) + \sum_{t_i < t} \phi_{m_i}(t - t_i) \tag{11}$$

[...] 016 and we use it throughout the experiments. As noted in the main text, the two power-law exponents are distinct in meaning and function: $\theta$ defines memory decay over time, while $\alpha$ is determined by the user distribution at large. $\alpha$ is estimated from a large Twitter sample. $\theta$ and the other video-dependent parameters are estimated from popularity history as detailed in Sec. 2 below.

ξ(t) for unobserved point processes

In this section we give the proof of the main text Theorem 2.1. More precisely, we derive the expected event rate $\xi(t)$ over time, specified in the main text Eq. (3). This is done in three steps: we first include a preliminary description of the event rate $\lambda(t)$ in terms of the underlying counting process over infinitesimal intervals, we then derive the expected event rate for unmarked Hawkes processes, and finally we build upon these to derive the expected event rate for marked Hawkes processes.
It is well known in the stochastic process literature [8] that the event rate $\lambda(t)$, or the conditional intensity specification $r(t \mid \mathcal{H}_t)$, of a point process is completely characterized by the corresponding counting process $N(t)$. Here $N(t)$ is the total number of events observed between time 0 and t. Given an infinitesimal interval $\delta$ at time t, the relationship between $N(t)$ and $r(t \mid \mathcal{H}_t)$ is described as:

$$\begin{aligned}
P(N(t+\delta) - N(t) = 1 \mid \mathcal{H}_t) &= r(t \mid \mathcal{H}_t)\,\delta + o(\delta), \\
P(N(t+\delta) - N(t) > 1 \mid \mathcal{H}_t) &= o(\delta), \quad \text{with } \lim_{\delta \downarrow 0} \frac{o(\delta)}{\delta} = 0.
\end{aligned} \tag{13}$$

Here P denotes the probability of a discrete random variable. The intuition of the expression above is that $r(t \mid \mathcal{H}_t)$ is proportional to the probability that $N(t)$ increments by 1, and that it is "very unlikely" for $N(t)$ to increment by more than one.

Let $dN_t$ be the counting increment $N(t+\delta) - N(t)$ as $\delta \downarrow 0$. From Eq. (13), we can describe $dN_t$ as a Bernoulli random variable, with:

$$P(dN_t = 1 \mid \mathcal{H}_t) = r(t \mid \mathcal{H}_t)\,\delta, \qquad P(dN_t = 0 \mid \mathcal{H}_t) = 1 - r(t \mid \mathcal{H}_t)\,\delta, \qquad \text{for } \delta \downarrow 0.$$

It follows from the above that

$$E_{dN_t \mid \mathcal{H}_t}[dN_t] = r(t \mid \mathcal{H}_t)\,\delta, \qquad \text{for } \delta \downarrow 0.$$

Using the shorthand $\lambda(t)$ for the event rate and putting the above together, we can see that Hawkes processes can be specified as:

$$\lambda(t) := r(t \mid \mathcal{H}_t) = \lim_{\delta \downarrow 0} \frac{P(N(t+\delta) - N(t) = 1 \mid \mathcal{H}_t)}{\delta} = \lim_{\delta \downarrow 0} \frac{P(dN_t = 1 \mid \mathcal{H}_t)}{\delta} = \lim_{\delta \downarrow 0} \frac{E_{dN_t \mid \mathcal{H}_t}[dN_t]}{\delta} \tag{14}$$

Note that Eq. (14) is an alternate formulation of Eq. (11) through the counting process $N(t)$. Eq. (14) holds for all non-homogeneous Poisson processes. Hawkes processes (marked and unmarked) are special cases of non-homogeneous Poisson processes.

unmarked Hawkes
We first study the simpler case of an unmarked Hawkes process $\lambda_u(t)$, and derive its expected event rate $\xi_u(t)$ over possible event histories. While it is not strictly necessary to break down the derivation into two parts, this helps illustrate the main ideas underlying the derivation for marked processes in the next subsection. The key idea in this subsection is converting the conditional expectation of event history into increments of the counting process, and using conditional expectations to link the expectations of counting increments to the expected rate $\xi_u(t)$ via $\lambda_u(t)$. The next subsection will use exactly the same treatment for the history of event times, and performs a similar treatment for a history of event magnitudes.

Let an unmarked Hawkes process be:

$$\lambda_u(t) := r(t \mid \mathcal{H}_t) = \mu s(t) + \sum_{t_i < t} \phi(t - t_i)$$

[...] 0, we have:

$$\xi_u(t) := E_{\mathcal{H}_t}[\lambda_u(t)] \overset{t = K\delta}{=} \mu s(t) + \lim_{\delta \downarrow 0} \sum_{k=1}^{K} \delta\, \xi_u(k\delta)\, \phi(t - k\delta) = \mu s(t) + \int_0^t \xi_u(\tau)\, \phi(t - \tau)\, d\tau \tag{20}$$

Performing a change of variable $\tau \leftarrow t - \tau$, we obtain the integral equation specifying the expected event rate for the unmarked Hawkes process:

$$\xi_u(t) = \mu s(t) + \int_0^t \xi_u(t - \tau)\, \phi(\tau)\, d\tau \tag{21}$$

To the best of our knowledge, this definition of the intensity function, along with the derivation of its analytical form, is new. The original paper by Hawkes [5] presents an integral equation of similar form, but it is for the covariance density and not the event intensity function.

marked Hawkes

The expected event rate function $\xi(t)$ for a marked Hawkes process is defined as the expectation of the event rate function $\lambda(t)$ over the set of event times and magnitudes before time t. In this subsection we work with the event rate as specified in Eq.
(11): λ ( t ) := r ( t |H t ) = µs ( t ) + X t i 0, once the event time t i is determined.That is to say, for an event spawned through the endoge-nous process, the magnitude of the event is independentof the magnitude of its parent event.We define the expected event rate ξ ( t ) for the marked Hawkes processes λ ( t ) as follows. Step (22a) below is dueto µs ( t ) being non-random. ξ ( t ) := E H t [ λ ( t )]= E H t " µs ( t ) + X t i 0, we have: ξ ( t ) := E H t [ λ ( t )] t = Kδ = µs ( t ) + lim δ ↓ K X k =1 C · δξ ( kδ ) φ ( t − kδ )= µs ( t ) + C Z t ξ ( τ ) φ ( t − τ ) dτ τ ← t − τ = µs ( t ) + C Z t ξ ( t − τ ) φ ( τ ) dτ φ ( τ )=ˆ τ − (1+ θ ) = µs ( t ) + C Z t ξ ( t − τ )ˆ τ − (1+ θ ) dτ (30)Eq. (30) is Eq. (3) in the main text Theorem 2.1. We derive two quantities from the Hawkes Intensity Pro-cess in order to better visualize and explain the diversebehavior of video popularity. Branching factor n The first key parameter is thebranching factor n , defined as the mean number of daugh-ter events generated by a mother event. For a markedHawkes point process, the branching factor is computedby integrating the triggering kernel over time and takingthe expectation over the magnitude m . n = Z ∞ m min Z ∞ p ( m ) φ m ( τ ) dτ dm == Cθc θ , for β < α − θ > .n < subcritical regime , i.e., the instantaneousrate of events decreases over time and new events willeventually cease to occur (in probability); n > supercritical regime , i.e. each new event generates morethan one direct descendant, which in turn generates moredescendants, unless the network condition changes, thetotal number of events is expected to be infinity. Endogenous response A ˆ ξ The second quantity A ˆ ξ ,as defined in the main text Sec. 2.5, is the total numberof (direct and indirect) descendants generated from oneevent. 
In the main text, $A_{\hat{\xi}}$ is defined in the discrete form; however, it can also be defined as an integral over the continuous form of the impulse response, $A_{\hat{\xi}} = \int_0^{\infty} \hat{\xi}(t)\, dt$.

Although defined separately, $A_{\hat{\xi}}$ is closely related to the branching factor $n$: the initial exogenous event will generate $n$ events as first-generation direct descendants. Each of these events will generate an expected $n$ events ($n^2$ events in the second generation), each of these will in turn generate $n$ events ($n^3$ events in the third generation), and so on. Here $n^k$ is the average number of events in the $k$-th generation. This leads to an equivalent definition of $A_{\hat{\xi}}$:
$$A_{\hat{\xi}} = 1 + n + n^2 + \ldots + n^k + \ldots = \lim_{k \to \infty} \frac{1 - n^k}{1 - n} = \begin{cases} \dfrac{1}{1 - n}, & n < 1 \\ \infty, & n > 1 \end{cases}$$
$A_{\hat{\xi}}$ and $n$ emphasize different intuitions. We chose to visualize $A_{\hat{\xi}}$ in the endo-exo map because it has a direct correspondence to the sliced LTI system view in main text Eq. (7) and Fig. 2(b), and because $A_{\hat{\xi}}$ has better numerical resolution for the more viral videos, i.e., when $n$ is close to 1. In the main text, we obtain estimates of $A_{\hat{\xi}}$ by numerically summing $\hat{\xi}[t]$ in the main text Eq. (7) over 10,000 discrete time steps.

Crane and Sornette [3] showed that the Hawkes Intensity Process in a supercritical state could explain some rising patterns of popularity observed in social media. We note, however, that finite resources in the real world, such as collective human attention [10], are bound to be exhausted, and online systems cannot stay indefinitely in a supercritical regime. We argue that most online media items are affected by a continued interaction of exogenous stimuli and endogenous reaction (that may be sub- or supercritical), leading to a continued rise in popularity, or to multiple phases of rising and falling patterns.

Proof of main text Corollary 2.2. The Hawkes intensity process can be viewed as a system with one input, the exogenous stimuli rate $s(t)$, and one output, the event rate $\xi(t)$.
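As a numerical illustration of the geometric-series argument for $A_{\hat{\xi}}$ and of the input-output system view, the following Python sketch computes the discrete response to a single unit impulse and sums it. It is only a sketch under stated assumptions: the kernel is taken as $C(\tau + c)^{-(1+\theta)}$ for $\tau \geq 1$, the input is one unit impulse with $\mu = 1$, the parameter values are arbitrary, only 1,000 steps are summed to keep the pure-Python loop fast, and the names `impulse_response`, `A_T` and `n_T` (a truncated, discrete analogue of the branching factor) are introduced here for illustration.

```python
def impulse_response(C, theta, c, T):
    """Discrete response xi_hat[0..T] to one unit impulse at t = 0 (mu = 1 assumed)."""
    # kernel[tau] = C * (tau + c)^-(1 + theta) for tau >= 1; index 0 is unused
    kernel = [0.0] + [C * (tau + c) ** -(1.0 + theta) for tau in range(1, T + 1)]
    xi_hat = [1.0]  # the initial exogenous event itself
    for t in range(1, T + 1):
        # each past value re-enters the system through the memory kernel
        xi_hat.append(sum(kernel[tau] * xi_hat[t - tau] for tau in range(1, t + 1)))
    return xi_hat, kernel


# Arbitrary illustrative parameters (not fitted values from the paper)
C, theta, c, T = 1.0, 1.0, 2.0, 1000
xi_hat, kernel = impulse_response(C, theta, c, T)

A_T = sum(xi_hat)   # truncated endogenous response
n_T = sum(kernel)   # truncated discrete branching factor
print("A_T =", A_T, " 1/(1 - n_T) =", 1.0 / (1.0 - n_T))
```

In this discrete setting the truncated sum `A_T` never exceeds `1 / (1 - n_T)` when `n_T < 1`, and it approaches that value slowly because of the heavy tail of the power-law kernel, which is consistent with summing over many time steps in practice.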
The main text Corollary 2.2 states that the system $s(t) \to \xi(t)$ is a Linear Time Invariant (LTI) system. That is to say, the system has two properties:

Linearity, which states that the relation between the input and the output of the system is a linear map: if $s_1(t) \to \xi_1(t)$ and $s_2(t) \to \xi_2(t)$, then $a s_1(t) + b s_2(t) \to a \xi_1(t) + b \xi_2(t)$, $\forall a, b \in \mathbb{R}$. Linear scaling, $a s(t) \to a \xi(t)$, follows by multiplying both sides of Eq. (3) in the main text by $a$ and re-grouping terms. Additivity, $a s_1(t) + b s_2(t) \to a \xi_1(t) + b \xi_2(t)$, can be shown similarly.

Time invariance, which states that the response to a delayed input is identical and similarly delayed: if $s(t) \to \xi(t)$ then $s(t - t_0) \to \xi(t - t_0)$. We wish to show the following for Eq. (3) of the main text:
$$\xi(t - t_0) = \mu s(t - t_0) + C \int_0^{t} \hat{\tau}^{-(1+\theta)}\, \xi(t - t_0 - \tau)\, d\tau$$
After a change of variable $t' = t - t_0$, we can see that the LHS is $\xi(t')$. For the RHS, $\hat{\tau}$ remains unchanged, and the rest is:
$$\mu s(t') + C \int_0^{t' + t_0} \hat{\tau}^{-(1+\theta)}\, \xi(t' - \tau)\, d\tau$$
We split the integral into two parts, i.e., $(0, t')$ and $(t', t' + t_0)$:
$$\mu s(t') + C \int_0^{t'} \hat{\tau}^{-(1+\theta)}\, \xi(t' - \tau)\, d\tau + C \int_{t'}^{t' + t_0} \hat{\tau}^{-(1+\theta)}\, \xi(t' - \tau)\, d\tau$$
We note that $\xi(t)$ is a causal function, i.e., $\xi(t) = 0$ for $t < 0$, or equivalently $\xi(t' - \tau) = 0$ for $\tau > t'$. The second integral therefore vanishes, and the RHS becomes
$$\mu s(t') + C \int_0^{t'} \hat{\tau}^{-(1+\theta)}\, \xi(t' - \tau)\, d\tau$$
LHS = RHS follows from Eq. (30), so time invariance holds.

The main text Corollary 2.2 concerning the LTI property directly implies the following about Hawkes intensity processes, as illustrated in Fig. 2(b) of the main text.

• Additive effects from multiple sources of external stimulation: when applying two sources of excitation, the event rate of the resulting Hawkes intensity process is the sum of the rates generated by each source of excitation independently. This allows us to separately quantify the impact of each source.
• Scaling the expected event rate: if the exogenous stimuli scale up or down, the endogenous reaction will scale accordingly. In other words, if we can control the amount of exogenous promotions, we could boost or suppress the number of views for videos that respond to such promotions.

• Shifting in time: if the exogenous stimuli are shifted in time, so will be the views responding to them. In other words, we could schedule promotions (and subsequent views) for videos that respond to such promotions.

Proof of main text Lemma 2.3. The sliced fitting graph in Fig. 2(b) of the main text can be understood as an illustration of these three properties. In reality we observe the exogenous stimuli $s(t)$ as a discretized function (we denote the discrete time index as $[t]$) consisting of a series of impulses located at $\tau = 1, 2, \ldots, T$, i.e.
$$\sum_{\tau=0}^{T} s[\tau]\, \delta[t - \tau] \tag{32}$$
Directly following from the three properties, we can see that $\xi[t]$ is a superposition of the impulse response function $\hat{\xi}[t]$ scaled by $s[\tau]$ and shifted by the corresponding amount, i.e.
$$\xi[t] = \sum_{\tau=0}^{T} s[\tau]\, \hat{\xi}[t - \tau] \tag{33}$$
Eq. (33) is Eq. (8) in the main text Lemma 2.3.

This section describes some of the implementation and computational details for estimating the model in Eq. (30) from observed popularity and promotion histories.

In this section, we develop the calculation of the loss function defined in main text Eq. (6). For each video with observed $\{\bar{\xi}[t], \bar{s}[t], t = 1, \ldots, T\}$, we find an optimal set of model parameters $\{\mu, \theta, C, c\}$ and also estimate the unobserved external influence (parameters $\gamma$ and $\eta$). This is done by minimizing the squared error between the series $\bar{\xi}[t]$ and the model $\xi[t]$, $\forall t \in \{0, \ldots, T\}$. The corresponding optimization problem is as follows:
$$\begin{aligned}
\min_{\mu, \theta, C, c, \gamma, \eta} J &= \frac{1}{2} \sum_{t=0}^{T} \left( \xi[t] - \bar{\xi}[t] \right)^2 \\
&\overset{\text{Eq. (30)}}{=} \frac{1}{2} \sum_{t=0}^{T} \left( \gamma\, [t = 0] + \eta\, [t > 0] + \mu \bar{s}[t] + C \sum_{\tau=1}^{t} \xi[t - \tau]\, (\tau + c)^{-(1+\theta)} - \bar{\xi}[t] \right)^2 \\
&\text{s.t. } \mu, \theta, C, c > 0
\end{aligned} \tag{34}$$
Here $[\cdot]$ equals 1 when the enclosed condition holds and 0 otherwise. [...] $\xi[t - \tau]$: as we will show in Sec. 2.2, the objective function and its gradients are computed iteratively by estimating $\xi[\tau]$ from $\xi[1], \ldots, \xi[t - 1]$ [...] $\xi[t - \tau]$ rather than observations $\bar{\xi}[t - \tau]$, as we would like the model to reproduce the whole observed time series, rather than to predict the next point given the observed history. As will be discussed in Sec. 2.3, we further improve fitting stability by adding an $L_2$ regularizer to the objective function.

2.2 Computing gradients

Eq. (34) is a non-convex objective; we use a gradient-based optimization approach, specifically L-BFGS [9] with pre-supplied gradient functions. We use the implementation supplied with the NLopt package [7] in R. We fit each video in parallel, starting with multiple random initializations to improve solution quality, and we present the solution with the lowest error function $J$. The gradient computations are listed as follows.

We define the error term as $e[t] = \xi[t] - \bar{\xi}[t]$; Eq. (34) now becomes $J = \frac{1}{2} \sum_{t=0}^{T} e^2[t]$. Since $\bar{\xi}[t]$ are observed quantities, $\frac{\partial e[t]}{\partial var} = \frac{\partial \xi[t]}{\partial var}$, where $var \in \{\mu, \theta, C, c, \gamma, \eta\}$. Using the chain rule, we obtain:
$$\frac{\partial J}{\partial var} = \sum_{t=0}^{T} e[t]\, \frac{\partial \xi[t]}{\partial var} \tag{35}$$
Specifically, we compute the following partial derivatives and use them in Eq. (35) to compute the gradient.
$$\frac{\partial \xi[t]}{\partial \mu} = \begin{cases} \bar{s}[t] + C \sum_{\tau=1}^{t} \dfrac{\partial \xi[t - \tau]}{\partial \mu}\, (\tau + c)^{-(1+\theta)}, & t > 0 \\ \bar{s}[0], & t = 0 \end{cases} \tag{36}$$
For $t > 0$,
$$\begin{aligned}
\frac{\partial \xi[t]}{\partial \theta} &= C \sum_{\tau=1}^{t} \left[ \frac{\partial \xi[t - \tau]}{\partial \theta}\, (\tau + c)^{-(1+\theta)} + \xi[t - \tau]\, \frac{\partial}{\partial \theta} (\tau + c)^{-(1+\theta)} \right] \\
&= C \sum_{\tau=1}^{t} \left[ \frac{\partial \xi[t - \tau]}{\partial \theta} - \xi[t - \tau] \ln(\tau + c) \right] (\tau + c)^{-(1+\theta)}
\end{aligned} \tag{37}$$
and for $t = 0$, $\frac{\partial \xi[0]}{\partial \theta} = 0$.

For $t > 0$,
$$\frac{\partial \xi[t]}{\partial C} = \sum_{\tau=1}^{t} \left[ C\, \frac{\partial \xi[t - \tau]}{\partial C}\, (\tau + c)^{-(1+\theta)} + \xi[t - \tau]\, (\tau + c)^{-(1+\theta)} \right] \tag{38}$$
and for $t = 0$, $\frac{\partial \xi[0]}{\partial C} = 0$.

For $t > 0$,
$$\frac{\partial \xi[t]}{\partial c} = C \sum_{\tau=1}^{t} \left[ \frac{\partial \xi[t - \tau]}{\partial c}\, (\tau + c)^{-(1+\theta)} - (1 + \theta)\, \xi[t - \tau]\, (\tau + c)^{-(2+\theta)} \right] \tag{39}$$
and for $t = 0$, $\frac{\partial \xi[0]}{\partial c} = 0$.
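As an illustration of the recursive structure of these derivatives, the Python sketch below evaluates $\xi[t]$ together with $\partial \xi[t] / \partial \mu$ (Eq. 36) and $\partial \xi[t] / \partial C$ (Eq. 38), assembles the gradient via Eq. (35), and compares it against central finite differences. It is a sketch under assumptions, not the authors' R implementation: $\gamma$ and $\eta$ are set to zero, the toy series `s_bar` and `xi_bar` are made up, and all function names are hypothetical.

```python
def forward_with_grads(mu, theta, C, c, s_bar):
    """Return xi[t], d xi[t]/d mu and d xi[t]/d C for t = 0..T (gamma = eta = 0 assumed)."""
    T = len(s_bar) - 1
    # w[tau] = (tau + c)^-(1 + theta) for tau >= 1; index 0 is unused
    w = [0.0] + [(tau + c) ** -(1.0 + theta) for tau in range(1, T + 1)]
    xi, d_mu, d_C = [], [], []
    for t in range(T + 1):
        past = sum(xi[t - tau] * w[tau] for tau in range(1, t + 1))
        xi.append(mu * s_bar[t] + C * past)
        # Eq. (36): reuses earlier entries of its own derivative series
        d_mu.append(s_bar[t] + C * sum(d_mu[t - tau] * w[tau] for tau in range(1, t + 1)))
        # Eq. (38): product rule on C * sum(xi[t - tau] * w[tau])
        d_C.append(past + C * sum(d_C[t - tau] * w[tau] for tau in range(1, t + 1)))
    return xi, d_mu, d_C


def loss_and_grad(mu, theta, C, c, s_bar, xi_bar):
    """Squared loss J and its partial derivatives w.r.t. mu and C via Eq. (35)."""
    xi, d_mu, d_C = forward_with_grads(mu, theta, C, c, s_bar)
    e = [x - y for x, y in zip(xi, xi_bar)]
    J = 0.5 * sum(v * v for v in e)
    g_mu = sum(a * b for a, b in zip(e, d_mu))
    g_C = sum(a * b for a, b in zip(e, d_C))
    return J, g_mu, g_C


def finite_diff(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Made-up toy data and parameter values, for illustration only
mu, theta, C, c = 2.0, 0.5, 0.3, 1.0
s_bar = [3.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]
xi_bar = [7.0, 2.0, 3.0, 1.5, 1.0, 5.0, 2.0, 1.0]

J, g_mu, g_C = loss_and_grad(mu, theta, C, c, s_bar, xi_bar)
fd_mu = finite_diff(lambda m: loss_and_grad(m, theta, C, c, s_bar, xi_bar)[0], mu)
fd_C = finite_diff(lambda k: loss_and_grad(mu, theta, k, c, s_bar, xi_bar)[0], C)
print("dJ/dmu:", g_mu, "finite difference:", fd_mu)
print("dJ/dC: ", g_C, "finite difference:", fd_C)
```

A finite-difference comparison of this kind is a common sanity check before handing analytical gradients to a quasi-Newton optimizer; the derivatives with respect to $\theta$, $c$, $\gamma$ and $\eta$ follow the same pattern.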
For the unobserved external stimuli $\gamma$ and $\eta$: for $t > 0$,
$$\frac{\partial \xi[t]}{\partial \gamma} = C \sum_{\tau=1}^{t} \frac{\partial \xi[t - \tau]}{\partial \gamma}\, (\tau + c)^{-(1+\theta)} \tag{40}$$
and for $t = 0$, $\frac{\partial \xi[0]}{\partial \gamma} = 1$.

For $t > 0$,
$$\frac{\partial \xi[t]}{\partial \eta} = 1 + C \sum_{\tau=1}^{t} \frac{\partial \xi[t - \tau]}{\partial \eta}\, (\tau + c)^{-(1+\theta)} \tag{41}$$
and for $t = 0$, $\frac{\partial \xi[0]}{\partial \eta} = 0$.

Note that the gradient computation is iterative, i.e., the computation of $\frac{\partial \xi[t]}{\partial var}$ makes use of previous values in its own series $\frac{\partial \xi[\tau]}{\partial var}$ for $\tau = 1, \ldots, t - 1$.

2.3 $L_2$ regularizer

We add $L_2$ regularization on the linear coefficients of the Hawkes Intensity Process to avoid overfitting. The loss function with the regularization terms is as follows:
$$J_{reg}(\omega, \mu, \theta, C, c) = J(\mu, \theta, C, c) + \omega \left( \left( \frac{\gamma}{\gamma_0} \right)^2 + \left( \frac{\eta}{\eta_0} \right)^2 + \left( \frac{\mu}{\mu_0} \right)^2 + \left( \frac{C}{C_0} \right)^2 \right) \tag{42}$$
Here $(\gamma_0, \eta_0, \mu_0, C_0)$ are reference values for the parameters, obtained by fitting the series $\bar{\xi}[t]$ without regularization. The reference values are used to normalize the parameters in the regularization process, so that they have equal weights. Intuitively, using $L_2$ regularization with a squared loss is effectively putting a Gaussian prior on the parameters being regularized. We desire parameters $c$ and $\theta$ to take values away from zero, hence they are not regularized. The $L_2$ regularization term is differentiable with respect to the variables $(\gamma, \eta, \mu, C)$, and the terms $\omega\gamma$, $\omega\eta$, $\omega\mu$ and $\omega C$ (up to the constant factors introduced by the normalization in Eq. (42)) are added respectively to the RHS of Eq. (40), (41), (36) and (38).

The regularization parameter $\omega$ is expressed as a percentage of $J_0$ (the value of the non-regularized error function) and it is determined through a line search within $[10^{-\ldots} J_0,\ J_0]$ in log-scale. $\omega$ is tuned per video, on a temporally held-out tuning sequence, i.e., we use the first 75 days of observed popularity for parameter estimation, the next 15 days for tuning $\omega$, and days 91-120 for forecasting popularity.

It is informative to discuss the properties of the model estimation procedure above in terms of model properties and the optimization procedure.

The Hawkes intensity model in Eq.
(30) is a non-linear integral equation. It is worth noting that there are two non-linear parameters, $\theta$ and $c$, and that the rest are linear parameters: $\mu$ and $C$, as well as the unknown external stimuli $\eta$ and $\gamma$. Given $\theta$ and $c$, the loss function in Eq. (34) is convex, and the optimization procedure converges to a global optimum. Assuming a set of fixed (or known) non-linear parameters for the whole dataset is therefore convenient for fast estimation, and is used in recent literature such as Zhao et al. [16]. On the other hand, our own recent study [11] shows that, in addition to better interpretability, there is a performance advantage in estimating both the linear and the non-linear parameters of Hawkes point processes. Therefore we estimate all of the linear and non-linear parameters for the Hawkes intensity process.

The procedure for minimizing the squared loss in Eq. (34) uses a standard gradient-based non-linear continuous optimization routine. The procedure will converge to a local minimum of the loss function when it terminates. We implement standard random restarts to improve the solution quality, i.e., we perform the optimization from 8 different random starting points for each video and choose the one with the lowest loss as the final result. On the theoretical end, there are no known results on convergence rates as a function of sequence length (or sample size) for this class of models. In practice, the primary limiting assumption is that the model is stationary (and fixed) over time and over different parts of the activated online social network.

Setting up the Twitter crawler. We construct a "Tweeted Videos" dataset using the data APIs from both Twitter and YouTube. We stream tweets from the Twitter API using a set of keywords related to YouTube and its videos: "youtube" OR ("youtu" AND "be").
The Twitter filter API returns the tweets for which the keywords were matched in at least one of the considered fields, including the textual description and the expanded_url field. The Twitter API claims that the expanded_url field contains the original URL of all URLs shortened using shortening services (such as bit.ly). While this happens in most cases, we found that a non-trivial number of URLs remain shortened. In these cases, the YouTube video ID is hidden, if the URL is a link towards a YouTube video at all. Expanding these links ourselves is unfeasible, given our network and service constraints. One noteworthy exception is YouTube's own shortening service (youtu.be), which readily contains the video ID. It is for this reason that we added "youtu" AND "be" to the filter keywords.

This returns over 5 million matched tweets per day after URL expansion and tokenization performed by Twitter, most of which mention and link to a YouTube video. The raw dataset used in this study spans 2014-05-29 to 2014-12-26 and contains 1,061,661,379 tweets in total. From each of these tweets, we extracted the associated YouTube video ID (only the first in case multiple videos were referenced in the same tweet), resulting in 81,915,174 distinct videos in total.

Setting up the YouTube crawler. From YouTube.com, we obtained for each video its metadata, including the upload date and video category, as well as the time series consisting of the daily number of views and shares, from the day of upload until the date of crawling. We aggregate for each video the number of tweets it receives every day, and thus obtain three attention-related time series for each video: views[t], shares[t] and tweets[t], where t indexes time with a unit of one day. For the number of views and shares, the time range is from the video's upload date to the date of data collection (i.e., February-March 2015). For the number of tweets, the time range is from the video's upload date or 2014-05-29 (whichever is later) to 2014-12-26.
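The extraction and aggregation steps described above can be sketched in Python as follows. This is a hypothetical illustration rather than the crawler used in the study: the input record format (a date string plus a list of expanded URLs per tweet), the function names, and the example video IDs are all assumptions made for the example, and only the `youtu.be/<id>` and `youtube.com/watch?v=<id>` URL forms are handled.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def youtube_video_id(url):
    """Return the video ID from a youtu.be or youtube.com/watch URL, else None."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # YouTube's own shortener carries the ID directly in the path
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host in ("youtube.com", "m.youtube.com") and parsed.path == "/watch":
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None  # e.g. a link that is still shortened by a third-party service


def daily_tweet_counts(tweets):
    """tweets: iterable of (date_string, [urls]); returns Counter keyed by (video_id, date)."""
    counts = Counter()
    for day, urls in tweets:
        for url in urls:
            video_id = youtube_video_id(url)
            if video_id is not None:
                counts[(video_id, day)] += 1
                break  # keep only the first video referenced in a tweet
    return counts


# Hypothetical tweet records, for illustration only
tweets = [
    ("2014-05-29", ["https://youtu.be/abc123XYZ_0"]),
    ("2014-05-29", ["http://bit.ly/xyz", "https://www.youtube.com/watch?v=abc123XYZ_0&t=10s"]),
    ("2014-05-30", ["https://www.youtube.com/watch?v=def456UVW_1", "https://youtu.be/abc123XYZ_0"]),
]
counts = daily_tweet_counts(tweets)
print(counts)
```

Grouping the resulting counts by video ID and ordering them by date would give a daily tweets[t] series of the kind described above.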
5Mo and Active datasets

We constructed two cleaned data subsets from the feed of tweeted videos, in order to collect basic data statistics and estimate the model.

• The 5Mo dataset was constructed to have videos whose popularity history is at least 60 days long, and is used for forecasting popularity. We narrow down the time-frame of video upload to between 2014-05-29 and 2014-10-24 in order to have a long enough history. There are 16,417,622 videos with publicly-available popularity history. We did not obtain the popularity history for more than half of the videos; reasons for such data loss include: a video is no longer online, a video's popularity history is not publicly available, or requests resulted in web server errors. This large and diverse sample allows us to estimate the background statistics of video views, tweets, and shares, as will be discussed in the next subsection.

• The Active dataset selects videos uploaded between 2014-05-29 and 2014-08-09, and which have received [...] Active dataset. It is noteworthy that the largest 4 categories cover more than 70% of all the videos in Active, with more than 25% of the videos being Music. We removed 6 content categories (i.e., Autos & Vehicles, Travel & Events, Pets & Animals, Shows, Movies, Trailers), each containing less than 1% of the videos in the dataset. Their corresponding videos were also removed. The resulting dataset contains 13,738 videos.

Table 1: Number of videos broken down by category in the Active dataset. Music represents a significant proportion (25%) of all the videos.

It is well known that network measurements such as the number of views follow a long-tailed distribution. To facilitate discussions about popularity and attention, we propose to quantify popularity on an explicit percentage scale, with 0.0% being the least popular, and 100% being the most popular.
Videos are grouped into popularity bins by the view counts that they receive at $t$ days after upload, and each bin $\xi_t(k)$ is marked with its maximum popularity percentile: videos in bin $k$ are at most among the top $k$% popular with age $t$. In this work we use 40 evenly spaced bins, i.e., $k = 2.5, 5, 7.5, \ldots, 100$.

Fig. 8(a) and (b) contain boxplots of video view counts (in log-scale) for each bin after 30 and 60 days, respectively. We can see the long-tailed distribution of popularity in YouTube reflected here: videos in the less popular bins have a very similar number of views, e.g., the first 6 bins, or 15% of the videos, all have fewer than 10 views; [...] videos in each of the middle bins (e.g., $k = 17.5, \ldots$) [...] ($k = 97.5$, $k = 100$) videos span almost two orders of magnitude. For videos in 5Mo at 30 and 60 days after upload, the shape of the overall popularity scale remains the same, with a slight increase in the dynamic range of views (top of the last boxplot). The popularity scale of Active is very similar to the one presented in Fig. 8(a) and (b), the only notable difference being the number of views corresponding to each bin. Active is a subset of the most popular videos, as shown by Fig. 7: the videos in Active are positioned in the top 5% popularity percentiles of 5Mo ($k = 97.5$ and $k = 100$).

Figure 7: Positioning of the Active dataset on the popularity scale of the 5Mo dataset (at 30 days after upload). The horizontal axis shows the popularity percentiles in the 5Mo dataset, while the vertical axis shows the corresponding frequency of videos in Active. Visibly, Active is a subset of the most popular videos in 5Mo.

In Fig. 8(c) we explore the change of popularity of each video from 30 days (y-axis) to 60 days (x-axis).
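To make the binning and the 30-to-60-day comparison concrete, the following Python sketch assigns each video to one of 40 bins labeled by the bin's upper percentile edge, then compares the bins at two ages. It is a sketch under assumptions: ranking is by view count with ties broken by input order, the synthetic view counts are made up, and the function name `popularity_bins` is hypothetical.

```python
def popularity_bins(views, n_bins=40):
    """Label each video with the upper percentile edge (2.5, 5, ..., 100) of its bin."""
    n = len(views)
    order = sorted(range(n), key=lambda i: views[i])  # least popular first
    bins = [0.0] * n
    for rank, i in enumerate(order, start=1):
        idx = (rank * n_bins + n - 1) // n  # integer ceiling of rank * n_bins / n
        bins[i] = idx * 100.0 / n_bins
    return bins


# Synthetic view counts for 80 videos at 30 days; at 60 days every video gains
# one view, except video 5, which gains many (made-up numbers).
views_30 = list(range(80))
views_60 = [v + 1000 if v == 5 else v + 1 for v in views_30]

bins_30 = popularity_bins(views_30)
bins_60 = popularity_bins(views_60)

# Pairs (bin at 30 days, bin at 60 days) for the videos whose bin changed
moved = [(b30, b60) for b30, b60 in zip(bins_30, bins_60) if b30 != b60]
print(moved)
```

In this toy example one video jumps to the top bin while some of the videos it overtakes slip down by one bin, which mirrors the two kinds of movement discussed for Fig. 8(c).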
Note that most videos retain a similar rank (in the boxes along the 45-degree diagonal line), or have a slight rank decrease as they are overtaken by other videos (slightly above the diagonal in the plot). No outliers exist in the upper-left part of the graph, since a video cannot lose view count that it has already gained. Most notably, we can see that videos from any bucket can jump to the top popularity buckets between 30 and 60 days of age, such as the outliers for the few boxes on the far right. This phenomenon elicits important questions: how did these videos go viral, and is it related to external promotions?

In this section, we provide additional observations on the parameters of the HIP model, supplementing the analysis presented in the main text Sec. 4. Specifically, we relate the distribution of specific parameter values, such as the memory exponent or the exogenous sensitivity, to video groups (channels, content categories) and to a video's popularity.

[Figure 8: (a) Popularity 2.5% percentiles at 30 days (x-axis: popularity percentile; y-axis: views). (b) Popularity 2.5% percentiles at 60 days (x-axis: popularity percentile; y-axis: views). (c) Evolution of popularity percentiles (x-axis: popularity perc. at 60 days; y-axis: popularity perc. at 30 days).]
ll l lll ll lll l l llll llll ll l ll ll l ll ll l ll l lll llll l lll lll lll lll l llll ll l ll ll l ll ll l l l ll ll ll l ll ll l lll ll ll lll ll ll ll ll lll ll lll ll ll l ll ll lll l ll l lll l l ll lll l ll ll ll l ll ll ll ll l l ll l l ll l ll l lll lll llll ll lll l lll l l l lll lll l l ll l ll l l ll l ll lll ll lll ll ll ll l ll lllll l lll l l l lll ll lll l l llll ll lll lll l ll l l lll ll lll ll l lll ll lll ll ll l ll l lllll ll l l lll l ll l l ll l lll l ll l l lll lll ll ll l ll ll ll l l l l lll l ll ll lll ll llll l lll l llll ll l lll ll ll ll ll l ll ll ll ll l ll l llllll ll ll lll ll ll lllll ll l lll l ll l l l l lll l l llll ll ll ll l lll l ll l ll lll l lllll ll l ll lll ll ll ll ll lll l ll ll ll ll l ll lll l ll ll ll ll ll lll l l llll l ll ll lll ll ll lll l ll l ll l ll llll llll l l l ll lll l llll l ll llll lll ll lll ll l lll l l l ll l ll lll ll ll lll ll l l ll ll lll lll l l ll l l ll lll ll l l lll lll l ll llll llll l ll lllll ll l ll ll ll llll lllll lll l l ll l ll ll ll llll ll ll l ll ll ll ll lllll ll lll lll ll l l ll ll l l lll l lll ll ll lll ll l lll l ll l ll l l ll l l llll ll ll ll ll ll ll ll lll l l ll ll ll lllll ll l ll l l lll l ll ll l l ll llll l ll lll l l ll l ll llll l ll ll ll l ll l l ll l ll ll ll ll l ll ll lll ll ll ll l l ll ll l ll ll llll ll lll ll ll ll l ll l lll l l ll l ll l ll ll ll ll lll l l l l l lll l lll ll ll ll ll lll l ll ll ll ll ll ll l ll ll ll ll ll llll ll lllll lll ll llll ll ll ll lll lllll l ll ll ll l lll ll lll l llll l ll ll ll l lll ll ll ll l ll ll llll l lllll ll ll ll l lll l ll ll ll ll ll ll l llll l ll ll ll ll lll llll ll ll ll ll l l l llll llll lllll lll ll lll ll ll l ll l ll l l llll llllll l l ll l lll l lll l ll ll ll l l ll l ll lll l ll ll l lll l ll ll ll l lll lll l l ll ll llll ll ll ll ll l l ll l lll lll ll ll l l lll l ll l ll l ll l ll lll l l lll lll l ll l ll lll l ll ll ll lll ll ll ll l ll lll l ll lll lll l ll lll lll ll ll l l llll llll 
llll l ll lll llll l ll ll ll l ll llll l lll ll lll ll ll ll l l ll ll ll l l ll l ll l lll l llll l ll l llll l l ll l l l ll l ll l lll llll ll ll ll ll lllll lll llll ll ll lll l lllll lll l lll l lll l ll lll llll ll ll ll l lll l l ll ll ll l ll lll lll l ll ll l lll l ll l ll ll ll ll lllll l lll llll ll ll ll ll ll lll l lll l ll lll ll ll lll l ll ll lll lll ll lll l ll ll l lll lll l lll l l ll llll ll lll ll ll llll ll ll l l ll ll l lllll ll llll lll lll l lll ll ll l ll lll ll ll lll l lll llll l ll ll l l l ll ll l l ll l ll ll l lll ll lll l ll ll ll ll l l l ll ll l lll l lll ll l lll l ll l ll ll ll ll ll lll lll ll lll l ll lll l ll l llll l llll l ll llll lll ll l ll ll ll l llll ll lll ll l ll lllll l llllll l l l ll ll ll ll ll ll llll ll llll l ll l l l ll ll ll l ll ll ll ll ll llll ll llll l ll ll l lll ll ll lll ll l lll lll l ll lll l ll l lll l lll ll lll l llll l l lll l ll lllll l l l llll ll ll lll ll ll ll ll l ll ll l l ll l llll ll l ll l lllll l lll llll ll llll l l lll ll ll l lll ll l l ll lll lll l ll l lll ll l ll l lll lll llll lll ll l l ll l lll ll lll ll ll ll l lll l ll ll ll ll ll ll llll l ll ll l lll ll l ll l ll ll llll ll l l l lll l ll lll lll l ll ll llll l l ll ll lll l l l llll l lll lll ll lll ll ll ll lll ll l ll l ll llll ll l l l lll ll ll ll l ll l lll l ll ll ll ll ll ll ll lll ll ll ll l lll l ll ll l l l ll ll ll ll ll lll ll lll ll l ll lll lll ll ll l lll ll l l lll lll l lll l l ll ll l ll ll lll ll lll llll llll ll l lll llll llll ll l l lll ll lll ll ll l ll ll ll ll ll ll ll l ll l ll ll l llll l ll lll l l ll l l lll ll ll lll ll l ll l l ll llll ll l l ll lll llll lll ll lll lll ll l lll l l l ll l l ll l ll l lll l ll lll ll l lll ll lll llll ll ll lll l l ll llll l lll l ll ll lll ll l ll ll ll l lll lll l llll l lll llll ll ll ll l lll ll l l l ll l l lll lll ll ll ll ll l l l ll l llll ll lll lll llll l l lll ll l lllll lll ll ll ll l ll ll lll ll ll lll l llll l lll ll llll l ll l lll l llll 
lll l lll ll ll l l l lll ll ll l ll ll lll l ll l lllll ll lll ll ll ll l ll ll l lll ll l l ll llll ll llll l l ll l ll l l lll l ll lll ll ll l lll ll ll ll ll lll l ll l l ll lll l l ll l ll l llll ll ll l ll ll llll l ll ll lll l ll lll l ll ll lll ll ll l l lll ll ll ll ll l lll ll l l ll lll ll ll ll l llll l ll lll ll l l ll l ll l l l ll lll ll lll ll lll l ll ll ll llll ll ll ll ll llll lll ll ll lll ll lll l lll l l ll ll l lll ll l lll lll l lll llll l l l ll lll lll l l lll ll ll lllll lll ll ll l lll ll l lll l l ll l lll ll l ll llll l lll ll llll l l l lll l ll lllll lll lll l ll l l ll ll l llll llll l lll ll lll ll ll ll ll ll l ll lll lll l l ll l ll ll lll ll l ll l llll ll ll ll ll ll llll lll ll l ll llll l lll l lll l ll ll l ll l ll lll lll lll lll ll ll l ll llll ll lll l llll ll lllll ll lll l ll ll ll ll l ll lll lll ll l l ll l ll ll lll l lll lll lll llll ll lll ll lll l l ll lll ll l lll ll ll l lll lll ll ll llll l l lll l l ll ll ll ll lll llll ll l ll lll ll ll ll ll llll lll ll llll l llll ll l ll llll ll l lll ll ll l ll l l l ll l ll lllll l l l ll l lll l ll ll l ll lllll lllll ll lll l l llll l l llll l lll lll l l ll lll l l ll lll lll ll ll ll l ll lll l ll lllll lll ll l lll lll l l ll ll l ll ll lll l lll ll lll ll ll ll ll ll l lll lll l ll llll ll lll llll l lll lll l lllll l ll lll ll lll l ll ll l lll lll ll ll l lll l lll llll ll ll l ll l lll ll ll l ll l llllll l l l l lll l ll lll ll l llll lll ll l ll ll lll lll l l ll ll ll l lll lll lll l l llll l l ll l llll llll l l lll lll l lll l lll llll lll lllll lll ll lll lll lllll l lll l lll l lll ll ll l lllll l l llll l ll l l ll ll l l lll ll lll l ll l ll ll l ll lll ll lll llll l lll l l lll l l ll llll ll lll l ll lll lll l ll ll ll l ll ll ll ll lll l llll ll ll l ll ll l l ll lll l ll l ll lll ll l lll l lll ll llll lll lll l ll lll lll llll l ll ll l lll lll ll l ll ll ll l ll llll ll l ll ll l lll l ll l ll l lll l l lll ll ll l ll lll ll l lll ll ll lll l l 
ll l l ll l ll ll l l ll l lll lll llll ll ll ll lll ll ll lll l lll l ll l ll ll ll l ll ll ll ll ll ll l ll l ll l lll llll l ll lll l ll ll ll ll l ll ll l l ll llll llll llll l ll lll ll ll ll ll ll lll l l ll l l ll l l ll ll ll lll ll l l ll ll lll lll l l ll ll ll ll llll l l ll l ll l l lll l l lll ll llll l lll l l ll l llll l l lll ll ll l ll lll l lll ll l l l ll llll l llll l ll ll l ll lll llll l lll l l ll llll ll lll ll ll ll llllll ll l lll ll ll lll ll l l ll l ll l ll lll ll ll l ll lll lll ll l ll l ll ll lll l ll l lll l l l ll ll l ll ll l lll l lll ll ll ll lll l llll lll lllll l ll ll lll lll ll ll ll ll ll l l ll ll ll l lll l ll ll l ll ll lll lll ll ll l ll ll lll l ll ll l l ll lll ll ll llll llll llll ll ll l lll ll ll ll ll l llll ll ll l ll l ll lll lll lll l lll ll lll ll lll lll ll ll ll l ll ll ll ll lll l l l llll l l lllll l l lll l ll lll lll ll l lll ll lll lll lll l ll l lll llll l ll l llll l lll l lll l lll ll lll ll l ll ll l ll l ll l l ll lll lll lll l ll ll ll l ll ll lll lll ll ll ll ll llll l ll l l lll l ll lll ll ll ll ll lll l ll ll lll l l l ll llll lll lll l lll l llll lll l llll ll ll l lll lll lll l ll ll ll ll l l ll ll lllll lll lllll ll l l ll l llll ll l ll ll l ll ll lll l lll l ll ll lll l ll ll ll ll lllll llll ll ll l ll llll lll l ll llll lll lll l lll l lll ll l lll ll ll l ll lll ll lll ll ll lll ll ll l lll ll l ll l ll lll llll lll lll llll ll l ll l ll ll l ll l lll llll ll ll ll ll llll lll l lll llll lll ll l ll l lll lll ll ll l ll l ll lllll ll lll llll ll ll l llll ll llll ll ll l llll ll l l ll llll ll ll ll lll ll lll l l ll ll l ll lll lll l l ll l l lll lll l l ll ll lll lll llll l l ll l lll l ll ll ll ll l lll ll ll ll ll l l l llll l ll ll lll lll l ll ll l llll ll lll lll l ll lll ll ll ll ll ll lll ll l ll ll l l lll l ll lll l ll l ll l l ll lll lll l ll l l lll ll ll l l l ll ll lll ll lllll l ll lll l ll ll l l ll l lll lll ll lll ll ll ll lll ll lll lll lll ll lll lll l llll l l 
lll ll l llll ll ll ll lll l ll lll l ll ll ll llll ll ll lll ll ll ll ll lll lll ll ll ll lll l lll l ll ll l ll llll l l ll ll ll lll l ll l llll l l ll lll l lll lll l l lll ll l l lll ll ll l lll ll l ll l l lllll l l lll ll ll ll lll l l ll l l ll ll l lllll l ll ll l lll ll l ll l lll ll lll ll ll llll l lll l lll l lll lll ll l ll ll ll ll lllll l ll lll llll lll l ll l llll ll lll llll ll llll llll l llll lll ll l ll l l l lllll l lll ll lll l lll ll l lll ll ll ll ll lll l l l ll l ll ll l ll ll ll ll ll ll lllll l ll l l ll lll l ll ll l lll l ll l l llll lll ll ll lll ll l ll l lll l lll ll ll l ll lll l ll l ll lll ll lll l ll ll ll ll l llll lll ll ll ll ll ll ll lll lll l ll llll ll ll ll ll ll l l ll lll ll l ll ll lll l lll l l ll ll ll l lll ll l ll ll l l ll ll lll ll llllll l lll ll ll ll ll l l l ll l ll llll lll l ll l ll l ll ll l lll l lll lll lll ll l l ll l ll lll llll ll ll lll ll l lll ll ll ll lll lll l l l llll lll llll l ll lll lll l l ll l ll ll l l lll lll llll l l ll lll l l ll lll ll llll l ll ll lll ll l lll l l lll ll ll lllll ll lll l l ll llll ll llll l ll llll l ll ll l ll ll l ll ll l llll l l ll ll ll lll l l ll ll lll l lll lll l ll l l ll l ll ll lll llll ll l ll ll lll lll ll ll l l ll lll llll ll lll ll ll l ll l lll l ll l l ll ll ll ll ll l ll ll ll ll l lll l ll l l lllll llll l ll l lll lll lll l ll ll ll lll l l l lllll lll l lll ll lll lll ll l l ll l ll lll ll l l ll ll llll llll lll lll ll l lll l l ll ll ll ll l ll ll ll ll lll ll ll lll l lll ll llll l ll ll l ll lll ll llll l l ll lll llll l ll ll l llll ll l lll lll lll ll ll l ll l ll ll l ll ll l ll lll llll l ll l ll ll llll ll lll ll ll lll l ll ll ll l ll l llll lll ll ll l ll llll ll lll lll ll ll ll lll l l ll ll l ll ll ll ll l ll lll ll ll lll ll ll l lll llll l ll ll lll ll ll l ll l ll l ll l lll ll ll ll l ll ll l ll l ll l l l llll lll lll lll ll l lll l ll ll ll lll l ll l lll ll l lllll lll ll ll l l ll ll ll l ll l l llll lll lll l lll ll lll 
l ll lll l ll ll ll ll l ll lll lllll l ll ll ll lll llll l ll ll l l llll ll l ll l lll ll ll ll ll lll l ll l lll l ll l l ll l lll l ll lll llll l lll ll lll lll l llll l ll lll l lll lll lll ll lll l ll lll lll lll lll l ll llll l ll ll lll l ll ll l ll ll llll l l ll l lll ll lll ll ll ll lll lll llll ll lllll l l l llll lll ll ll ll lll lll l l ll ll l llll ll ll ll ll lll l ll ll l l ll ll l ll l ll lll lll l ll lll l lll l ll ll ll ll ll lll ll ll lll ll l lll lll lll l ll ll lll ll l ll l lll ll ll ll ll ll ll ll ll l ll ll llll ll ll lll lll ll ll l l l ll lll ll ll lll ll l l ll l ll ll lll l ll lll lll lll ll l lll ll llll l llll l l ll ll l l llllll lll ll ll l lll ll l ll lll l l llll ll llll lll lll llll ll l lll ll l ll ll ll llll ll lll l lll l lllll llll lll l llll ll lll lll lll l l ll ll ll l lll l lll ll l lll lll l ll l lll l ll l lll l lll ll lll lll l l l lll ll l l ll l lll ll ll lll lll l ll ll llll l lll ll ll l lll ll l llll ll llll lll l l l ll ll l lll lll lll lll ll l ll ll llll ll lll l ll lll ll l ll llll llll l ll ll l ll ll l ll l lllll ll l lll lll lll llllll ll ll ll ll lll llll l l ll l l ll l ll lll lll lll ll ll l llll ll lll ll ll ll l ll ll llll ll l lllll ll ll ll ll l ll ll ll lll ll lll lll l lll lll l ll ll lll ll ll l ll llll lll l lll ll llll ll l lll ll lllll l l lll l lll ll lll lll ll lll ll ll lll ll ll l lll ll ll lll ll ll l l lll l llll llll l l lll lll ll ll lll l lll l llll ll l llll ll l ll lll ll lll ll l ll l lll ll ll llll l ll ll l llll l lll ll l lll l ll l l ll l l l ll l ll ll l ll lll ll l ll lll lll ll ll lll ll l lll l l l l ll l ll lll ll lllll ll ll ll ll lll l llll l ll lllll l lllll lll lll lll l lll ll ll ll l ll lll l llll ll ll ll l llll l ll lllll ll ll l lll ll l lll l ll ll ll ll lll l l lll ll ll l l ll llll ll ll ll ll lll ll lll l llll l ll ll l ll lll lllll ll ll l ll l ll lll l ll ll l lll ll ll ll lll l l ll ll l ll l ll l lll lll l ll ll ll ll l l llll l ll llll ll ll ll ll l ll l 
ll ll ll lll lll l lll l l lll ll ll ll l l ll ll ll ll ll llll ll ll l ll ll l l l llll ll ll l l ll ll ll ll l ll lll ll ll l ll ll l lll lll l ll l ll ll ll lll l lll l lll ll l l l ll ll llll ll ll l l lll l lll lll lll ll l lll ll l l llll l ll ll lll l ll ll ll l ll ll l lll ll lll l lll l ll l lll ll ll l ll ll ll l llll lll ll ll l lll l l lll l lllll ll ll ll ll ll l lll l ll ll ll ll l l ll ll lll l ll l ll ll llll lll l ll l ll ll ll ll lll ll l ll ll lll ll lll ll ll ll lll l ll l l ll l l lll ll l llll l ll l ll ll ll ll ll l lll ll lll l llll l lll l llll ll l lll llll l ll l ll lll l l lll ll ll ll l lll ll ll llll l lll ll l ll l l ll l lll ll lll lll l l lll ll l ll l ll ll ll ll l l ll lll l ll lllll l l ll ll ll llll l ll lll l ll lllll lll l llll lll ll ll lll ll l ll l ll l l l l lll ll ll ll ll ll ll ll ll llll l ll l l lll ll ll l lllllll ll l lll l l ll l lll ll ll l l ll l ll ll ll lll ll ll ll lll l ll lll ll lll lll l ll l lll ll llll lll ll ll l ll ll l lll ll l ll ll l ll ll ll l ll ll lllll ll l ll ll llll l lll ll ll l l ll l ll ll ll ll llll l ll l ll l ll lll l ll ll ll lll ll l l lll llll l ll ll l ll llll ll ll ll ll l l ll l llll lll ll l lll ll lll l l ll ll ll ll l llll lll ll l ll ll l lll l l ll lll ll lll ll lll llll l lll ll ll ll lll ll ll llll l lll l ll ll ll ll llll ll ll l ll ll ll lll ll l ll lll lll ll ll l ll l lll ll ll lll ll l ll lll l ll ll ll ll l ll ll ll ll ll llll ll ll l ll ll ll l l ll lll l ll ll l ll ll l ll ll l lllll lll ll l l ll % % % % % % % % % % % % % % % % % % % % (c) shares: Popularity scale at 30 days Popularity percentile s ha r e s % % % % % % % % % % % % % % % % % % % % (d) tweets: Popularity scale at 30 days Popularity percentile t w ee t s % % % % % % % % % % % % % % % % % % % % (e) tweets: Popularity scale at 60 days Popularity percentile t w ee t s % % % % % % % % % % % % % % % % % % % % (f) Figure 8: (top row) The popularity scale of YouTube videos in the dataset. 
The total number of views obtained by each video in the first 30 days (a) and 60 days (b) after upload is divided into 40 equally spaced bins (i.e. each with 2.5% of the videos). The 2.5% most popular videos span almost two orders of magnitude in views. Note that outliers in this bin are not represented, as the most popular videos in the collection have ∼[...] views. (c) Evolution of popularity between 30 and 60 days. The outliers are videos that have improved significantly on the popularity scale. (bottom row) The popularity scale on the Active dataset for shares at 30 days (d) and tweets at 30 days (e) and 60 days (f). Note that the scales for shares and tweets are very similar, with the observation that videos in the Active set seem to receive more tweets than shares. Another observation is the difference in the popularity scale for tweets between 30 and 60 days: the biggest change is observed in the bottom 2.[...] Active set receive at least 100 tweets during their lifetime. As a result, the bottom 2.5% will continue to rise with t.

In this subsection, we provide some observations on behavior statistics and key parameters, broken down by video category. Furthermore, we show how the endo-exo map can be used to detect consistent behaviors across YouTube channels.

Consistent behavior across channels. We use the endo-exo map to visualize groups of videos that belong to the same user-assigned content type, or are from the same author, called a channel on YouTube. Fig. 9 shows a scatter plot of videos posted by a reporter in category News & Activism (in red) and a user focusing on recordings of game sessions (in blue). The game recording videos are generally more popular (bigger circles) than the news videos, and this is explained by the former group having higher exogenous sensitivity, i.e. higher values of µ.
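The popularity scale described in the Figure 8 caption can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names `popularity_scale` and `popularity_percentile` and the toy view counts are assumptions made for the example. It sorts the videos by view count, cuts them into 40 equal-population bins of 2.5% each, and maps a view count to the upper percentile of the bin it falls into.

```python
from bisect import bisect_left


def popularity_scale(view_counts, n_bins=40):
    """Upper view-count boundary of each equal-population bin (hypothetical helper)."""
    ordered = sorted(view_counts)
    n = len(ordered)
    boundaries = []
    for k in range(1, n_bins + 1):
        # Index of the last video that falls in the lowest k/n_bins fraction.
        last = max(0, (k * n) // n_bins - 1)
        boundaries.append(ordered[last])
    return boundaries


def popularity_percentile(views, boundaries):
    """Map a view count to the upper percentile of the bin it falls into."""
    # First bin whose upper boundary is at least `views`; larger counts go to the top bin.
    bin_index = min(bisect_left(boundaries, views), len(boundaries) - 1)
    return 100.0 * (bin_index + 1) / len(boundaries)


# Toy data (assumption): 400 videos with 1, 2, ..., 400 views.
toy_views = list(range(1, 401))
scale = popularity_scale(toy_views)
print(popularity_percentile(11, scale))
```

Because the bins hold equal shares of the population rather than equal ranges of views, the top bin can span a much wider range of view counts than the others, which is the effect noted in the caption for the 2.5% most popular videos.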
[Figure 9 plot. Horizontal axis: endogenous response $A_{\hat{\xi}}$; vertical axis: exogenous sensitivity µ. Legend: VEGETTA777, Anatolii Sharij.]

Figure 9: Video channels on the endo-exo map. Scatter plot of videos from a reporter covering events in Ukraine (Anatolii Sharij, in red) and a Spanish game-recording video channel (VEGETTA777, in blue). Radii of the circles are proportional to the popularity percentile of each video.

[Figure 10 plots. Categories shown: All, Comedy, Education, Entertainment, Film & Animation, Gaming, Howto & Style, Music, News & Politics, Nonprofits & Activism, People & Blogs, Science & Technology, Sports.]

Figure 10: (Top row) The number of views (left), shares (center) and tweets (right) for videos in different categories. (Bottom row) Box plots of unobserved exogenous influence (initial impulse γ, constant excitation η), exogenous sensitivity µ and endogenous response $A_{\hat{\xi}}$, broken down by category.

The effect of the external influence is not equal. We examine the amount of attention (in number of views) and external influence (in number of shares and tweets) in the Active dataset. This provides a basis for understanding the corresponding Hawkes intensity model. Fig. 10 (top row) contains box plots of total views, along with total shares and tweets, broken down by video category. The left-most boxes (in red) depict the profile of all videos. One notable example is videos in the Nonprofits & Activism category: overall they have a less-than-average number of views, despite being shared more than the median number of times.

Observed versus unobserved external influences. Model parameters γ and η can be interpreted respectively as the initial impulse and constant exogenous stimuli not captured in the observed exogenous activity s(t). From the bottom left two plots in Fig.
10, we can see that several categories have significantly higher components of γ and η, such as Gaming, Comedy and Entertainment. This may result from a significant volume of activity outside of Twitter or YouTube sharing: Gaming videos, for example, are known to spread on dedicated social networks such as the subreddits /r/gaming/, /r/gamingvids/.

Exogenous sensitivity and endogenous response. The two bottom right plots of Fig. 10 show the breakdown per category of, respectively, the exogenous sensitivity µ and the endogenous response $A_{\hat{\xi}}$. These plots present an alternative view to the 2-dimensional density distribution of each category on the endo-exo map, shown in Figures 16 and 17. Certain categories, such as Comedy, Gaming or Sports, seem to be particularly sensitive to external influence. Categories like Comedy, Entertainment or Gaming observe higher-than-median endogenous responses. The fact that Comedy and Gaming show both a high exogenous sensitivity and a high endogenous response provides a plausible explanation for why these categories observe relatively high popularity despite their relatively low sharing. Conversely, Nonprofits & Activism exhibits lower-than-median values for both µ and $A_{\hat{\xi}}$, which accounts for its low popularity (even though it is highly shared).

We study the distribution of the memory exponent θ in the HIP model, for three categories of videos in the Active dataset. In Fig. 11, the distributions for categories Music, Nonprofits & Activism and News & Politics (in red) are contrasted with the distribution from all the videos (in blue). The solid lines in each graph indicate the median value for θ in each category, whereas the dashed lines indicate the mean value. All video categories, as well as the general population, observe a long-tail distribution for θ, with a peak density around θ ≃ [...].35. A small θ leads to slower decay over time (and a larger endogenous response $A_{\hat{\lambda}}$), whereas a large θ means a video is quicker to be forgotten (i.e. small $A_{\hat{\lambda}}$). We can see that a larger (than random) fraction of Music videos decay slowly ($\mathrm{mean}_{\theta,\mathrm{all}} = 15.94$, $\mathrm{mean}_{\theta,\mathrm{music}} = 14.$[...] News & Politics and Nonprofits & Activism videos are forgotten faster, with $\mathrm{mean}_{\theta,\mathrm{nonprofit}} = 17.56$ and $\mathrm{mean}_{\theta,\mathrm{news}} = 19.45$. This suggests that there is a systematic difference across different types of videos in the rate at which the collective memory decays. One explanation for such differences can be that music is typically considered timeless content, while news is considered timely content whose relevance decreases rapidly over the first few days.

[Figure 11 panels: (a) θ: Music; (b) θ: Nonprofits & Activism; (c) θ: News & Politics. Each panel plots density against θ for the category and for all videos.]

Figure 11: Distribution of the memory exponent θ for 3 categories: Music (a), Nonprofits & Activism (b) and News & Politics (c), compared to the background distribution in all videos. Solid vertical lines indicate the median θ value in each population, whereas the dashed vertical lines indicate the mean θ. Color of lines corresponds to legend. The densities are obtained using kernel density estimation.

In this section, we provide additional details about the relation between video popularity and fitted values of parameters µ and θ. These analyses provide additional details to the endo-exo map, by explicitly linking the endogenous and exogenous components of each video to each model parameter.

Parameters µ and θ and popularity. In the main text, we claim a direct connection between the exogenous sensitivity µ and popularity, and an inverse connection between θ, the time-decay rate of the memory kernel, and popularity. We provide, in Fig. 12, empirical evidence of these connections by studying the popularity distribution for low and high values of the above parameters. The top-left graphic shows the density distribution of the fitted values of µ in the Active dataset.
There is a high peak of density around µ = 1, corresponding to videos with low sensitivity to external influence, and a second peak around µ = 10^[...] = 53.7, corresponding to videos with higher exogenous sensitivity. We divide the range of µ into deciles (groups of 10% each) and we select the second decile (i.e. low sensitivity) and the tenth decile (high sensitivity), hashed in gray on the graphic. In the bottom-left graphic, we plot the popularity distribution for videos within each of the above deciles of µ. The subpopulation of videos with low exogenous sensitivity shows a dense area of low popularity, with only very few videos making it into the top popularity percentiles. Conversely, the density distribution of the subpopulation of videos with high exogenous sensitivity shows an increasing trend, with a concentration of highly popular videos. This confirms the intuition that highly popular videos tend to have high values of exogenous sensitivity µ.

Similar results can be shown for the time-decaying memory exponent θ, which controls how fast videos are forgotten and the size of the endogenous response $A_{\hat{\lambda}}$. Fig. 12b plots the density distribution of θ, which shows a peak at θ = 3.36, and marks the second and tenth deciles, corresponding respectively to low values and high values of θ. As for µ, the bottom-center graphic plots the popularity distribution for each of the subpopulations defined by the selected deciles of θ. The subpopulation with high values of θ (i.e. low $A_{\hat{\lambda}}$) tends to be forgotten more quickly and shows a concentration of videos with low popularity, whereas videos with lower values of θ (and higher $A_{\hat{\lambda}}$) tend to be more popular.
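The decile comparison described above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: deciles are formed by rank so that each holds 10% of the videos, the names `decile_of` and `share_above` are hypothetical, and the data is synthetic.

```python
def decile_of(values):
    """Assign each value a decile in 1..10 by rank (equal-population groups)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


def share_above(popularity, deciles, decile, threshold=80.0):
    """Fraction of videos in `decile` with popularity percentile >= threshold."""
    selected = [p for p, d in zip(popularity, deciles) if d == decile]
    return sum(p >= threshold for p in selected) / len(selected)


# Synthetic data (assumption): 20 videos whose popularity percentile grows with mu.
mu = [float(i) for i in range(1, 21)]
popularity = [5.0 * i for i in range(1, 21)]
mu_deciles = decile_of(mu)
low = share_above(popularity, mu_deciles, decile=2)    # second decile: low sensitivity
high = share_above(popularity, mu_deciles, decile=10)  # tenth decile: high sensitivity
print(low, high)
```

On real fitted parameters, the same grouping applied to θ instead of µ would give the inverse comparison discussed in the text; the 80th-percentile threshold here is only an illustrative choice.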
[Figure 12 panels: (a) µ: density plot and selected deciles; (b) θ: density plot and selected deciles; (c) Density distribution: all vids (horizontal axis: endogenous response $A_{\hat{\xi}}$; vertical axis: exogenous sensitivity µ); (d) Popularity percentile density per type of µ (low µ, high µ); (e) Popularity percentile density per type of θ (low θ, high θ); (f) Density distribution: top 5% (horizontal axis: endogenous response $A_{\hat{\xi}}$).]

Figure 12: Density distribution of fitted model parameters values µ (left column) and θ (center column). The range of each fitted parameter is divided into 10 deciles, shown by vertical dotted gray lines (in the top row). For each of parameters µ and θ, 2 deciles are chosen (one corresponding to high values, and a second one corresponding to low values), shown shaded in gray. For each parameter, the bottom row plots the popularity distribution for videos within each of the chosen deciles. Right column: the endo-exo map [...] µ and high $A_{\hat{\xi}}$).

Endo-exo map for additional categories. The above considerations are at the basis of the construction of the endo-exo map, as shown in the main text, and of its potentially viral region: videos with high values of both exogenous sensitivity µ and endogenous response $A_{\hat{\lambda}}$ are more susceptible to become popular if given the required attention. The right column of Fig. 12 plots the 2D density of videos on the endo-exo map for the entire Active dataset (top) and the top 5% most popular videos (the color map is aligned for the two graphics). Visibly, the distribution of the popular videos is skewed towards the more viral region of the map (i.e. high µ and high $A_{\hat{\lambda}}$). In Figs. 16 and 17, we repeat this analysis and further break down the Active population by video category. We plot pairs of 2D densities of videos on the endo-exo map for all categories, except Gaming and Film & Animation, which were discussed in the main text Fig. 4.
This visually reveals some of the dynamics that propel videos to the most popular segment, in each subpopulation. For example, categories like Gaming, Science & Technology, Travel & Events have the distribution of the most popular videos shifted top-right w.r.t. the rest of the category (similar to the dynamics shown for the entire population). Other categories appear only upward-shifted (i.e. only higher µ) w.r.t. the rest of the category: Film & Animation, Entertainment, Howto & Style, News & Politics and People & Blogs. There is even an outlier category, Comedy, which seems to have two heat centers in the top 5% popular subpopulation. This seems to indicate two distinct patterns of becoming popular within this category: one pattern involves being sensitive to exogenous excitation more than the average video, whereas the second pattern involves higher endogenous propagation in the network (higher $A_{\hat{\lambda}}$).

In this section, we investigate the causal connection between the series of views, shares and tweets, for each of the videos in the Active dataset. We test for causality using time-series analysis tools. We employ an F-type Granger-causality test [4], implemented in the R package vars [13]. For each series of each video, we construct a Vector Autoregressive model, with the lag determined automatically using minimal AIC. Next, we perform a Granger causality test for each video and each pair of temporal series, i.e. (views, shares), (views, tweets) and (tweets, shares). Each test is performed in both directions, e.g. views Granger-cause shares and shares Granger-cause views. The null hypothesis is that no causal relation exists between the series. We reject the null hypothesis and accept the existence of a causal relation when we observe a test p-value lower than 10^−[...].

Table 2: Granger causality test: number of videos in the Active set for which the null hypothesis (i.e. absence of Granger causality) is rejected with p < [...].
[...] Active dataset (around 14 thousand videos), a causal relation is detected for no more than 6% of the videos, i.e. for the relation views Granger-cause shares, true for 833 videos. For all pairs of series, the number of videos presenting a unidirectional relation seems comparable (e.g. tweets Granger-cause views for 164 videos, and shares Granger-cause tweets for 162 videos). We cannot identify a clear Granger causality relation between different series. As there does not yet exist a standard method for capturing non-linear causal relationships or causality with confounding effects, we leave this as future work.

In this section, we provide additional details and results to complete the analysis in the main text Sec. 5. Namely, we provide more information about the performance breakdown of different approaches and the statistical testing analysis for detecting statistically significant differences in the forecasting performance.

The series of the first 90 days of each video history in the Active dataset are used to fit the Hawkes intensity model parameters. The series is divided into two sub-series: the first 75 days are used to fit parameters {µ, θ, C, c}, while the last 15 days form the holdout series used to fit the regularizer meta-parameter ω. Either the [...] or the [...] series can serve as the known exogenous stimuli series s(t).

The Multi-Linear Regression (MLR) [14] baseline is trained using the same data. We adapt the original algorithm by predicting the value of the viewcounts for each of the 30 days between day 91 and 120. Furthermore, we build an enhanced version (denoted by MLR ( ) or MLR ( )) by introducing the exogenous influence as additional variables, both in the training and in the prediction. The baseline is particularly sensitive to outliers, which we remove from the training set. A video is considered an outlier if it has received a large burst of views in the period from 91 to 120 days.
More precisely, we remove any video that received twice as many views between days 91 and 120 as it did between days 61 and 90. 3.5% of the videos are considered outliers and eliminated from the training set. The errors are measured in average error in popularity percentile, as defined in the main text.

In addition to the performance comparison shown in the main text Fig. 5, Fig. 13(a) presents the Cumulative Distribution Function (CDF) of the prediction errors for HIP, MLR and MLR ( ). HIP consistently outperforms MLR (with and without the exogenous stimuli information): HIP forecasts the popularity of 87% of the video population with a maximum 10% error, while MLR covers only 78% of the population for the same error threshold. Furthermore, MLR ( ) obtains only marginal performance improvements over MLR, even while using the exogenous information. Fig. 13(b) shows the absolute forecasting error performances, aggregated using barplots. Visibly, HIP (using either [...] or [...]) consistently outperforms MLR both in terms of median values and variation, which results in the better mean values of forecasting error already shown in the main text Fig. 5(center). Fig. 13(c) analyzes closely the forecasting error distribution for the best performing version of each approach. The HIP ( [...]

We ask whether the performance differences observed in main text Fig. 5(b) are significant, or whether they are due to chance. We break down the study into two questions: 1) is the difference of forecasting performance between the Hawkes intensity process and the MLR baseline statistically significant? and 2) does the source of exogenous stimuli ([...] or [...]) influence the quality of the forecasting? We set up four comparisons: two comparing the performances of forecasting methods for each of the two sources of influence and two comparing the effect of the sources for each of the two algorithms.
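The coverage figures read off the error CDF in Fig. 13(a), such as the share of videos forecast within a 10% error, are values of the empirical CDF of absolute percentile errors. A minimal sketch of that computation follows; the function names and the toy forecasts are assumptions made for illustration, and the threshold is treated as inclusive.

```python
def absolute_percentile_errors(predicted, actual):
    """Absolute difference between forecast and observed popularity percentiles."""
    return [abs(p - a) for p, a in zip(predicted, actual)]


def coverage(errors, max_error):
    """Empirical CDF at max_error: fraction of videos with error <= max_error."""
    return sum(e <= max_error for e in errors) / len(errors)


# Toy forecasts (assumption): popularity percentiles for five videos.
predicted = [50.0, 12.5, 97.5, 40.0, 75.0]
actual = [47.5, 30.0, 95.0, 45.0, 75.0]
errors = absolute_percentile_errors(predicted, actual)
print(coverage(errors, 10.0))
```

Evaluating `coverage` over a grid of thresholds for each forecasting method would trace curves of the kind compared in Fig. 13(a).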
[Figure 13 panels: (a) percentage of covered videos against error rate (y-axis: percentage of videos); (b) "Active set: boxplot of error of popularity forecasts" (y-axis: absolute percentile error); (c) "Active set: error distribution of popularity forecasts" (x-axis: popularity forecasting error; y-axis: error density).]
Figure 13: Performance comparison graphics, additional to the main text: number of covered videos when accepting a given error percentage (left) and barplots of absolute forecasting error (right). (bottom) Absolute forecasting error distribution for Hawkes (blue curve) and MLR (red curve). Median values are represented for each approach with gray vertical lines: 3% for Hawkes and 3.75% for MLR.
Based on the selected setup, we construct two samples and perform a T-test. For each test, the null hypothesis assumes that the means of the two samples are equal (i.e. there is no difference in forecasting performance). The alternative hypothesis assumes that the true means of the two samples are not equal. Note that these statistical tests are not connected to the particular models that were used to obtain the estimates, i.e. it does not matter whether the estimates come from the multivariate linear regression or from the Hawkes intensity process.
Statistical testing with large sample size. The well-known p-value issued from hypothesis testing depends on the observed difference between the two samples, as well as on the sample size. This renders analyses based only on the p-value particularly sensitive to sample size, given that with sufficiently large samples, a statistical test will almost always demonstrate a significant difference [15]. Given the size of the Active dataset (around 14 thousand videos), we report the effect size in addition to the typical p-value. The effect size measure which we report and use to justify our analysis is Cohen's d coefficient [2], defined as the difference of the means scaled by the inverse of the standard deviation of both populations.
Evaluation setup. Our forecasting system uses two inputs: the observed views series and an external stimuli source (either tweets or shares). Answering question 1), the significance of the performance difference between the Hawkes intensity model and the MLR baseline, boils down to comparing two treatments for a single set of individuals. This translates into applying a paired T-test to a single sample. Conversely, comparing the two sources of exogenous excitation involves applying the same forecasting method to two different populations. This leads to applying a two-sample T-test.
No performance difference between the two exogenous stimuli sources. The detailed results of each of the four hypothesis tests are presented in Tab. 3. The first two lines correspond to the two-sample T-tests, dealing with the difference between the two sources of exogenous stimuli. The first test uses HIP as forecasting algorithm, the second one uses the MLR baseline. For both tests, the Cohen's d coefficient has a negligible value. This suggests that no significant difference exists when forecasting future popularity using tweets or shares as external stimuli. The very small p-value yielded in the first test is most likely an artifact of the large sample size. These two results indicate that the same forecasting performance is achieved using either source of external stimuli, regardless of the forecasting algorithm. Furthermore, a correlation analysis between the two sources leads to the same conclusion: for each video we compute the Pearson's correlation coefficient between the two time series, tweets and shares. Fig. 14 shows the density distribution of the correlation coefficient. Visibly, the two series are highly correlated for most videos.
Table 3: Summary of comparison for forecasting performance differences. Each line corresponds to a performed T-test, either comparing two exogenous sources or two forecasting algorithms. Columns "Sample A" and "Sample B" describe the two compared samples in terms of the algorithm and exogenous source used, mean value and standard deviation. The first two tests are two-sample T-tests, whereas the last two lines correspond to paired T-tests (more details about the underlying rationale in the text). M denotes the sample mean, SD the standard deviation.
Exogenous source | Forecasting algorithm | Sample A | Sample B | T-test p-val | Cohen's d
… vs. … | HIP | HIP (…): M = 4.… × 10^−…, SD = 6.… × 10^−… | HIP (…): M = 5.… × 10^−…, SD = 6.… × 10^−… | … × 10^−… | −…
… vs. … | MLR | MLR (…): M = 6.… × 10^−…, SD = 9.… × 10^−… | MLR (…): M = 6.… × 10^−…, SD = 9.… × 10^−… | 0.98 | 0.…
… | HIP vs. MLR | HIP (…): M = 4.… × 10^−…, SD = 6.… × 10^−… | MLR (…): M = 6.… × 10^−…, SD = 9.… × 10^−… | … × 10^−… | …
… | HIP vs. MLR | HIP (…): M = 5.… × 10^−…, SD = 6.… × 10^−… | MLR (…): M = 6.… × 10^−…, SD = 9.… × 10^−… | … × 10^−… | …
[Figure 14 plot: density of Pearson's r correlation coefficient (x-axis from −1.0 to 1.0). Legend: Min. = −1.00, 1st Qu. = 0.67, Median = 0.87, Mean = 0.78, 3rd Qu. = 0.96, Max. = 1.00.]
Figure 14: Density distribution of Pearson's r correlation coefficient between the tweets and shares series for each video in the Active dataset. The legend values give the quartiles and the mean of the obtained coefficients.
Differences between HIP and MLR. The last two lines of Tab. 3 show the results of testing the performance of the HIP model against the baseline, for each of the two external sources. In both cases, the Cohen's d coefficient shows a small but non-negligible effect size. Together with the very low p-values obtained for both sources, this leads us to reject the null hypothesis and conclude that the differences between the two forecasting methods are statistically significant.
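The quantities used in this analysis can be illustrated with a short Python sketch that computes a paired T-test statistic, a two-sample statistic and Cohen's d on made-up error values. Several choices are assumptions, since the text does not spell them out: Cohen's d uses the common pooled-standard-deviation form, the two-sample test is Welch's unequal-variance variant, and the p-value relies on a normal approximation that is only reasonable for large samples. All function names and numbers are invented for illustration.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def cohens_d(a: Sequence[float], b: Sequence[float]) -> float:
    """Difference of the means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def paired_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Paired T-test statistic: the same videos forecast by two methods."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample (Welch) T-test statistic: one method, two exogenous sources."""
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se


def approx_two_sided_p(t: float) -> float:
    """Two-sided p-value under a normal approximation (large-sample assumption)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Made-up absolute percentile errors for six videos (illustration only).
hip_errors = [0.02, 0.03, 0.05, 0.04, 0.06, 0.03]
mlr_errors = [0.04, 0.05, 0.06, 0.07, 0.09, 0.05]

print("paired t:", paired_t(hip_errors, mlr_errors))

# Repeating the same values 100 times barely changes Cohen's d,
# while the t statistic grows and the p-value shrinks with sample size.
for repeat in (1, 100):
    a, b = hip_errors * repeat, mlr_errors * repeat
    t = welch_t(a, b)
    print(repeat, "d =", cohens_d(a, b), "t =", t, "p ~", approx_two_sided_p(t))
```

The loop mirrors the argument made above: with a large enough sample the p-value becomes very small even though the effect size is essentially unchanged, which is why the effect size is reported alongside it.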
The difference in forecasting performance is even more noteworthy for more difficult videos, those that present a large exogenous shock in the forecasting period. A real example of such a video is depicted in Fig. 15(a). A video is considered to present a high exogenous shock if the exogenous stimuli series s(t), t ∈ Test, contains at least one point t such that s(t) > mean(s(t*)) + 10 var(s(t*)), with t* ∈ Train. 4006 videos in the Active dataset present a high exogenous shock in the testing period. MLR, even in the presence of known information about the external stimuli, largely misses the predictions (as shown in Fig. 15(b)). HIP achieves levels of forecasting performance on the high exogenous dataset similar to those on the entire Active dataset. Fig. 15(c) shows the distribution of absolute forecasting error for the Hawkes intensity model and MLR. The HIP model achieves smaller errors, with a median value of 3.25% against 6.5% for MLR; in Fig. 13(c), the corresponding medians are 3% and 3.75%.
[Figure 15 panels: (a) "Video 8QWg0YrcFhI: example from the exogenous set" (views and number of shares over time, observed series); (b) "Exo. vids: boxplot of error of popularity forecasts" (y-axis: absolute percentile error); (c) "Exo. vids: error distribution of popularity forecasts" (x-axis: popularity forecasting error; y-axis: error density).]
Figure 15: Forecasting performance for the Hawkes intensity model and MLR on videos presenting a high exogenous shock. (left) Depiction of a video having a high exogenous shock during the testing period. It is a relatively obscure video about Brazilian politics, which suddenly received a considerable amount of attention in October/November 2015 (more than 3 months after its upload), only to slide back into obscurity after December 2015. (right) Barplot aggregation of performances of Hawkes intensity and MLR (with different sources of external stimuli), in terms of absolute forecasting error. (bottom) Absolute forecasting error distribution for the exogenous set for Hawkes (blue curve) and MLR (red curve). Median values are shown with gray vertical dotted lines (3.25% for Hawkes and 6.5% for MLR).
References
[1] Clauset, A., Shalizi, C.R., Newman, M.E.J.: Power-Law Distributions in Empirical Data. SIAM Review 51(4), 661–703 (Nov 2009)
[2] Cohen, J.: Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Hillsdale, NJ, 2nd edn. (1988)
[3] Crane, R., Sornette, D.: Robust dynamic classes revealed by measuring the response function of a social system. Proceedings of the National Academy of Sciences 105(41), 15649–15653 (2008)
[4] Granger, C.W.: Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society pp. 424–438 (1969)
[5] Hawkes, A.G.: Spectra of some self-exciting and mutually exciting point processes. Biometrika 58(1), 83–90 (Apr 1971)
[6] Helmstetter, A., Sornette, D.: Subcritical and supercritical regimes in epidemic models of earthquake aftershocks. Journal of Geophysical Research: Solid Earth 107(B10), ESE 10–1–ESE 10–21 (2002)
[7] Johnson, S.G.: The NLopt nonlinear-optimization package (2011)
[8] Liptser, R.S., Shiryaev, A.N.: Statistics of Random Processes. Springer Berlin Heidelberg, Berlin, Heidelberg (2001)
[9] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (Aug 1989)
[10] Miritello, G., Lara, R., Cebrian, M., Moro, E.: Limited communication capacity unveils strategies for human interaction. Nature Scientific Reports 3 (Jun 2013)
[11] Mishra, S., Rizoiu, M.A., Xie, L.: Feature Driven and Point Process Approaches for Popularity Prediction. In: Proceedings of International Conference on Information and Knowledge Management - CIKM '16. p. 10. Indianapolis, USA (2016)
[12] Ogata, Y.: Seismicity Analysis through Point-process Modeling: A Review. Pure and Applied Geophysics 155(2-4), 471–507 (1999)
[13] Pfaff, B.: VAR, SVAR and SVEC models: Implementation within R package vars. Journal of Statistical Software 27(4) (2008)
[14] Pinto, H., Almeida, J.M., Gonçalves, M.A.: Using early view patterns to predict the popularity of YouTube videos. In: Proceedings of the sixth ACM international conference on Web search and data mining - WSDM '13. p. 365. ACM Press, New York, New York, USA (Feb 2013)
[15] Sullivan, G.M., Feinn, R.: Using Effect Size-or Why the P Value Is Not Enough. Journal of Graduate Medical Education 4(3), 279–82 (Sep 2012)
[16] Zhao, Q., Erdogdu, M.A., He, H.Y., Rajaraman, A., Leskovec, J.: Seismic: A self-exciting point process model for predicting tweet popularity. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 1513–1522. ACM (2015)
[Figure 16 panels, each with an "all" heatmap and a "top 5%" heatmap (x-axis: endogenous response A_ξ; y-axis: exogenous sensitivity µ): (a) Autos & Vehicles; (b) Comedy; (c) Education; (d) Entertainment; (e) Howto & Style; (f) Music; (g) News & Politics; (h) Nonprofits & Activism.]
Figure 16: Pairs of 2-dimensional densities of videos on the endo-exo map, for each of the video categories in the Active dataset (except for Gaming and Film & Animation, already presented in the main text Fig. 4). For each pair, the left heatmap represents the density distribution of all videos in the category, while the right heatmap shows the distribution of the most popular 5% of videos in the category. 6 more categories are shown in Fig. 17.
[Figure 17 panels, same layout and axes as Figure 16: (a) People & Blogs; (b) Pets & Animals; (c) Science & Technology; (d) Shows; (e) Sports; (f) Travel & Events.]
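The high-exogenous-shock criterion used to select the videos of Fig. 15, s(t) > mean(s(t*)) + 10 var(s(t*)) with t in the test period and t* in the training period, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name and the toy series are invented, and the population variance is used because the text does not say whether var denotes the population or the sample variance.

```python
from statistics import mean, pvariance
from typing import Sequence


def has_high_exogenous_shock(
    train: Sequence[float], test: Sequence[float], factor: float = 10.0
) -> bool:
    """True if some test-period stimulus exceeds mean(train) + factor * var(train).

    `train` and `test` hold the exogenous series s(t) over the training and
    testing periods; var is taken as the population variance (an assumption).
    """
    if not train or not test:
        raise ValueError("both series must be non-empty")
    threshold = mean(train) + factor * pvariance(train)
    return any(s > threshold for s in test)


# Toy exogenous series, invented for illustration only.
train_series = [1.0, 2.0] * 45        # 90 training days
calm_test = [2.0, 3.0, 4.0] * 10      # stays at or below the threshold
shock_test = [2.0, 30.0, 1.0] * 10    # contains large bursts

print("calm series:", has_high_exogenous_shock(train_series, calm_test))
print("shock series:", has_high_exogenous_shock(train_series, shock_test))
```

Because the threshold adds a variance (in squared units) to a mean, its value depends on the scale of the series; the sketch follows the formula as written in the text rather than substituting a standard deviation.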