Effective Measure of Endogeneity for the Autoregressive Conditional Duration Point Processes via Mapping to the Self-Excited Hawkes Process

Abstract

In order to disentangle the internal dynamics from exogenous factors within the Autoregressive Conditional Duration (ACD) model, we present an effective measure of endogeneity. Inspired by the Hawkes model, this measure is defined as the average fraction of events, within the total population, that are triggered by internal feedback mechanisms. We provide a direct comparison of the Hawkes and ACD models based on numerical simulations and show that our effective measure of endogeneity for the ACD can be mapped onto the "branching ratio" of the Hawkes model.

V. Filimonov, S. Wheatley, and D. Sornette
Department of Management, Technology and Economics, ETH Zürich, Scheuchzerstrasse 7, CH-8092 Zürich, Switzerland
(Dated: October 31, 2018)
arXiv preprint, category q-fin.ST

I. INTRODUCTION

An outstanding challenge in socio-economic systems is to disentangle the internal dynamics from the exogenous influence. It is obvious that any non-trivial system is subject both to external shocks and to internal organizational forces and feedback loops. In the absence of external influences, many natural and social systems would regress or die; however, the internal mechanisms are of no less importance and can either stabilize or destabilize the system. These systems are continuously subjected to external shocks, forces, noises and stimulations; they propagate and process these inputs in a self-reflexive way. The stability (or criticality) of these dynamics is characterized by the relative strength of self-reinforcing mechanisms.

For instance, brain development and performance are determined by both external stimuli and endogenous collective and interactive wiring between neurons. The normal regime of brain dynamics corresponds to asynchronous firing of neurons with relatively low coupling between individual neurons. However, as the coupling strength increases, the internal feedback loops start playing an increasingly important role in the dynamics, and the system moves towards the tipping point at which abnormal synchronous “neuronal avalanches” result in an epileptic seizure [77]. As another example, financial systems are known to be driven by exogenous idiosyncratic news that is digested by investors and complemented with quasi-rational (sometimes self-referential) behavior. Correlated over-expectations (herding) of investors correspond to the bubble phase that pushes the system towards criticality, where the crash may result as a bifurcation towards a distressed regime [78].

In physical systems at thermodynamic equilibrium, the so-called fluctuation-dissipation theorem relates quantitatively the response of the system to an exogenous (and instantaneous) shock to the correlation structure of the spontaneous endogenous fluctuations [86]. In out-of-equilibrium systems, the existence of such a relation is still an open question [68]. In a given observation set, it seems in general hopeless to separate the contributions resulting from external perturbations and internal fluctuations and responses. However, one would like to understand the interplay between endogeneity and exogeneity (the ‘endo-exo’ problem, for short) in order to characterize the reaction of a given system to external influences, to quantify its resilience, and to explain its dynamics. Using the class of self-exciting conditional Poisson (Hawkes) processes [42, 43], some progress has recently been made in this direction [15, 18, 79, 80].

In the modeling of complex point processes in natural and socio-economic systems, the Hawkes process [42, 43] has become the gold standard due to its simple construction and flexibility. It is now successfully used for modeling sequences of triggered earthquakes [62]; genomic events along DNA [67]; brain seizures [63, 82]; the spread of violence [52] and crime [56] across some regions; extreme events in financial series [19]; and probabilities of credit defaults [25].
In financial applications, Hawkes processes are most actively used for modeling high-frequency fluctuations of financial prices (see for instance [4, 7, 11, 32]); however, applications to lower-frequency data, such as daily data, are also possible (see Appendix A).

Being closely related to branching processes [39], the Hawkes model combines, in a natural and parsimonious way, exogenous influences with self-excited dynamics. It accounts simultaneously for the co-existence and interplay between the exogenous impact on the system and the endogenous mechanism whereby past events contribute to the probability of occurrence of future events. Moreover, using the mapping of the Hawkes process onto a branching structure, it is possible to represent the sequence of events as a branching structure, with each event leading to a whole tree of offspring.

The linear construction of the Hawkes model allows one to separate exogenous events and to develop a single parameter, the so-called “branching ratio” $\eta$, that directly measures the level of endogeneity in the system. The branching ratio can be interpreted as the fraction of endogenous events within the whole population of events [32, 45]. It provides a simple and illuminating characterization of the system, in particular with respect to its fragility and susceptibility to shocks. For $\eta < 1$, on average, the proportion $1 - \eta$ of events arrives to the system externally, while the proportion $\eta$ of events can be traced back to the influence of past dynamics. As $\eta$ approaches 1 from below, the system becomes “critical”, in the sense that its activity is mostly endogenous or self-fulfilling. More precisely, its activity becomes hyperbolically sensitive to external influences. The regime $\eta >$ [...] $\eta$, but it is also amenable to an easy and transparent estimation by maximum likelihood [60, 64] without requiring stochastic declustering [54, 93], which is essential in the branching processes’ framework but has several limitations [84].

However, the Hawkes model is not the only model that describes self-excitation in point processes. In particular, the Autoregressive Conditional Duration (ACD) model [22, 23] and the Autoregressive Conditional Intensity (ACI) model [69] have been introduced and successfully used in econometric applications. A similar concept was used in the so-called Autoregressive Conditional Hazard (ACH) model [37]. These processes were designed to mimic properties of the famous Autoregressive Conditional Heteroskedasticity (ARCH) model [20] and Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model [8] that successfully account for volatility clustering and self-excitation in price time series. Some other modifications of ACD models, such as Fractionally Integrated ACD (FIACD) [47] or Augmented ACD (AACD) [29], were introduced to account for additional effects (such as long memory) or to increase the flexibility of the model (for a more detailed review, see [7] and references therein).

In general, all approaches to modeling self-excited point processes can be separated into the classes of duration-based approaches (represented by the ACD model and its derivatives) and intensity-based approaches (Hawkes, ACH, ACI, and so on), which define a stochastic expression for inter-event durations and intensity, respectively.
Of all the models, as discussed above, the Hawkes process dominates by far the class of intensity-based models, and the ACD model, a direct offspring of the GARCH family, is the most widely used duration-based model.

Despite belonging to different classes, both models describe the same phenomena and exhibit similar mathematical properties. In this article, we aim to establish a link between the ACD and Hawkes models. We show that, despite the fact that the ACD model cannot be directly mapped onto a branching structure, and thus the branching ratio for this model cannot be derived, it is possible to introduce a parameter $\zeta \in [0, 1]$ that serves as an effective degree of endogeneity in the ACD model. We show that this parameter shares important properties with the branching ratio $\eta \in [0, 1]$ in the framework of the Hawkes model. Namely, both $\zeta$ and $\eta$ characterize stationarity properties of the models, and provide an effective transformation of the exogenous excitation of the system onto its total activity. By numerical simulations, we show that there exists a monotonic relationship between the parameter $\zeta$ of the ACD model and the branching ratio $\eta$ of the corresponding Hawkes model. In particular, the purely exogenous case ($\eta = 0$) and the critical state ($\eta = 1$) are exactly mapped to the corresponding values $\zeta = 0$ and $\zeta = 1$. We validate our results by goodness-of-fit tests and show that our findings are robust with respect to the specification of the memory kernel of the Hawkes model.

The article is structured as follows. In section II, we introduce the Hawkes and ACD models and briefly discuss their properties. Section III introduces the branching ratio and relates it to the measure of endogeneity within the framework of the Hawkes model. In section IV, we discuss similarities between the Hawkes and ACD models, and identify a parameter in the ACD model that can be treated as an effective degree of endogeneity. We support our thesis with extensive numerical simulations and goodness-of-fit tests. In section V, we conclude.

II. MODELS OF SELF-EXCITED POINT PROCESSES

Let us define a univariate point process of event times $\{t_i\}_{i \in \mathbb{N}_{>0}}$ ($t_i > t_j$ for $i > j$) with the counting process $\{N(t)\}_{t \geq 0} = \max(i : t_i \leq t)$, and the duration process of inter-event times $\{\delta t_i\}_{i \in \mathbb{N}_{>0}} = t_i - t_{i-1}$. Properties of the point process $\{t_i\}$ are usually described with the (unconditional) intensity process $\lambda(t) = \lim_{h \downarrow 0} \frac{1}{h} \Pr[N(t+h) - N(t) > 0]$ and the conditional intensity process $\lambda(t \mid \mathcal{F}_{t-}) = \lim_{h \downarrow 0} \frac{1}{h} \Pr[N(t+h) - N(t) > 0 \mid \mathcal{F}_{t-}]$, which is adapted to the natural filtration $\mathcal{F}_{t-} = (t_1, \ldots, t_i : t_i < t)$ representing the history of the process.

The well-known Poisson point process is defined as the point process whose conditional intensity does not depend on the history of the process and is constant:

$$\lambda(t \mid \mathcal{F}_{t-}) \equiv \lambda(t) = \lambda_0 > 0. \qquad (1)$$

The non-homogeneous Poisson process extends expression (1) to account for time dependence of both conditional and unconditional intensity functions: $\lambda(t \mid \mathcal{F}_{t-}) \equiv \lambda(t) = \lambda_0(t) > 0$. The durations $\{\delta t_i\}$ are independent of each other and are completely determined by the exogenous parameter (function) $\lambda_0(t)$.

The self-excited Hawkes process and the Autoregressive Conditional Durations (ACD) model, which are described in this article, extend the concept of Poisson point processes by adding path dependence and non-trivial correlation structures. These models represent two different approaches to modeling point processes with memory: the so-called intensity-based and duration-based approaches. As their names suggest, the first approach focuses on models for the conditional intensity function $\lambda(t \mid \mathcal{F}_{t-})$ and the second considers models of the durations $\{\delta t_i\}$. For example, in the context of the intensity-based approach, the Poisson process is defined by equation (1). In the context of the duration-based approach, the Poisson process is defined as the point process whose durations $\{\delta t_i\}$ are independent and identically distributed (iid) random variables with exponential probability density function $f(\delta t) = \lambda_0 \exp(-\lambda_0 \delta t)$.

A. Hawkes Model

The linear Hawkes process [42, 43], which belongs to the class of intensity-based models, has its conditional intensity $\lambda(t \mid \mathcal{F}_{t-})$ being a stochastic process of the following general form:

$$\lambda(t \mid \mathcal{F}_{t-}) = \mu(t) + \int_{-\infty}^{t} h(t-s)\, dN(s), \qquad (2)$$

where $\mu(t)$ is the background intensity, which is a deterministic function of time that accounts for the intensity of arrival of exogenous events (not dependent on history). A deterministic kernel function $h(t)$, which should satisfy causality ($h(t) = 0$ for $t < 0$), [...] endogenous feedback mechanism (memory of the process). Given that each event arrives instantaneously, the differential of the counting process $dN(t)$ can be represented in the form of a sum of delta-functions, $dN(t) = \sum_{t_i < t} \delta(t - t_i)\, dt$ [...] $> 0$. We introduce a new dimensionless parameter, $\eta = a\tau$, which will be discussed in detail later, which allows us to write the final expression for the conditional intensity as follows:

$$\lambda(t \mid \mathcal{F}_{t-}) = \mu + \frac{\eta}{\tau} \sum_{t_i < t} \exp\left(-\frac{t - t_i}{\tau}\right). \qquad (7)$$

[...] 1. In order to check the robustness of the results presented below, in particular with respect to the choice of the memory kernel, we have also considered a power law kernel (5) with time-independent background intensity $\mu(t) \equiv \mu >$ [...] $\infty$ of the memory kernel defines the dimensionless parameter $\eta = K c^{1-\varphi} / (\varphi - 1)$ [...]

$$\lambda(t \mid \mathcal{F}_{t-}) = \mu + \eta\, c^{\varphi - 1} (\varphi - 1) \sum_{t_i < t} (t - t_i + c)^{-\varphi}. \qquad (8)$$
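To make the exponential-kernel intensity (7) concrete, the following is a minimal Python sketch, not the authors' procedure (they use the R package "PtProcess"). The function names `hawkes_intensity` and `simulate_hawkes`, the seed handling, and the use of a thinning scheme for simulation are assumptions introduced here for illustration only.

```python
import math
import random


def hawkes_intensity(t, events, mu, eta, tau):
    """Evaluate mu + (eta/tau) * sum over t_i < t of exp(-(t - t_i)/tau)."""
    past = (ti for ti in events if ti < t)
    return mu + (eta / tau) * sum(math.exp(-(t - ti) / tau) for ti in past)


def simulate_hawkes(mu, eta, tau, t_max, seed=0):
    """Draw event times on (0, t_max] by thinning (an assumed, standard method)."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # self-excited part of the intensity just after time t
    while True:
        bound = mu + excitation  # the intensity can only decay until the next event
        wait = rng.expovariate(bound)
        t += wait
        if t > t_max:
            return events
        excitation *= math.exp(-wait / tau)  # decay the memory term to time t
        if rng.random() * bound <= mu + excitation:
            events.append(t)         # candidate accepted as an event
            excitation += eta / tau  # each event adds a jump of eta/tau
```

For example, `simulate_hawkes(mu=1.0, eta=0.5, tau=1.0, t_max=200.0)` returns an increasing list of event times, and `hawkes_intensity` can then be evaluated at any time against that list. The sketch relies on the exponential kernel, for which the memory term can be updated recursively; a power law kernel would need the full sum.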

The class of Autoregressive Conditional Durations (ACD) models was introduced in [22, 23] in the field of econometrics to model financial data at the transaction level. The ACD model applies the ideas of the Autoregressive Conditional Heteroskedasticity (ARCH) model [20], which separates the dynamics of a stationary random process into a multiplicative random error term and a dynamical variance that regresses on the past values of the process. In the spirit of ARCH, the ACD model is represented by the duration process $\delta t_i$ in the form

$$\delta t_i = \psi_i \epsilon_i, \qquad (9)$$

where $\epsilon_i$ is an iid non-negative random variable with unit mean, $\mathrm{E}[\epsilon_i] = 1$, and the function $\psi_i \equiv \psi(N(\mathcal{F}_{t-}); \theta)$ is the conditional expected duration: $\mathrm{E}[\delta t_i \mid \mathcal{F}_{t-}] = \psi_i$. Here, $\theta$ represents the set of parameters of the model. From expression (9), one can simply derive the conditional intensity of the process [23]:

$$\lambda(t \mid \mathcal{F}_{t-}) = \lambda_\epsilon\!\left(\frac{t - t_{N(t)}}{\psi_{N(t)+1}}\right) \frac{1}{\psi_{N(t)+1}}, \qquad (10)$$

where $\lambda_\epsilon(s)$ represents the intensity function of the noise term $\epsilon_i$. Assuming $\epsilon_i$ to be iid exponentially distributed, one can call model (9) an Exponential ACD model.

The conditional expected duration $\psi(N(\mathcal{F}_{t-}); \theta)$ of the ACD($p$, $q$) model, where ($p$, $q$) denotes the order of the model, is defined as an autoregressive function of the past observed durations $\delta t_i$ and the conditional durations $\psi_i$ themselves:

$$\psi_i = \omega + \sum_{j=1}^{p} \alpha_j \delta t_{i-j} + \sum_{k=1}^{q} \beta_k \psi_{i-k}, \qquad (11)$$

where $\omega > 0$, $\alpha_j \geq 0$ and $\beta_k \geq 0$ form the parameter set $\theta = \{\omega, \alpha_1, \ldots, \alpha_p, \beta_1, \ldots, \beta_q\}$. The stationarity condition for the ACD model has the form [23]:

$$\sum_{j=1}^{p} \alpha_j + \sum_{k=1}^{q} \beta_k < 1. \qquad (12)$$

In the simple ACD(1,1) case that is considered in the present article, equation (11) reduces to

$$\psi_i = \omega + \alpha\, \delta t_{i-1} + \beta\, \psi_{i-1}. \qquad (13)$$

Similarly, the conditional intensity (10) of the Exponential ACD(1,1) has the form

$$\lambda(t \mid \mathcal{F}_{t-}) = \frac{1}{\psi_{N(t)+1}} = \frac{1}{\omega + \alpha\, \delta t_{N(t)} + \beta\, \psi_{N(t)}} \qquad (14)$$

and the stationarity condition (12) reduces to $\alpha + \beta < 1$.

III. THE BRANCHING RATIO AS A MEASURE OF ENDOGENEITY IN THE HAWKES MODEL

The linear structure of the Hawkes process (3), with identical functional form of the summands $h(t - t_i)$, which depend only on the arrival time of a single event $t_i$, allows one to consider it as a cluster process in which the random process of cluster centers $\{t^{(c)}_i\}_{i \in \mathbb{N}_{>0}}$ is the Poisson process with rate $\mu(t)$. All clusters associated with centers $\{t^{(c)}_i\}$ are mutually independent by construction and can be considered as a generalized branching process [44], illustrated in figure 1.

[Insert Figure 1 here]

In this context, each event $t_i$ can be either an immigrant or a descendant. The rate of immigration is determined by the background intensity $\mu(t)$ and results in an exogenous random process. Once an immigrant event occurs, it generates a whole cluster of events. Namely, a zeroth-order event (which we will call the mother event) can trigger one or more first-order events (daughter events). Each of these daughters, in turn, may trigger several second-order events (the grand-daughters of the initial mother), and so on. All first-, second- and higher-order events form a cluster and are called descendants (or aftershocks); they represent endogenously driven events that appear due to internal feedback mechanisms in the system. It should be noted that this mapping of the Hawkes process (3) onto the branching structure (figure 1) is possible due to the linearity of the model, and is not valid for nonlinear self-excited point processes, such as the class of nonlinear mutually excited point processes [12], of which the Multifractal stress activation model [83] is a particular implementation.

The crucial parameter of the branching process is the branching ratio ($n$), which is defined as the average number of daughter events per mother event.
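The immigrant/descendant picture described above can be sketched as a simple counting experiment. The Python sketch below ignores event times and only counts cluster sizes; it assumes that each event has a Poisson-distributed number of daughters with mean $n$, and the function names `poisson_draw` and `descendant_fraction` are hypothetical helpers introduced here. In the sub-critical regime the share of descendants among all events should come out close to $n$.

```python
import math
import random


def poisson_draw(rng, mean):
    """Knuth's multiplication method for a Poisson variate (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def descendant_fraction(n, n_immigrants, seed=0):
    """Share of descendants among all events, for a branching ratio n < 1."""
    rng = random.Random(seed)
    descendants = 0
    for _ in range(n_immigrants):
        generation_size = 1  # the immigrant (zeroth-order event)
        while generation_size > 0:
            # every event of the current generation triggers its own daughters
            offspring = sum(poisson_draw(rng, n) for _ in range(generation_size))
            descendants += offspring
            generation_size = offspring
    return descendants / (n_immigrants + descendants)
```

Calling `descendant_fraction(0.5, 20000)` gives a value near 0.5, up to sampling noise. The loop terminates with probability 1 only for $n \leq 1$, so the sketch should not be called with larger values.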
Depending on the branching ratio, there are three regimes: (i) sub-critical ($n < 1$), (ii) critical ($n = 1$) and (iii) super-critical or explosive ($n > 1$). [...] $t$, the process dies out with probability 1 in the sub-critical and critical regimes and has a finite probability to explode to an infinite number of events in the super-critical regime. The critical regime for $n = 1$ separates the two main regimes and is characterized by power law statistics of the number of events and of the number of generations before extinction [71]. For $n \leq 1$, the process is stationary in the presence of a Poissonian or more generally stationary flux of immigrants.

Being the parameter that describes the clustering structure of the branching process, the branching ratio $n$ defines the relative proportion of exogenous events (immigrants) and endogenous events (descendants or aftershocks). Moreover, in the sub-critical regime, in the case of a constant background intensity ($\mu(t) = \mu = \mathrm{const}$), the branching ratio is exactly equal to the fraction of the average number of descendants in the whole population [32, 45]. In other words, the branching ratio is equal to the proportion of the average number of endogenously generated events among all events and can be considered as an effective measure of endogeneity of the system.

To see this, let us count separately the rates of exogenous and endogenous events. The rate of exogenous immigrants (zeroth-order events) is equal to the background activity rate: $R_{exo} = \mu$. Each immigrant independently gives birth, on average, to $n$ daughters and thus the rate of first-order events is equal to $r_1 = \mu n$. In turn, each first-order event produces, on average, $n$ second-order events, whose rate is equal to $r_2 = n r_1 = \mu n^2$. Continuing this process ad infinitum and summing over all generations, we obtain the rate of all endogenous descendants:

$$R_{endo} = \sum_{i=1}^{\infty} r_i = \mu \sum_{i=1}^{\infty} n^i = \frac{\mu n}{1 - n}, \qquad (15)$$

which is finite for $n < 1$. The global rate is the sum of the rates of immigrants and descendants and is equal to

$$R = R_{exo} + R_{endo} = \mu + \frac{\mu n}{1 - n} = \frac{\mu}{1 - n}. \qquad (16)$$

The proportion of descendants (endogenously driven events) in the whole system is therefore equal to the branching ratio:

$$\frac{R_{endo}}{R} = n. \qquad (17)$$

Calibrating $n$ on the data therefore provides a direct quantitative estimate of the degree of endogeneity.

In the framework of the Hawkes model (3) with $\mu(t) = \mu = \mathrm{const}$, the branching ratio $n$ is easily defined via the kernel $h(t)$:

$$n = \int_0^{\infty} h(t)\, dt. \qquad (18)$$

For the exponential parametrization (6), the branching ratio, $n = a\tau$, is equal to the dimensionless parameter $n \equiv \eta$ previously introduced. The Hawkes framework provides a convenient way of estimating the branching ratio, $n \equiv \eta$, from the observations $\{t_i\}$, using the maximum likelihood method, which benefits from the fact that the log-likelihood function is known for Hawkes processes [60, 64]. The calibration of the model and the estimation of the branching ratio $n$ can then be performed by numerical maximization of the log-likelihood function in the parameter space $\{\mu, n, \tau\}$ for the exponential kernel (6) and $\{\mu, n, c, \varphi\}$ for the power law model (5). Although the calibration procedure is relatively straightforward, special care should be taken with respect to data processing, choice of the kernel, robustness of numerical methods and stationarity tests, as discussed in detail in [34].

IV. THE EFFECTIVE DEGREE OF ENDOGENEITY IN THE AUTOREGRESSIVE CONDITIONAL DURATIONS (ACD) MODEL

A. Formal similarities between the ACD and Hawkes models

Note that the ACD($p$, $q$) and Hawkes models operate on different variables with inverse dimensions: the duration $\delta t$ for the ACD($p$, $q$) model and the conditional intensity $\lambda(t \mid \mathcal{F}_{t-})$ for the Hawkes model, which is of the order of the inverse $1/\delta t$ of the duration $\delta t$. As a consequence, equations (21) and (16) apply to different statistics (average durations $\mathrm{E}[\delta t]$ and average rate $R = \mathrm{E}[1/\delta t]$). Moreover, the ACD model cannot be exactly mapped onto a branching structure whereas the Hawkes process can.

Indeed, the branching structure requires that the conditional probability for an event to occur within the infinitely small interval $[t, t + dt)$ (which is the conditional intensity) should be decomposed into a sum of (1) a (deterministic or stochastic) function of time that represents the immigration intensity and (2) the contributions $f_i$ from each past event $t_i$ that satisfy the following conditions: (i) these contributions should depend only on $t_i$ and be independent of all other events $t_j < t$; (ii) these contributions should exhibit identical structure for all events; and (iii) they should satisfy the causality principle. Thus, in its general form, a conditional Poisson process can be mapped onto (multiple) branching structures if it is described by the following conditional intensity: $\lambda(t \mid \mathcal{F}_{t-}) = \mu(t) + \sum_{t_i}$ [...] $\beta_k > 0$, there is an amplification of the average durations. Considering the average of eq. (11) in the stationary regime ($\mathrm{E}[\delta t_{i-1}] = \mathrm{E}[\delta t_i]$ and $\mathrm{E}[\psi_{i-1}] = \mathrm{E}[\psi_i]$), and taking into account eq. (9), we obtain the following expression for the mean duration in the stationary regime:

$$\mathrm{E}[\delta t] = \frac{\omega}{1 - \sum_{j=1}^{p} \alpha_j - \sum_{k=1}^{q} \beta_k} \equiv \frac{\omega}{1 - \zeta}. \qquad (21)$$

Equations (21) and (16) share the same functional dependence, with a divergence when the corresponding control parameters $\eta$ and $\zeta$ approach 1.

B. Empirical dependence of the effective branching ratio $\hat{\eta}$ as a function of $\zeta = \alpha + \beta$ for the ACD(1,1) process

In order to quantify the similarities between the ACD and Hawkes models outlined in the previous section, we have performed the following numerical study. We simulated realizations of the ACD(1,1) process and calibrated the Hawkes model to them. The traditional way of fitting the Hawkes model uses the maximum likelihood method [64], which is asymptotically normal and asymptotically efficient [60]. We have used the R package “PtProcess” [40], which provides a convenient framework for Hawkes models (3) with arbitrary kernel $h(t)$ and background intensity $\mu(t)$. Then, we maximized the likelihood function using a Newton-type non-linear maximization [17, 75]. Appendix B reports a study of the finite sample bias and efficiency of the Hawkes maximum likelihood estimator. We find that the estimation error $|\hat{\eta} - \eta|$ of the branching ratio (without model error) measured with the 90% quantile ranges does not exceed 0.[...] $\eta \leq$ [...] $\zeta$ of the model ACD($p$, $q$) and $\eta$ of the exponential Hawkes model. For this, we have simulated realizations of the ACD process and estimated the parameter $\eta$ from these realizations. The parameter $\omega$ of the ACD($p$, $q$) model (11) defines the time scale. Without loss of generality, we let $\omega = 1$, which amounts to a linear transformation of time $\tilde{t}_i = t_i / \omega$ in equations (9) and (11).
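As a rough numerical check of the stationary mean duration (21), the ACD(1,1) recursion (13) with unit-mean exponential noise in (9) can be simulated directly. The sketch below is an illustration under stated assumptions, not the authors' code: the function name `simulate_acd11`, the choice of starting values, and the default burn-in of 500 discarded points are choices made here.

```python
import random


def simulate_acd11(omega, alpha, beta, n_events, burn_in=500, seed=0):
    """Return n_events durations of an Exponential ACD(1,1) after a burn-in."""
    rng = random.Random(seed)
    zeta = alpha + beta
    # assumed initialisation: start at the stationary mean when it exists
    psi = omega / (1.0 - zeta) if zeta < 1.0 else omega
    duration = psi
    durations = []
    for i in range(n_events + burn_in):
        psi = omega + alpha * duration + beta * psi  # eq. (13)
        duration = psi * rng.expovariate(1.0)         # eq. (9), E[eps] = 1
        if i >= burn_in:
            durations.append(duration)
    return durations
```

With `omega = 1.0`, `alpha = 0.2` and `beta = 0.3`, the sample mean of a long realization should lie close to $\omega / (1 - \zeta) = 2$. Event times follow by cumulative summation of the returned durations.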
For the sake of simplicity, we present our results for the ACD(1,1) model, for which the dimensionless parameter $\zeta$ reduces to $\zeta = \alpha + \beta$. However, our findings are robust to the choice of the order of the ACD model and can be easily generalized to the case of $p, q > 1$. The parameters $\alpha$ and $\beta$ were chosen so that $\zeta = \alpha + \beta$ spanned $[0, 1]$ at 40 equidistant points. For each of the 40 values of $\zeta$, we generated 100 realizations of the corresponding exponential ACD(1,1) process. Each realization of 3500 events was generated by a recursive algorithm using eq. (13). In order to minimize the impact of edge effects that can bias the estimation of the branching ratio [34], the first 500 points of each realization were discarded. Then, the Hawkes model (7) was calibrated on these synthetic datasets.

For each calibration, we performed a goodness-of-fit test based on residual analysis [62], which consists of studying the so-called residual process defined as the nonparametric transformation of the initial time series $t_i$ into

$$\xi_i = \int_0^{t_i} \hat{\lambda}(t \mid \mathcal{F}_{t-})\, dt, \qquad (22)$$

where $\hat{\lambda}(t \mid \mathcal{F}_{t-})$ is the conditional intensity of the Hawkes process (7) estimated with the maximum likelihood method. Under the null hypothesis that the data have been generated by the Hawkes process (7), the residual process $\xi_i$ should be Poisson with unit intensity [66]. Visual analysis involves studying the cusum plot or Q-Q plot and may be complemented with rigorous statistical tests. Under the null hypothesis (Poisson statistics of the residual process $\xi_i$), the inter-event times in the residual process, $\delta\xi_i = \xi_i - \xi_{i-1}$, should be exponentially distributed with CDF $F(\delta\xi) = 1 - \exp(-\delta\xi)$. Thus, the random variables $U_i \equiv F(\delta\xi_i) = 1 - \exp(-\delta\xi_i)$ should be uniformly distributed in $[0, 1]$. [...] $\zeta = \alpha + \beta$, $\alpha$ and $\beta$ have different impacts on the effective degree of endogeneity $\hat{\eta}$. For instance, case (B), $\alpha = 0.38$, $\beta = 0.13$, and case (C), $\alpha = 0.13$, $\beta = 0.38$, both have the same $\zeta = 0.51$ but $\hat{\eta} = 0.52$ for (B) and $\hat{\eta} = 0.22$ for (C). The smaller endogeneity found in case (C) is compensated by a higher rate of exogenous events ($\hat{\mu} = 0.38$ for (C) compared with $\hat{\mu} = 0.24$ for (B)), resulting in a “flatter” conditional intensity for (C).

In order to explore this effect in simulations of the ACD model, for each value of $\zeta = \alpha + \beta$, we considered different relations between $\alpha$ and $\beta$: (i) $\alpha = \beta$ ($= \zeta/2$), (ii) $\beta = 0$ ($\alpha = \zeta$), (iii) $\alpha = 0$ ($\beta = \zeta$), (iv) $\alpha = 3\beta$ ($= 3\zeta/4$) and (v) $\beta = 3\alpha$ ($= 3\zeta/4$). [...] $\alpha = 0$ (case (iii)) versus $\alpha > 0$. For $\alpha = 0$, the estimated effective branching ratio $\hat{\eta}$ is 0 for all values of the control parameter $\zeta = \beta$, as shown in Figure 3(B). This diagnoses a completely exogenous dynamics of the ACD process, which is indeed the expected diagnostic given that, for $\alpha = 0$, eqs. (9) and (13) reduce to

$$\delta t_i = \psi_i \epsilon_i, \qquad \psi_i = \omega + \beta \psi_{i-1}, \qquad (23)$$

for which the dynamics of the conditional durations $\{\psi_i\}$ is purely deterministic and independent of the realized durations $\delta t_i$, while the latter are entirely driven by the random term $\epsilon_i$.

[Insert Figure 3 here]

For $\alpha > 0$, we find similar non-trivial results. Figure 3(B) shows the effective branching ratio $\hat{\eta}$ as a monotonically increasing function of $\zeta$ for all combinations of $\alpha \neq 0$ and $\beta$. In cases (ii) ($\beta = 0$) and (iv) ($\alpha = 3\beta$), the dependence of $\hat{\eta}$ on $\zeta$ is almost linear for $\zeta <$ [...] $\zeta <$ [...] $\zeta$, the convexity increases. In case (i) ($\alpha = \beta$), $\hat{\eta}$ depends linearly on $\zeta$ for $\zeta >$ [...] $\alpha = 3\beta$), the curvature of $\hat{\eta}(\zeta)$ is significant over the range of 0.[...] $< \zeta < 0.9$. Remarkably, all four dependencies converge to the same value $\hat{\eta} \approx$ [...] $\zeta = 1$.

Figure 4 presents the dependence of the effective branching ratio $\hat{\eta}$ on the control parameter $\zeta = \alpha + \beta$ after correction of the bias in estimation due to finite size effects (presented in Appendix B and summarized in figure 7). All dependencies of $\hat{\eta}$ as a function of $\zeta$ converge to the critical value $\zeta = 1$.

[Insert Figure 4 here]

Figure 5 generalizes figure 3(B) by presenting the dependence of the effective branching ratio $\hat{\eta}$ (corrected for the finite sample bias determined in Appendix B) on the parameters $\alpha$ and $\beta$ separately. As expected, the impact of a change of $\alpha$ is much larger than that of $\beta$. There is a region, delineated by the dashed line, within which the Hawkes model is rejected at the 5% level for the Kolmogorov-Smirnov test. For most combinations of $\alpha$ and $\beta$ such that 0.[...] $\lesssim \alpha + \beta \lesssim 0.95$, the Hawkes model is rejected. Interestingly, the Hawkes model is not rejected in the case where $\beta$ is kept significantly larger than $\alpha$, and it is only rejected in a small interval in the extreme opposite case where $\beta \equiv 0$. The model is often not rejected for large values of $\hat{\eta}$.

C. Differences between the ACD and Hawkes models

Despite similarities, the Hawkes and ACD models exhibit some important differences. Figure 3(A) shows that the effective background rate $\hat{\mu}$ estimated by the Hawkes model is a decreasing function of the control parameter $\zeta$. This is an indirect consequence of the dependence of the expected duration on $\zeta$ given by expression (21). In contrast to the Hawkes model (2), for which the background rate $\mu(t)$ completely describes the exogenous impact on the system, the parameter $\omega$ of the ACD model (11) is not the only factor embodying the exogenous activity, and there is no strict decoupling between the exogenous driver $\omega$ and the endogenous level $\zeta$ as occurs for the parameters $\mu$ and $\eta$ of the Hawkes model. In other words, in contrast to the Hawkes model, the ACD model in its classical form (11) does not provide a clean distinction between exogenous and endogenous activities.

Another difference between the Hawkes and ACD models can be observed in figure 3(D), which presents a residual analysis of the calibration of realizations of the ACD process by the Hawkes model using the Kolmogorov-Smirnov test. The null hypothesis that the realizations of the ACD process are generated by the Hawkes model is rejected at the 5% significance level for $\zeta >$ [...] ($\alpha = \beta$). For cases (ii) ($\beta = 0$) and (iv) ($\alpha = 3\beta$), the null hypothesis is rejected for even lower $\zeta > 0.4$. However, for case (v) ($\beta = 3\alpha$), the null cannot be rejected for almost all values of the control parameter $\zeta$, except for a small interval around $\zeta \approx$ [...]

D. Influence of the memory kernel $h(t)$ of the calibrating Hawkes process

Finally, we need to discuss the choice of the kernel $h(t)$ in the specification of the Hawkes model (3) used in the calibration of the realizations generated with the ACD process. The use of the exponential kernel (6) is a priori justified by the short memory of the ACD(1,1) process. Indeed, the autocorrelation function of the ACD(1,1) model decays exponentially [7], and the same can be shown explicitly for the GARCH(1,1) model [92]. The choice of a short-memory exponential kernel for the Hawkes model ensures Markovian properties with a fast decaying autocorrelation function of the durations [59]. High p-values of the goodness-of-fit tests for parameters $\zeta <$ [...] $k = 3$ and $k = 4$ respectively), we compare them using the Akaike information criterion (AIC) [2]. The AIC is by far the most popular model comparison criterion used in the point process literature [35]. The AIC penalizes complex models by discounting the likelihood function $L$ by the number $k$ of parameters of the model. Specifically, the AIC suggests selecting the model with a minimum AIC value, where $\mathrm{AIC} = 2k - 2 \log L$.

[Insert Table 1 here]

Table 1 gives the results for the realizations presented in figure 2. In terms of likelihood, the exponential and power law kernels give practically identical values ($\log L_{exp} \approx \log L_{pow}$). Penalizing model complexity with the AIC widens the gap, and the exponential kernel, with one fewer parameter, is selected under the AIC.

Notwithstanding their apparent strong difference, the estimated background intensities ($\hat{\mu}$) and branching ratios ($\hat{n}$) are almost the same for both memory kernels. This can be explained by the fact that the parameters $\hat{\varphi}$ and $\hat{c}$ estimated for the power law kernel (5) are such that the latter remains very close to an exponential kernel over a large time interval, as illustrated by figure 6 for case C ($\alpha = 0.13$, $\beta = 0.38$) [...] $\tilde{h}(t) = h(t)/\eta = \tau^{-1} \exp(-t/\tau)$) and the power law kernel $\tilde{h}(t) = c^{\varphi - 1} (\varphi - 1) (t + c)^{-\varphi}$. The corresponding ML estimates of their parameters are respectively $\hat{\tau} = 7.76$, $\hat{\varphi} = 105.17$ and $\hat{c} = 816.41$. The large value of the estimated exponent $\hat{\varphi}$ (of the order of 100) implies a fast decay, similar to an exponential function. Correlatively, the large value of the constant $\hat{c}$ implies the absence of the hyperbolic range (or “long tail”) as well. The almost perfect coincidence is observed for times up to $t \simeq$ [...] $h(t)$ decay by a factor of almost 50. For $t > 50$, for which the relative difference between the two kernels exceeds 20%, the absolute value of $h(t)$ is less than $2 \cdot 10^{-4}$, so that the contribution of time scales beyond $t = 50$ to the total intensity (7), (8) becomes insignificant.

[Insert Figure 6 here]

V. CONCLUSION

The present article positions itself within the neoclassical financial literature that investigates the nature of the mechanisms that drive financial prices. The benchmark, called the “Efficient Market Hypothesis” (EMH) [27, 28, 73, 74], holds that markets only react to external inputs (information flow) and almost instantaneously reflect these inputs in the price dynamics. This purely exogenous view on price formation has been contradicted by many empirical observations (see for instance the original works [16, 76] and the more recent [26, 50, 81]), which show that only a minor fraction of price movements can be explained by relevant news releases. This implies a significant role for internal feedback mechanisms. Using the framework of Hawkes processes, two of us [30, 32] have used the corresponding branching ratio to provide what is, to the best of our knowledge, the first quantitative estimate of the degree of endogeneity in financial markets. This degree of endogeneity is measured as the proportion of price moves resulting from endogenous interactions among the total number of all price moves (including both endogenous interactions and exogenous news). These works provided a solid counter-example of short-term “inefficiency” of financial markets, which was complemented with a similar confirmation from longer time scales [38]. The latter work, though, is subject to a number of numerical biases, as shown in [34], and triggered an ongoing discussion about the nature of long-range memory and criticality.

In this context, the present article expands the quantification of endogeneity to the class of Autoregressive Conditional Duration (ACD) point processes. This is done by the introduction of the composite parameter ζ (20) associated with the parameters α_j and β_k, which control the dependence of the conditional expected duration between events on past realized durations and past conditional expected durations.
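To make the role of ζ concrete, the following minimal Python sketch simulates an ACD(1,1) process and evaluates the composite parameter. It rests on assumptions that are not spelled out in this section: the standard ACD(1,1) recursion ψ_i = ω + α x_{i−1} + β ψ_{i−1} with durations x_i = ψ_i ε_i, unit-mean exponential innovations ε_i, and ζ taken as the sum of the α_j and β_k (for ACD(1,1) this gives ζ = α + β, as implied by the parametrizations (i)–(v) of figure 3). The function names are hypothetical and are not taken from the article or from any package.

```python
import random


def zeta_acd(alphas, betas):
    """Composite parameter, assumed here to be the sum of all alpha_j and beta_k."""
    return sum(alphas) + sum(betas)


def simulate_acd11(omega, alpha, beta, n, seed=0):
    """Return n durations of an ACD(1,1) process, generated in O(n) time.

    Assumed recursion: psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1},
    x_i = psi_i * eps_i, with i.i.d. unit-mean exponential eps_i.
    """
    zeta = zeta_acd([alpha], [beta])
    if omega <= 0 or alpha < 0 or beta < 0 or zeta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0 and alpha + beta < 1")
    rng = random.Random(seed)
    psi = omega / (1.0 - zeta)  # start from the unconditional mean duration
    x = psi
    durations = []
    for _ in range(n):
        psi = omega + alpha * x + beta * psi  # conditional expected duration
        x = psi * rng.expovariate(1.0)        # realized duration
        durations.append(x)
    return durations


if __name__ == "__main__":
    d = simulate_acd11(omega=1.0, alpha=0.1, beta=0.2, n=100_000, seed=1)
    print("zeta =", zeta_acd([0.1], [0.2]))
    print("sample mean duration =", sum(d) / len(d))
    print("omega / (1 - zeta)   =", 1.0 / (1.0 - 0.3))
```

Under these assumptions the unconditional mean duration is ω/(1 − ζ), so the sample mean printed above should lie close to that value, and it grows without bound as ζ approaches 1.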
We have shown that the parameter ζ can be mapped onto the branching ratio η that directly measures the level of endogeneity within the framework of the Hawkes self-excited conditional Poisson model. This result leads to a novel interpretation of the various studies that analyzed high-frequency financial data with the ACD model.

An important conclusion derives from our mapping of the ACD onto the Hawkes process. Both the original works [21, 23, 69] and the more recent studies reviewed in Refs. [24, 65] have reported estimated parameters α_j and β_k that combine to extremely large values of ζ, often larger than 0.5, and up to 0.95. From the perspective offered by the present work and in particular from the mapping of ζ onto η, these empirical findings provide strong support to the hypothesis of a dominant endogenous or “reflexive” [85] component in the dynamics of financial markets.

The present work lends itself to a natural extension beyond point processes to the class of discrete time processes. There are several successful models of self-excitation within a discrete time framework, such as AR (auto-regressive), ARMA (auto-regressive moving average) [36] and GARCH models [8] and their siblings, as well as the recently introduced Self-Excited Multifractal (SEMF) model [31], which extends Quasi-Multifractal models [70, 72] by introducing an explicit feedback mechanism. However, until now, there has been no framework that provides a direct quantification and estimation of the degree of endogeneity present in a given time series. As discussed above, the ACD(p, q) model in fact belongs to the class of GARCH(p, q) models, though not with normally distributed innovations but instead with i.i.d. innovations with a Poisson distribution. By extension, this suggests a direct application of our present findings to GARCH models. This correspondence will benefit from the elaborate toolbox of calibration methods and the detailed accumulated knowledge of the statistical properties of GARCH models [55, 92].

Appendix A: Financial applications of the Hawkes and ACD models

Both Hawkes and ACD-type models belong to a class of point processes and describe stochastic arrival times of events of some kind. Since the key variable of these models is the arrival time, the selection of what defines an event is extremely important both for numerical analysis and for the diagnostic of the exogenous and endogenous mechanisms. In [30], a number of endogenous mechanisms that exist in financial markets are listed, ranging from high frequency trading to behavioural herding at longer time scales. These mechanisms operate on different time scales, and have different magnitudes. Thus, the appropriate events must be defined to capture (and hopefully isolate) the dynamics of the mechanism of interest. Below, we present a non-exhaustive review of modern financial applications of Hawkes and ACD models.

As discussed in the introduction, high-frequency applications of Hawkes and ACD models are by far dominant in modern financial econometrics (see also [41]). In the context of the description of the order-book formation process, events can be naturally defined as a sequence of individual transactions [21, 23, 46] or quotes [22], or, in more detail, as a set of mutually-exciting processes of submission and cancellation of limit orders and submission of market orders [1, 48, 51, 87]. On the aggregate level, the last transaction price change can serve as a proxy for cross-excitation between different markets [3, 4]. Following the modern literature on price impacts (see [10] and references therein), [32] and [30] suggested the mid-quote price as a better proxy for price movements, and mid-quote price changes were used for the estimation of the endogeneity of the price dynamics. In [11] and [5], the co-excitation between market orders and mid-price changes was used to model market impact.

However, applications of self-excited point processes are not limited to microstructure events (that can be defined only using complete order flow or level-1 tick data). In the case of regularly spaced discrete time series (such as minutely, hourly or daily price dynamics), events can be defined as some kind of “extremes” in the dynamics. The most standard way (see for instance [13, 19] with respect to applications to daily data) defines events using the “peak-over-threshold” concept: for a given dynamics of financial returns, one selects those returns that fall outside a selected quantile range (for example 10%–90%). The resulting irregularly spaced point process can then be calibrated using the Hawkes or ACD model. A more accurate approach should account for potential changes of regime and volatility clustering, and thus should use local extreme detection methods, such as the realized bipower variation [6] ([9] apply this method to model co-jumps in time series of 1-minute returns).

Another interesting, but not yet explored, application of point process models involves detecting regime (or trend) changes in price dynamics and defining a point process using turning points. The simplest way is to define local minima and maxima at a fixed time scale in the discrete time series, and use these extrema to construct a point process. More accurate trend detection would involve local volatility estimation, such as the method of drawup and drawdown detection (consecutive positive or negative price changes) discussed in [49]. However, one needs to be warned that: (i) most trend detection methods are not causal and require information about the future price dynamics, thus they are not well-suited for forecasting purposes; and (ii) all these methods are based on conditional statistics that should be treated carefully in order to avoid spurious phenomena even in featureless processes [33]. A general recommendation is to always consider one or several well-known processes (such as the uncorrelated random walk) and first apply the new method to these known processes to check whether the event-defining procedure introduces some spurious endogeneity.

Finally, in modeling both micro- and macro-structure of financial time series, the magnitude of events (size of orders, size and sign of price changes or jumps) can be relevant. In this case, a marked Hawkes model may be considered in which the size of the event determines its expected number of offspring, such as in the ETAS model for earthquakes for which the marks are the earthquake magnitudes [62, 90, 91].
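The “peak-over-threshold” event definition and the recommended check on a featureless process can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the cited works: the benchmark is a series of i.i.d. Gaussian returns (the increments of an uncorrelated random walk), the quantile is a simple linearly interpolated empirical quantile, and the lag-1 autocorrelation of the inter-event durations is used as one crude indicator of spurious clustering. All function names are hypothetical.

```python
import random


def empirical_quantile(values, q):
    """Linearly interpolated empirical quantile for 0 <= q <= 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def peak_over_threshold_events(returns, lower_q=0.10, upper_q=0.90):
    """Indices of the returns that fall outside the [lower_q, upper_q] quantile range."""
    lo = empirical_quantile(returns, lower_q)
    hi = empirical_quantile(returns, upper_q)
    return [i for i, r in enumerate(returns) if r < lo or r > hi]


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


if __name__ == "__main__":
    rng = random.Random(42)
    # Featureless benchmark: i.i.d. Gaussian returns, i.e. no endogeneity by construction.
    returns = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
    events = peak_over_threshold_events(returns)
    durations = [b - a for a, b in zip(events, events[1:])]
    print("fraction of events:", len(events) / len(returns))
    print("lag-1 autocorrelation of durations:", lag1_autocorrelation(durations))
```

For the benchmark series, the durations between events should show no appreciable autocorrelation; a clearly non-zero value obtained with a different event definition on the same benchmark would point to spurious endogeneity introduced by that definition.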

Appendix B: Finite sample bias of the Hawkes maximum likelihood estimator

In order to optimize the calibration of the Hawkes model on the ACD(1,1), we study the finite sample bias and efficiency of the Hawkes maximum likelihood estimator. For this, we have simulated realizations of the Hawkes process with a modified thinning procedure [53, 61] implemented in the same “PtProcess” package [40], and afterwards we have calibrated the Hawkes model on this synthetic data. It should be noted that simulation (and fitting [23]) of the ACD model is computationally easier than for the Hawkes model. Indeed, simulation of the Hawkes process with the thinning algorithm has complexity O(N²) (with the possibility to reduce it to O(N log N) [57, 58]), compared with a complexity of only O(N) for the ACD(1,1) model.

[Insert Figure 7 here]

We swept the parameter η in the range [0, [...]] [...] μ = 1 and τ = 1. We generated 100 realizations of the Hawkes process of size 3500 each. To reduce the edge effects of the thinning algorithm, we discarded the first 500 points of each realization and afterwards calibrated the parameters of the Hawkes model on these realizations of length 3000. Figure 7 illustrates the bias and efficiency of the maximum likelihood estimator in our framework. The definition of the Hawkes model (3) requires the kernel h(t) to be always positive. This implies η ≥ 0, so the estimation of η is expected to have a positive bias for small values, as seen in figure 7. On the other hand, when η approaches the critical value of 1 from below, the memory of the system increases dramatically and, for the critical state η = 1, the memory becomes infinite. Thus, for a realization of limited length, the finite size will play a very important role and will result in a systematic negative bias for η ≲ 1. This reasoning is supported by the evidence presented in figure 7, where one observes a large systematic bias for η > 0.9. For values of the branching ratio not too close to 0 or 1, the bias is very small for almost all reasonable realization lengths (longer than 200 to 400 points). We also find that the bias for η close to 1 strongly depends on the realization length. Finally, figure 7 illustrates the high efficiency of the maximum likelihood estimator: for values of η < [...] |η̂ − η| measured with the 90% quantile ranges does not exceed 0.[...]

[1] Abergel, F., & Jedidi, A. (2011). A Mathematical Approach to Order Book Modelling. In Econophysics of Order-Driven Markets (pp. 93–107). Springer Verlag.
[2] Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, (6), 716–723.
[3] Bacry, E., Delattre, S., Hoffmann, M., & Muzy, J.-F. (2011). Modeling microstructure noise using Hawkes processes. Proceedings of the ICASSP 2011 (pp. 5740–5743).
[4] Bacry, E., Delattre, S., Hoffmann, M., & Muzy, J.-F. (2013). Modeling microstructure noise with mutually exciting point processes. Quantitative Finance, (1), 65–77.
[5] Bacry, E., & Muzy, J.-F. (2013). Hawkes model for price and trades high-frequency dynamics.
[6] Barndorff-Nielsen, O., & Shephard, N. (2004). Power and bipower variation with stochastic volatility and jumps. Journal of Financial Econometrics, (1), 1.
[7] Bauwens, L., & Hautsch, N. (2009). Modelling Financial High Frequency Data Using Point Processes. In T. Mikosch, J.-P. Kreiß, R. A. Davis, & T. G. Andersen (Eds.), Handbook of Financial Time Series (pp. 953–979). Springer.
[8] Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, (3), 307–327.
[9] Bormetti, G., Calcagnile, L. M., Treccani, M., Corsi, F., Marmi, S., & Lillo, F. (2013). Modelling systemic cojumps with Hawkes factor models.
[10] Bouchaud, J.-P., Farmer, J. D., & Lillo, F. (2009). How markets slowly digest changes in supply and demand. In Handbook of Financial Markets: Dynamics and Evolution (pp. 57–160). Amsterdam: North Holland.
[11] Bowsher, C. G. (2007). Modelling security market events in continuous time: Intensity based, multivariate point process models. Journal of Econometrics, (2), 876–912.
[12] Brémaud, P., & Massoulié, L. (1996). Stability of nonlinear Hawkes processes. The Annals of Probability, (3), 1563–1588.
[13] Chavez-Demoulin, V., Davison, A. C., & McNeil, A. J. (2005). Estimating value-at-risk: a point process approach. Quantitative Finance, (2), 227–234.
[14] Cont, R. (2011). Statistical Modeling of High Frequency Financial Data: Facts, Models and Challenges. IEEE Signal Processing, (5), 16–25.
[15] Crane, R., & Sornette, D. (2008). Robust dynamic classes revealed by measuring the response function of a social system. Proceedings of the National Academy of Sciences of the United States of America, (41), 15649–15653.
[16] Cutler, D. M., Poterba, J. M., & Summers, L. H. (1987). What moves stock prices? Journal of Portfolio Management, (3), 4–12.
[17] Dennis, J. E., & Schnabel, R. B. (1987). Numerical Methods for Unconstrained Optimization and Nonlinear Equations, vol. 16 of Classics in Applied Mathematics. Society for Industrial Mathematics.
[18] Deschâtres, F., & Sornette, D. (2005). Dynamics of book sales: Endogenous versus exogenous shocks in complex networks. Physical Review E, (1), 016112.
[19] Embrechts, P., Liniger, T., & Lu, L. (2011). Multivariate Hawkes Processes: an Application to Financial Data. J. Appl. Probab., 367–378.
[20] Engle, R. F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica: Journal of the Econometric Society, (4), 987–1007.
[21] Engle, R. F. (2000). The Econometrics of Ultra-High-Frequency Data. Econometrica: Journal of the Econometric Society, (1), 1–22.
[22] Engle, R. F., & Russell, J. R. (1997). Forecasting the frequency of changes in quoted foreign exchange prices with the autoregressive conditional duration model. Journal of Empirical Finance, (2-3), 187–212.
[23] Engle, R. F., & Russell, J. R. (1998). Autoregressive Conditional Duration: A New Model for Irregularly Spaced Transaction Data. Econometrica: Journal of the Econometric Society, (5), 1127–1162.
[24] Engle, R. F., & Russell, J. R. (2009). Analysis of High Frequency Financial Data. In Handbook of Financial Econometrics (pp. 383–426). North Holland.
[25] Errais, E., Giesecke, K., & Goldberg, L. R. (2010). Affine Point Processes and Portfolio Credit Risk. SIAM Journal on Financial Mathematics, (1), 642.
[26] Fair, R. C. (2002). Events That Shook the Market. Journal of Business, (4), 713–731.
[27] Fama, E. F. (1970). Efficient Capital Markets: A Review of Theory and Empirical Work. The Journal of Finance, (2), 383–417.
[28] Fama, E. F. (1991). Efficient capital markets: II. Journal of Finance, (5), 1575–1617.
[29] Fernandes, M., & Grammig, J. (2006). A family of autoregressive conditional duration models. Journal of Econometrics, (1), 1–23.
[30] Filimonov, V., Bicchetti, D., Maystre, N., & Sornette, D. (2014). Quantification of the High Level of Endogeneity and of Structural Regime Shifts in Commodity Markets. Journal of International Money and Finance, 174–192.
[31] Filimonov, V., & Sornette, D. (2011). Self-excited multifractal dynamics. Europhysics Letters, (4), 46003.
[32] Filimonov, V., & Sornette, D. (2012). Quantifying reflexivity in financial markets: Toward a prediction of flash crashes. Physical Review E, (5), 056108.
[33] Filimonov, V., & Sornette, D. (2012). Spurious trend switching phenomena in financial markets. The European Physical Journal B - Condensed Matter and Complex Systems, (5), 155.
[34] Filimonov, V., & Sornette, D. (2013). Apparent criticality and calibration issues in the Hawkes self-excited point process model: application to high-frequency financial data. Swiss Finance Institute Research Paper No. 13-60.
[35] Guttorp, P., & Thorarinsdottir, T. L. (2012). Bayesian Inference for Non-Markovian Point Processes. In E. Porcu, J. M. Montero, & M. Schlather (Eds.), Advances and Challenges in Space-time Modelling of Natural Events (pp. 79–102). Berlin, Heidelberg: Springer Berlin Heidelberg.
[36] Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press.
[37] Hamilton, J. D., & Jorda, O. (2002). A Model for the Federal Funds Rate Target. Journal of Political Economy, 1135–1167.
[38] Hardiman, S. J., Bercot, N., & Bouchaud, J.-P. (2013). Critical reflexivity in financial markets: a Hawkes process analysis. The European Physical Journal B - Condensed Matter and Complex Systems, (10), 442.
[39] Harris, T. E. (2002). The Theory of Branching Processes. Dover Phoenix Editions.
[40] Harte, D. (2010). PtProcess: An R package for modelling marked point processes indexed by time. Journal of Statistical Software, (8), 1–32.
[41] Hautsch, N. (2012). Econometrics of Financial High-Frequency Data. Berlin, Heidelberg: Springer Berlin Heidelberg.
[42] Hawkes, A. G. (1971). Point Spectra of Some Mutually Exciting Point Processes. Journal of the Royal Statistical Society. Series B (Methodological), (3), 438–443.
[43] Hawkes, A. G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika, (1), 83–90.
[44] Hawkes, A. G., & Oakes, D. (1974). A Cluster Process Representation of a Self-Exciting Process. Journal of Applied Probability, (3), 493–503.
[45] Helmstetter, A., & Sornette, D. (2003). Importance of direct and indirect triggered seismicity in the ETAS model of seismicity. Geophysical Research Letters, (11), 1576.
[46] Hewlett, P. (2006). Clustering of order arrivals, price impact and trade path optimisation. In Workshop on Financial Modeling with Jump processes, Ecole Polytechnique.
[47] Jasiak, J. (1998). Persistence in intratrade durations. Financial Analysts Journal, 166–195.
[48] Jedidi, A., & Abergel, F. (2013). On the Stability and Price Scaling Limit of a Hawkes Process-Based Order Book Model.
[49] Johansen, A., & Sornette, D. (2001). Large stock market price drawdowns are outliers. Journal of Risk, (2), 69–110.
[50] Joulin, A., Lefevre, A., Grunberg, D., & Bouchaud, J.-P. (2008). Stock price jumps: news and volume play a minor role. Wilmott Magazine, Sep/Oct, 46.
[51] Large, J. (2007). Measuring the resiliency of an electronic limit order book. Journal of Financial Markets, (1), 1–25.
[52] Lewis, E., Mohler, G. O., Brantingham, P. J., & Bertozzi, A. L. (2012). Self-exciting point process models of civilian deaths in Iraq. Security Journal, 244–264.
[53] Lewis, P. A. W., & Shedler, G. S. (1979). Simulation of nonhomogeneous poisson processes by thinning. Naval Research Logistics Quarterly, (3), 403–413.
[54] Marsan, D., & Lengline, O. (2008). Extending Earthquakes’ Reach Through Cascading. Science, (5866), 1076–1079.
[55] Mikosch, T., Kreiß, J.-P., Davis, R. A., & Andersen, T. G. (Eds.) (2009). Handbook of Financial Time Series. Springer, 1 ed.
[56] Mohler, G. O., Short, M. B., Brantingham, P. J., Tita, G. E., & Schoenberg, F. P. (2011). Self-Exciting Point Process Modeling of Crime. Journal of the American Statistical Association, (493), 100–108.
[57] Møller, J., & Rasmussen, J. G. (2005). Perfect simulation of Hawkes processes. Advances in Applied Probability, (3), 629–646.
[58] Møller, J., & Rasmussen, J. G. (2006). Approximate Simulation of Hawkes Processes. Methodology and Computing in Applied Probability, (1), 53–64.
[59] Oakes, D. (1975). The Markovian Self-Exciting Process. Applied Probability Trust, (1), 69–77.
[60] Ogata, Y. (1978). The asymptotic behaviour of maximum likelihood estimators for stationary point processes. Annals of the Institute of Statistical Mathematics, (1), 243–261.
[61] Ogata, Y. (1981). On Lewis’ simulation method for point processes. IEEE Transactions on Information Theory, (1), 23–31.
[62] Ogata, Y. (1988). Statistical models for earthquake occurrences and residual analysis for point processes. Journal of the American Statistical Association, (401), 9–27.
[63] Osorio, I., Lyubushin, A., & Sornette, D. (2014). Self-exciting component of brain activity. Working paper.
[64] Ozaki, T. (1979). Maximum likelihood estimation of Hawkes’ self-exciting point processes. Annals of the Institute of Statistical Mathematics, (1), 145–155.
[65] Pacurar, M. (2008). Autoregressive Conditional Duration (ACD) Models in Finance: A Survey of the Theoretical and Empirical Literature. Journal of Economic Surveys, (4), 711–751.
[66] Papangelou, F. (1972). Integrability of Expected Increments of Point Processes and a Related Random Change of Scale. Transactions of the American Mathematical Society, 483–506.
[67] Reynaud-Bouret, P., & Schbath, S. (2010). Adaptive estimation for Hawkes processes; application to genome analysis. The Annals of Statistics, (5), 2781–2822.
[68] Ruelle, D. (2004). Conversations on nonequilibrium physics with an extraterrestrial. Physics Today, (5), 48–53.
[69] Russell, J. R. (1999). Econometric Modeling of Multivariate Irregularly-Spaced High-Frequency Data. Working Paper, University of Chicago (pp. 1–40).
[70] Saichev, A., & Filimonov, V. (2008). Numerical Simulation of the Realizations and Spectra of a Quasi-Multifractal Diffusion Process. JETP Letters, (9), 506–510.
[71] Saichev, A., Helmstetter, A., & Sornette, D. (2005). Anomalous Scaling of Offspring and Generation Numbers in Branching Processes. Pure and Applied Geophysics, 1113–1134.
[72] Saichev, A., & Sornette, D. (2006). Generic multifractality in exponentials of long memory processes. Physical Review E, (1), 011111+.
[73] Samuelson, P. A. (1965). Proof That Properly Anticipated Prices Fluctuate Randomly. Industrial Management Review, 41–49.
[74] Samuelson, P. A. (1973). Proof That Properly Discounted Present Values of Assets Vibrate Randomly. The Bell Journal of Economics and Management Science, (2), 369–374.
[75] Schnabel, R. B., Koonatz, J. E., & Weiss, B. E. (1986). A modular system of algorithms for unconstrained minimization. ACM Transactions on Mathematical Software, (4), 419–440.
[76] Shiller, R. J. (1981). Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends? The American Economic Review, (3), 421–436.
[77] Shorvon, S., Guerrini, R., Cook, M., & Lhatoo, S. (Eds.) (2012). Oxford Textbook of Epilepsy and Epileptic Seizures. Oxford Textbooks in Clinical Neurology. OUP Oxford.
[78] Sornette, D. (2003). Why Stock Markets Crash: Critical Events in Complex Financial Systems. Princeton University Press.
[79] Sornette, D. (2006). Endogenous versus Exogenous Origins of Crises. In S. Albeverio, V. Jentsch, & H. Kantz (Eds.), Extreme events in nature and society (pp. 95–119). Berlin, Heidelberg: Springer Berlin Heidelberg.
[80] Sornette, D., Deschâtres, F., Gilbert, T., & Ageon, Y. (2004). Endogenous Versus Exogenous Shocks in Complex Networks: An Empirical Test Using Book Sale Rankings. Physical Review Letters, (22), 228701.
[81] Sornette, D., Malevergne, Y., & Muzy, J.-F. (2003). What causes crashes? Risk, (2), 67–71.
[82] Sornette, D., & Osorio, I. (2010). Prediction In Epilepsy: The Intersection of Neurosciences, Biology, Mathematics, Physics and Engineering (pp. 203–237). CRC Press, Taylor & Francis Group.
[83] Sornette, D., & Ouillon, G. (2005). Multifractal Scaling of Thermally Activated Rupture Processes. Physical Review Letters, (3), 038501+.
[84] Sornette, D., & Utkin, S. (2009). Limits of declustering methods for disentangling exogenous from endogenous events in time series with foreshocks, main shocks, and aftershocks. Physical Review E, (6), 061110.
[85] Soros, G. (1987). The Alchemy of Finance: Reading the Mind of the Market. NY: John Wiley & Sons.
[86] Stratonovich, R. L. (1992). Nonlinear Nonequilibrium Thermodynamics I: Linear and Nonlinear Fluctuation-Dissipation Theorems. Springer-Verlag.
[87] Toke, I. M. (2011). “Market making” in an order book model and its impact on the spread. In Econophysics of Order-Driven Markets (pp. 49–64). Springer Verlag.
[88] Utsu, T. (1961). A statistical study of the occurrence of aftershocks. Geophysical Magazine, 521–605.
[89] Utsu, T., & Ogata, Y. (1995). The centenary of the Omori formula for a decay law of aftershock activity. Journal of Physics of the Earth, (1), 1–33.
[90] Vere-Jones, D. (1970). Stochastic Models for Earthquake Occurrence. Journal of the Royal Statistical Society. Series B (Methodological), (1), 1–62.
[91] Vere-Jones, D., & Ozaki, T. (1982). Some examples of statistical estimation applied to earthquake data I. Cyclic Poisson and self-exciting models. Annals of the Institute of Statistical Mathematics, (1), 189–207.
[92] Zakoian, J.-M., & Francq, C. (2010). GARCH Models: Structure, Statistical Inference and Financial Applications. Oxford: Wiley-Blackwell.
[93] Zhuang, J., Ogata, Y., & Vere-Jones, D. (2002). Stochastic declustering of space-time earthquake occurrences. Journal of the American Statistical Association, (458), 369–380.

TABLE 1: Estimated parameters of the Hawkes model with exponential (6) and power law (5) kernels together with values of log-likelihood (log L_exp and log L_pow) and Akaike information criterion (AIC_exp and AIC_pow) for cases presented in figure 2. Bold font identifies the lowest AIC value among the two models.

Case | α | β | θ̂_H,exp = (μ̂, n̂, τ̂) | θ̂_H,pow = (μ̂, n̂, ĉ, ϕ̂) | log L_exp | log L_pow | AIC_exp | AIC_pow
A | 0.05 | 0.05 | (0.84, 0.07, 4.3) | (0.83, 0.10, 262.77, 73.13) | [...] | [...] | [...] | [...]
[...]

FIG. 1: Illustration of the branching structure of the Hawkes process (top) and events on the time axis (bottom). This figure corresponds to a branching ratio n = 0.[...]

FIG. 2: Realizations of the durations and conditional intensities of the ACD(1,1) process (left column), and Hawkes process (right column) simulated with parameters obtained by calibrating to the realization of the ACD process. Parameters of the ACD process θ_ACD = (ω, α, β) and estimated parameters of the Hawkes model θ̂_H = (μ̂, η̂, τ̂) are the following: (A) θ_ACD = (1, [...]), θ̂_H = ([...]); (B) θ_ACD = (1, [...]), θ̂_H = ([...]); (C) θ_ACD = (1, [...]), θ̂_H = ([...]); and (D) θ_ACD = (1, [...]), θ̂_H = ([...]).

FIG. 3: Results of the calibration of the Hawkes model on the ACD(1,1) realizations. Estimated (A) background intensity μ̂, (B) branching ratio η̂, and (C) characteristic time of the kernel τ̂. Panel (D) shows the p-value from the goodness-of-fit test, where the dashed line indicates the 10% level. (i) α = β (= ζ/2), (ii) β = 0 (α = ζ), (iii) α = 0 (β = ζ), (iv) α = 3β (= 3ζ/4) and (v) β = 3α (= 3ζ/4). [...] (α = β), the shaded area to the 95% quantile range for case (i), and the dotted lines depict mean p-values for cases (ii) β = 0, (iii) α = 0, (iv) α = 3β and (v) β = 3α.

FIG. 4: The estimated branching ratio η̂ of the Hawkes model estimated on ACD(1,1) realizations with correction for the finite sample estimation bias determined in Appendix B.

FIG. 5: Contour plot of the estimated branching ratio η̂(α, β) of the Hawkes model calibrated to ACD(1,1) realizations for a grid of values α and β with α + β ≤ 1, corrected for the finite sample estimation bias determined in Appendix B. The dashed line delineates the region where the goodness-of-fit test rejects the null hypothesis (see text).

FIG. 6: Comparison of the exponential kernel with parameter τ = 7.76 and the power law kernel with parameters ϕ = 105.17 and c = 816.41 (dashed line).

FIG. 7: Illustrations of the finite sample bias and variance of the maximum likelihood estimator [64] of the parameters of the Hawkes process calibrated on time series generated by the Hawkes process itself (no model error). Panel (A): difference between the estimates of the background intensity μ̂ and the true value μ used for the generation of the time series; Panel (B): difference between the estimates of the branching ratio η̂ and the true value η; Panel (C): difference between the estimates of the characteristic time of the kernel τ̂ and the true value τ
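The kernel comparison of figure 6 can be checked numerically with a short Python sketch. It assumes the normalized forms of the two kernels, namely τ^(−1) exp(−t/τ) for the exponential kernel and (ϕ − 1) c^(ϕ−1) (t + c)^(−ϕ) for the power law kernel, and uses the ML estimates quoted in the text (τ̂ = 7.76, ϕ̂ = 105.17, ĉ = 816.41). The function names are hypothetical, and the power law is evaluated in the algebraically equivalent form ((ϕ − 1)/c)(1 + t/c)^(−ϕ) purely to avoid overflow for large ϕ.

```python
import math


def exp_kernel(t, tau):
    """Normalized exponential kernel: (1/tau) * exp(-t/tau)."""
    return math.exp(-t / tau) / tau


def power_law_kernel(t, c, phi):
    """Normalized power law kernel: (phi - 1) * c**(phi - 1) * (t + c)**(-phi).

    Evaluated as ((phi - 1) / c) * (1 + t/c)**(-phi), which is the same
    expression but stays within floating-point range for large phi.
    """
    return (phi - 1.0) / c * math.exp(-phi * math.log1p(t / c))


def relative_difference(t, tau, c, phi):
    """Relative difference of the power law kernel with respect to the exponential one."""
    h_exp = exp_kernel(t, tau)
    return abs(power_law_kernel(t, c, phi) - h_exp) / h_exp


if __name__ == "__main__":
    tau, phi, c = 7.76, 105.17, 816.41  # ML estimates quoted in the text
    for t in (0.0, 10.0, 30.0, 50.0):
        print(t, exp_kernel(t, tau), power_law_kernel(t, c, phi),
              relative_difference(t, tau, c, phi))
```

With these parameter values the two normalized kernels should differ by about 1% at t = 0 and by roughly 20% at t = 50, where the exponential kernel has already decayed to about 2·10^(−4).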