Data Analytics Driven Controlling: bridging statistical modeling and managerial intuition
Kainat Khowaja, Danial Saef, Sergej Sizov, Wolfgang Karl Härdle
Abstract
Strategic planning in a corporate environment is often based on experience and intuition, although internal data is usually available and can be a valuable source of information. Predicting merger & acquisition (M&A) events is at the heart of strategic management, yet not sufficiently motivated by data analytics driven controlling. One of the main obstacles in using e.g. count data time series for M&A seems to be the fact that the intensity of M&A is time varying, at least in certain business sectors, e.g. communications. We propose a new automatic procedure to bridge this obstacle using novel statistical methods. The proposed approach allows for a selection of adaptive windows in count data sets by detecting significant changes in the intensity of events. We test the efficacy of the proposed method on a simulated count data set and put it into action on various M&A data sets. It is robust to aberrant behaviour and generates accurate forecasts for the evaluated business sectors. It also provides guidance for an a-priori selection of fixed windows for forecasting. Furthermore, it can be generalized to other business lines, e.g. for managing supply chains, sales forecasts, or call center arrivals, thus giving managers new ways of incorporating statistical modeling in strategic planning decisions.

∗ This research was supported by the Deutsche Forschungsgesellschaft through the International Research Training Group 1792 "High Dimensional Nonstationary Time Series". This research is based upon work from COST Action 19130, supported by COST (European Cooperation in Science and Technology). Danial Saef acknowledges the support of PwC.
† Humboldt-Universität zu Berlin, International Research Training Group 1792, Spandauer Str. 1, 10178 Berlin, Germany. Email: [email protected]
‡ Humboldt-Universität zu Berlin, International Research Training Group 1792, Spandauer Str. 1, 10178 Berlin, Germany. PwC, Germany. Email: [email protected]
§ PricewaterhouseCoopers, Germany.
Email: [email protected]
¶ Humboldt-Universität zu Berlin, BRC Blockchain Research Center, Berlin; Sim Kee Boon Institute, Singapore Management University, Singapore; WISE Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China; National Chiao Tung University, Dept Information Science and Finance, Hsinchu, Taiwan, ROC; Charles University, Dept Mathematics and Physics, Prague, Czech Republic; Grant CAS: XDA 23020303 and DFG IRTG 1792 gratefully acknowledged. Email: [email protected]

Introduction
Data driven insights can improve corporate decision making. In large organizations, a variety of financial data is available due to reporting requirements and organizational purposes. However, expertise in the field of data analytics is scarce. New methods for robotic data evaluation can help organizations cut costs by shifting resources away from manual tasks and towards tasks that require supervision. Managers can derive valuable insights from forecasts based on internal company data to deal with common problems in a corporate environment such as demand forecasting for supply chain planning (Yelland et al., 2010), call center arrival times (Taylor, 2007, 2011; Oreshkin et al., 2016), sales forecasting (Kolsarici and Vakratsas, 2015), or mergers & acquisitions (M&A) forecasting (Very et al., 2012). However, the available datasets are usually subject to non-stationarity and structural breaks, and fitting a meaningful model to them usually requires manual effort. To deal with such problems, experts employ techniques for change point detection or for finding stable parameter windows. To the best of our knowledge, no combination of such techniques has been explored in a framework for count data. To address this gap, we propose a method that automatically detects locally homogeneous time windows and the corresponding parameters, which can then be used to generate point or density forecasts.

As a motivating example we apply this newly developed algorithm to forecast M&A intensity in different industries. M&A are especially interesting as they frequently occur in different markets and industries, and are relevant both to the financial industry, which generates revenue by supporting their execution, and to companies observing their own industry and their competitors. Recent approaches use time series models, as in Very et al. (2012), or a revealed preference model, as in Akkus et al. (2015).
Figure 1 shows an example data set of mergers and acquisitions in the German energy market that illustrates the presence of non-stationarity and structural breaks in such time series.

Figure 1: Time series of the count of mergers and acquisitions per month, with moving average curves of 1 year and 3 years.

Empirical evidence strongly suggests that mergers are often clustered in time as waves, see Martynova and Renneboog (2005), Harford (2005) and Maksimovic et al. (2013). Ahern and Harford (2014) find that merger activity is subject to network effects and that these waves largely occur within industries, but can also be transmitted to connected industries. Furthermore, shocks of any kind, even if they lead to merger waves, are difficult to predict. Wave patterns seem to be heterogeneous and differ both in time and across industries. Following their argumentation, we conclude that data on M&A should be evaluated per industry and geographic location due to differences in regulation, innovation power, technology, and stock markets.

Predictive models for M&A intensity could be identified by aggregating acquired knowledge and tailored to specific industries and markets. Alternatively, time series models, e.g. ARMA, can be used. While a covariate-based model provides explainability to the user, it requires manual effort to gather data and incorporate expert knowledge to define relevant variables and calibrate their impact on a predictive model. Data gathering can be time-consuming and expensive, and modeling heterogeneous industries in a specific application requires domain knowledge, which is often scarce and narrowed to said application. Time series models are an alternative, as they can be adapted to any other data set with a comparable structure.
However, such models need to be robust to the non-stationarity, structural breaks and wave patterns that limit their predictive power.

There is no doubt that time varying time series models approximate the dynamics of the underlying series better than any homogeneous parameter approach, e.g. a fixed ARMA(p, q) model. Therefore, we employ an adaptive estimation method called the Local Parametric Approach (LPA). Its quantitative implementation was first proposed in Spokoiny (1998); advances are made in Mercurio and Spokoiny (2004) and Spokoiny (2009). It helps us to find locally homogeneous time intervals with stable parameters and guarantees a trade-off between parameter variance and modeling bias. The technique is based on a series of likelihood ratio tests to determine assumed but unknown change points in the underlying series. As a result one finds local intervals of homogeneity and efficient estimates at each point in time.

Since we are often dealing with small sample sizes, and the test statistic distribution is unknown, we need a method to approximate this distribution. Recent advances in bootstrapping methodology allow us to generate confidence sets and critical values that non-asymptotically approximate the true distribution. Here, we couple LPA with the multiplier bootstrap (MBS) (Spokoiny and Zhilova, 2015) for approximating a critical value for the testing procedure. MBS builds up on the wild bootstrap, which originates from Wu (1986) and Beran (1986). An important application is reported in Härdle and Mammen (1993), and Mammen (1993). Further advancements are made in Chatterjee and Bose (2005) and Arlot et al. (2010). Notable publications that precede Spokoiny and Zhilova (2015) are Bücher and Dette (2013) and Chernozhukov et al. (2013). Klochkov et al. (2019) present an application in the context of a conditional autoregressive Value at Risk model.
We generalize this LPA idea to any data that is Poisson jump distributed, although our simulation study indicates that these assumptions could be relaxed to general membership of any of the exponential families. This indication is useful in bridging business requirements with the robustness of novel statistical methods through flexible automated estimation, since many problems in strategic management, e.g. forecasting M&A intensity, are subject to data sets for which strict assumptions on the underlying distribution, stationarity, or absence of structural breaks may not be fulfilled.

We detect locally homogeneous windows by computing a non-parametric likelihood ratio statistic. This approach is related to the branch of change point detection methods. Chen and Gupta (2011), Eckley et al. (2011), and Aminikhanghahi and Cook (2017) summarize and evaluate diverse methods of change point detection. Notable approaches are Hinkley and Hinkley (1970), Hsu (1979), Haccou et al. (1987), and Chen and Gupta (1999), which propose change point detection methods for gamma, exponentially, and normally distributed data respectively. Kutoyants and Spokoiny (1999) propose an adaptive procedure as well as theoretical properties for Poisson distributed data. Chen and Gupta (2011) provide the null distribution of a likelihood ratio test for Poisson distributed random variables, but do not evaluate the efficiency of the procedure on real data.

Both change point detection and homogeneous window approaches can serve as determinants of an optimal forecasting window. Recent approaches to optimal window selection are Giraitis et al. (2013), Pesaran et al. (2013), and Inoue et al. (2017). They address parameter instability and frequent structural breaks and indicate that adaptive window selection is preferable to choosing fixed window sizes, such as a 1 year or 3 year moving average.
Since we aim at developing a generally applicable method that is robust to different data characteristics, we need an adaptive window. Hence, we pursue a nonparametric approach that is independent of knowledge about or assumptions on the dataset, except that the values are generated by a Poisson process with smooth but time varying intensity over some unknown time window.

Our approach extends the previous literature as it serves as a generic toolbox that could easily be adapted to other applications, gives density forecasts that can (but do not have to) be adjusted by incorporating knowledge from industry experts, and is adaptable to arbitrary frequencies. Although we show how to forecast the density of M&A, our methodology could also extend research in other areas that typically use Poisson processes.

The remainder of this paper is structured as follows: Section 2 describes the algorithm, which is based on a combination of LPA and MBS put into action in an iterative procedure. Section 3 contains an experimental study. We describe the evaluation method, verify the robustness of the presented algorithm in simulation scenarios, apply it empirically on a dataset of mergers & acquisitions and show how it can be used to generate forecasts. Section 4 presents key results, such as robustness to diverse data inputs and adaptability to other applications. Section 5 discusses limitations, such as in the evaluation approach or computational costs, and suggests next steps like density forecasting, introducing a judgemental component and extending the test statistic. All numerical algorithms can be found on Quantlet.de.
Methodology
Planning processes in corporate environments are based on internal financial data of different kinds. Traditionally, these problems have been solved using diverse time series models. However, it is difficult to use them since real time series are often non-stationary and have structural breaks. Automated analyses can be beneficial as they make modelling easier. To contribute to solving this problem, we focus on detecting locally homogeneous intervals with stable parameters. To be more specific, we focus on a count data model where a time varying intensity determines a Poisson process. Take again figure 1 as an example. We aim to detect the years '95-'97 as a structural break and thereby capture the non-stationary component (a slight uptrend is observable). We find locally homogeneous windows, verifying that the procedure is working, and show how it can be used to obtain density forecasts.
Let $Y_t \in \mathbb{N}$, $t = 0, \ldots, T$ be a count data time series such as the count of M&A series, see figure 1. Think of $Y_t \sim \mathrm{Poisson}(\theta)$, where $\theta$ represents the rate or average number of occurrences in a fixed interval. Since we allow for time variation in our model, for any interval $I = [a, b]$ with $a < b$ and $a, b \in \{0, \ldots, T\}$, we write $(Y_t)_{t \in I} \sim \mathrm{Poisson}(\theta)$. The log likelihood function on $I$ is:

$$L_I(\theta) = \sum_{t \in I} \log\left(\theta^{Y_t} e^{-\theta} / Y_t!\right) = \log\theta \sum_{t \in I} Y_t - \sum_{t \in I} \theta - \sum_{t \in I} \log(Y_t!) \qquad (1)$$

The MLE $\tilde\theta_I$ based on the observations in $I$ is:

$$\tilde\theta_I \stackrel{\mathrm{def}}{=} \operatorname*{argmax}_{\theta \in \Theta} L_I(\theta) \qquad (2)$$

which for a Poisson model is the sample mean.

LPA, first introduced by Spokoiny (1998), is based on the phenomenon that a series of locally parametric models can describe the features of a time series better than a global parametric model. The basic idea is that, given a time series and a model for its dynamics, one finds locally stationary intervals of the time series in an online fashion. This is done by finding the set of most recent observations such that the model parameters are approximately stable in that interval. This set of time points is called the interval of homogeneity. Employing the same procedure at each point in time, one locally estimates the parameter (Härdle et al., 2015). The merit of LPA is that it does not require an explicit expression of the law of the dynamics of the parameter, but only assumes that the parameter is constant on some unknown time interval in the past (Spokoiny, 2009).

In order to check the homogeneity of an interval $I = [a, b]$, LPA looks for some break point $\tau \in (a, b)$ such that $A_\tau = [a, \tau)$ has one parameter and $B_\tau = [\tau, b]$ has another parameter.
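The two ingredients of this test, the interval log-likelihood (1) and the MLE (2), take only a few lines of code. A minimal sketch (the function names are ours; `lgamma(v + 1)` gives $\log(v!)$):

```python
from math import lgamma, log

def poisson_loglik(y, theta):
    # equation (1): log(theta) * sum(Y_t) - |I| * theta - sum(log(Y_t!))
    s = sum(y)
    ll = s * log(theta) if s > 0 else 0.0   # convention 0 * log(0) = 0
    return ll - len(y) * theta - sum(lgamma(v + 1) for v in y)

def poisson_mle(y):
    # equation (2): for the Poisson model the maximizer is the sample mean
    return sum(y) / len(y)
```

For example, for counts `[2, 0, 3, 1, 4]` the MLE is 2.0, and `poisson_loglik` is maximized at that value.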
If at least one break point exists in the interval $I$, we conclude that the interval is non-homogeneous (Klochkov et al., 2019). The testing hypotheses are therefore:

$$H_0(I): (Y_t)_{t \in I} \sim \mathrm{Poisson}(\theta^*_I), \ \theta^*_I \in \Theta, \quad \text{vs.} \quad H_1(I): (Y_t)_{t \in A_\tau} \sim \mathrm{Poisson}(\theta^*_{A_\tau}), \ (Y_t)_{t \in B_\tau} \sim \mathrm{Poisson}(\theta^*_{B_\tau}), \ \theta^*_{A_\tau}, \theta^*_{B_\tau} \in \Theta, \ \text{with some } \theta^*_{A_\tau} \neq \theta^*_{B_\tau} \qquad (3)$$

The LR test statistic for a breakpoint $\tau$ is:

$$T_{I,\tau} = L_{A_\tau}(\tilde\theta_{A_\tau}) + L_{B_\tau}(\tilde\theta_{B_\tau}) - L_I(\tilde\theta_I) \qquad (4)$$

Since one has many candidates $\tau \in J$, one arrives at:

$$T_I = \max_{\tau \in J} T_{I,\tau} \qquad (5)$$

This unfortunately has a very intractable distribution; hence the critical value $z_I(\alpha)$, such that the test rejects $H_0$ in equation (3) when

$$T_I \geq z_I(\alpha) \qquad (6)$$

is hard to calculate. Indeed, the limiting distribution of $T_I$ differs from that of general likelihood ratio tests due to the presence of nuisance parameters (breakpoints) in the alternative hypothesis which are not identified under the null hypothesis. Hence, convergence of the generalized LR statistic to a $\chi^2$ distribution according to Wilks' phenomenon cannot be put into action. While the asymptotic distribution of the sup-LR test in equation (5) can still be derived (Andrews and Ploberger, 1994), a large enough sample size is required for its asymptotic critical values to be applicable. Certainly, that is not the case in most practical situations, where only small samples of data are available. Spokoiny and Zhilova (2015) provide a non-asymptotic result for mis-specified models with small sample sizes. The technique is called the multiplier bootstrap (MBS), which is discussed in detail in the next section.

Since the asymptotic distribution of the LR test statistic is not available for small samples, we approximate the unknown log-likelihood distribution using the bootstrap.
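The statistic whose finite-sample distribution needs approximating is the sup-LR scan of (4)-(5), which itself is direct to compute. A sketch (our own helper names; Poisson log-likelihood as in equation (1)):

```python
from math import lgamma, log

def loglik(y, theta):
    # Poisson log-likelihood on an interval, cf. equation (1)
    s = sum(y)
    ll = s * log(theta) if s > 0 else 0.0   # convention 0 * log(0) = 0
    return ll - len(y) * theta - sum(lgamma(v + 1) for v in y)

def sup_lr(y):
    """Sup-LR statistic, equations (4)-(5): scan all interior break points
    tau, fit A and B separately, and compare against the joint fit."""
    mle = lambda z: sum(z) / len(z)
    full = loglik(y, mle(y))
    return max(loglik(y[:tau], mle(y[:tau]))
               + loglik(y[tau:], mle(y[tau:])) - full
               for tau in range(1, len(y)))
```

For a series with a clear level shift the statistic is large; for a flat series it is (numerically) zero.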
First, introduce random weights into the previously defined likelihood function:

$$L^\circ_I(\theta) = \sum_{t \in I} w_t \, l_t(\theta)$$

where the weights $w_t$ are i.i.d. with $\mathrm{E}(w_t) = 1$ and $\mathrm{Var}(w_t) = 1$. The bootstrap version of (1) is given by:

$$L^\circ_I(\theta) = \log\theta \sum_{t \in I} Y_t w_t - \sum_{t \in I} w_t \theta - \sum_{t \in I} \log(Y_t!)\, w_t \qquad (7)$$

The bootstrap MLE is then defined as $\tilde\theta^\circ_I = \operatorname*{argmax}_\theta L^\circ_I(\theta)$, which is:

$$\tilde\theta^\circ_I = \frac{\sum_{t \in I} Y_t w_t}{\sum_{t \in I} w_t}.$$

It follows that the corresponding bootstrap version of (10) is:

$$T^\circ_{I,\tau} = L^\circ_{A_\tau}(\tilde\theta^\circ_{A_\tau}) + L^\circ_{B_\tau}(\tilde\theta^\circ_{B_\tau}) - \sup_\theta \left\{ L^\circ_{A_\tau}(\theta) + L^\circ_{B_\tau}(\theta + \tilde\theta_{B_\tau} - \tilde\theta_{A_\tau}) \right\} \qquad (8)$$

A penalty term $\tilde\theta_{B_\tau} - \tilde\theta_{A_\tau}$ is introduced to compensate for model misspecification bias. Klochkov et al. (2019) show that the distribution of this test statistic conditional on the data mimics the "true" distribution of $T_I$ with high probability. Using (8) we can obtain the critical value through simulations. Indeed, the critical value $z^\circ_I(\alpha)$ is defined as:

$$z^\circ_I(\alpha) = z^\circ_I(\alpha; Y) = \inf\left\{ z \geq 0 : \mathrm{P}^\circ\left( T^\circ_I > z \right) \leq \alpha \right\}. \qquad (9)$$

The algorithm for an adaptive window length selection at each point in time is now straightforward. It is based on sequential testing of the hypotheses on a nested set of intervals $\{I_k\}_{k=0,1,\ldots,K}$, where $I_0 \subset I_1 \subset \ldots \subset I_K$. Let $n_k = |I_k|$ be the number of observations in each interval. The first interval $I_0$ is assumed to be homogeneous with length $n_0$. Then, for each interval $I_k$, the null hypothesis of parameter homogeneity is tested against the alternative of a change point at an unknown location $\tau$ within $I_k$, as in (3).

Since the setup deals with nested intervals, but the existence and location of a change point are unknown, only the additional points in each new interval are considered as possible change points. The candidate set for change points in each interval is defined as $J_k = I_k \setminus I_{k-1}$.
Using each point $\tau \in J_k$, the left and right intervals are constructed as $A_{k,\tau} = [i - n_{k+1}, \tau]$ and $B_{k,\tau} = (\tau, i]$ respectively (see figure (2)). The test statistic is calculated similarly to equation (5) as

$$T_{I_k,\tau} = L_{A_{k,\tau}}(\tilde\theta_{A_{k,\tau}}) + L_{B_{k,\tau}}(\tilde\theta_{B_{k,\tau}}) - L_{I_{k+1}}(\tilde\theta_{I_{k+1}}) \qquad (10)$$

where $A_{k,\tau}$ and $B_{k,\tau}$ are as previously specified and we test at every point $\tau \in J_k$ for a change point. The $k$th interval is rejected if

$$\max_{\tau \in J_k} T_{I_k,\tau} \geq z^\circ_{I_k}(\alpha) \qquad (11)$$

where $z^\circ_{I_k}$ is generated via the multiplier bootstrap as explained in the previous section. If the interval $I_k$ is not rejected, i.e. there exists no change point and it is homogeneous, we continue the testing procedure by choosing a bigger interval. Otherwise, the length of the last non-rejected interval $\hat I$ is the interval of homogeneity and $\hat\theta_i = \hat\theta_{\hat I}$ is the respective adaptive estimate on $\hat I$.

Homogeneity testing for $I_k$ also utilizes part of the observations of $I_{k+1}$. Hence, the pre-definition of intervals is crucial. Following that, the choice of interval lengths affects the test results, and therefore requires careful selection. We employ a geometric increase of intervals as in Härdle et al. (2015) and Klochkov et al. (2019). Based on the initial length $n_0$, the interval lengths are defined by

$$n_k = \left\lceil n_0\, c^k \right\rceil \qquad (12)$$

where $c$ is a geometric multiplier, chosen slightly above 1 to ensure a monotonic increase of interval lengths, but not by a big margin. Furthermore, instead of taking a constant number of intervals $K$ for testing as proposed by Härdle et al. (2014) and Klochkov et al. (2019), we select $K$ to be the smallest integer such that the whole time series is covered under a geometrically increasing length.

In summary, the LPA algorithm for the adaptive choice of an interval of homogeneity and the corresponding MLE is given by the following iterative procedure:
Figure 2: Iterative algorithm.

1. Initialization: Select $I_0$, $I_1$, $I_2$, and define $J_1 = I_1 \setminus I_0$; for all $\tau \in J_1$, $A_{1,\tau} = [i - n_2, \tau]$, $B_{1,\tau} = (\tau, i]$.

2. Iteration: For each iteration, select $I_{k-1}$, $I_k$, $I_{k+1}$, and $J_k = I_k \setminus I_{k-1}$; for all $\tau \in J_k$, $A_{k,\tau} = [i - n_{k+1}, \tau]$, $B_{k,\tau} = (\tau, i]$.

3. Testing homogeneity: Calculate the test statistic in equation (10) and select the critical value with the multiplier bootstrap. Test the hypothesis in equation (3) using equation (11).
4. Loop: If $I_k$ is accepted, take the next interval $I_{k+1}$. Otherwise set $\hat I$ to the latest non-rejected $I_k$.

5. Adaptive estimator: Take the interval $\hat I$ as the interval of homogeneity and $\hat\theta_i = \hat\theta_{\hat I}$ as the adaptive estimate on $\hat I$. Repeat the procedure for each point in time (different $i$).

The following lines seek to answer the question whether or not we can generate better forecasts by using the described adaptive methodology, and whether or not the methodology is robust with respect to previously unseen and thus unpredictable patterns. We compare one-step-ahead and multi-period point parameter forecasts of the proposed locally adaptive procedure to a baseline of one year and three year moving averages in a pseudo-out-of-sample approach. We acknowledge that this is a fairly simple approach. A more sophisticated evaluation would be to generate multi-period density forecasts that could be evaluated using a tailored loss function as in Diebold et al. (1998); Diebold (2015), and Gonzalez-Rivera and Sun (2014). However, as this is beyond the scope of this paper, we leave it open for future work.
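The five-step procedure can be sketched end to end for the latest time point. A minimal Python implementation (our reconstruction: the function names, the exponential multiplier weights with mean and variance 1, the closed-form inner maximization in (8), and the small numerical floor are our choices, not prescribed by the text):

```python
import math
import random

def loglik(y, theta):
    # Poisson log-likelihood, equation (1); tiny floor avoids log(0)
    theta = max(theta, 1e-12)
    s = sum(y)
    return (s * math.log(theta) if s > 0 else 0.0) \
        - len(y) * theta - sum(math.lgamma(v + 1) for v in y)

def wloglik(y, w, theta):
    # multiplier-bootstrap log-likelihood, equation (7)
    theta = max(theta, 1e-12)
    swy = sum(wi * yi for wi, yi in zip(w, y))
    return swy * math.log(theta) - theta * sum(w) \
        - sum(wi * math.lgamma(yi + 1) for wi, yi in zip(w, y))

def boot_stat(yA, wA, yB, wB, delta):
    # bootstrap statistic, equation (8); delta is the data-based penalty
    a1 = sum(w * v for w, v in zip(wA, yA)); b1 = sum(wA)
    a2 = sum(w * v for w, v in zip(wB, yB)); b2 = sum(wB)
    # sup_theta {L°_A(theta) + L°_B(theta + delta)}: the first-order
    # condition is a quadratic in theta (sketch; assumes a1, a2 > 0)
    A, Bq = b1 + b2, (b1 + b2) * delta - a1 - a2
    disc = max(Bq * Bq + 4 * (b1 + b2) * a1 * delta, 0.0)
    th = (-Bq + math.sqrt(disc)) / (2 * A)
    sup = wloglik(yA, wA, th) + wloglik(yB, wB, th + delta)
    return wloglik(yA, wA, a1 / b1) + wloglik(yB, wB, a2 / b2) - sup

def adaptive_window(y, n0=5, c=1.35, alpha=0.05, n_boot=100, seed=0):
    """Steps 1-5 for the last point i = len(y): returns the length of the
    interval of homogeneity and the adaptive MLE on it."""
    rng, i = random.Random(seed), len(y)
    lengths = [n0]                              # n_k = ceil(n0 c^k), eq. (12)
    while lengths[-1] < i:
        lengths.append(min(i, math.ceil(n0 * c ** len(lengths))))
    accepted = lengths[0]                       # I_0 assumed homogeneous
    for k in range(1, len(lengths) - 1):
        nk1 = lengths[k + 1]
        seg = y[i - nk1:]                       # I_{k+1}
        taus = range(i - lengths[k], i - lengths[k - 1])  # J_k = I_k \ I_{k-1}
        def stat(tau, w=None):
            cut = tau + 1 - (i - nk1)           # A = [i - n_{k+1}, tau]
            yA, yB = seg[:cut], seg[cut:]       # B = (tau, i]
            if w is None:                       # data statistic, eq. (10)
                return (loglik(yA, sum(yA) / len(yA))
                        + loglik(yB, sum(yB) / len(yB))
                        - loglik(seg, sum(seg) / len(seg)))
            delta = sum(yB) / len(yB) - sum(yA) / len(yA)
            return boot_stat(yA, w[:cut], yB, w[cut:], delta)
        T = max(stat(t) for t in taus)
        boot = []
        for _ in range(n_boot):                 # critical value, eq. (9)
            w = [rng.expovariate(1.0) for _ in seg]  # i.i.d., E(w)=Var(w)=1
            boot.append(max(stat(t, w) for t in taus))
        boot.sort()
        if T >= boot[min(n_boot - 1, int((1 - alpha) * n_boot))]:
            break                               # change point: I_k rejected
        accepted = lengths[k]
    return accepted, sum(y[-accepted:]) / accepted
```

On a series that is flat over its recent past the procedure keeps extending the window; a level shift a few periods back caps the selected window at the homogeneous tail.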
This section evaluates the performance of the proposed technique using simulated datasets as shown in figure 3. We create scenarios that mimic common patterns in financial datasets. The simulations focus both on short-term shocks and regime shifts. Starting with the simplest piece-wise constant model, we gradually increase the complexity of the simulations by generating Poisson distributed data, and finally test the robustness of the methodology by changing the underlying model to follow an exponential distribution. For each scenario, we consider a time series $(Y_t)$ with the following specifications:

(a) Regime shifts with piece-wise constant model: $Y_t = 1$ from $t = 1$, $Y_t = 10$ from $t = 101$, and $Y_t = 20$ from $t = 201$

(b) Regime shifts with Poisson model: $\theta = 1$ from $t = 1$, $\theta = 10$ from $t = 101$, and $\theta = 20$ from $t = 201$

(c) Short term shock with piece-wise constant model: $Y_t = 1$ from $t = 1$, $Y_t = 10$ at $t = 200$, and $Y_t = 1$ from $t = 201$

(d) Structural break with piece-wise constant model: $Y_t = 10$ from $t = 1$, $Y_t = 7$ from $t = 181$, and $Y_t = 10$ from $t = 201$

(e) Structural break with Poisson model: $\theta = 5$ from $t = 1$, $\theta = 1$ from $t = 181$, and $\theta = 5$ from $t = 201$

(f) Regime shifts with exponential model: $\theta = 0.1$ from $t = 1$, $\theta = 1$ from $t = 101$, and $\theta = 10$ from $t = 201$

Figure 3: Simulated series (left), homogeneous windows (middle) and MLE (right) for scenarios (a)-(f).

The simulated data of scenarios (a)-(f) are shown in the time series plots on the left hand side of figure (3). Since the algorithm requires pre-selection of the intervals, we fix $c$ in equation (12) at 1.35. This seemingly arbitrary choice ensures that we neglect only few unknown homogeneous intervals (if any), while gaining computational efficiency. Assuming a minimal homogeneous window ($n_0$) of 5 months, we compute $K$ as described in section (2.5). The candidate homogeneous windows for the latest period in the time series are then the geometrically increasing lengths up to $K$ that the algorithm tests for. If $K$ were to be increased, for example arithmetically, we would expect the algorithm to generate a straight downward sloping line instead. Due to computational limitations, we illustrate this only on scenario (a) in figure (4) and avoid such an experiment for other scenarios.

Figure 4: Simulated series (left), homogeneous windows (middle) and MLE (right) using arithmetically increasing intervals in LPA, scenario (a).

Next, a more realistic scenario with values generated from a Poisson process in (b) shows that the procedure is robust to noise even when the sample size is small. On the other hand, when $n$ is small, the problem of multiple testing causes many false negatives. This becomes evident in the first third of (b). This can easily be dealt with by changing the number of tested intervals in the algorithm. Regardless of this issue, the MLE remains close to the true parameter value.

Further, some temporary shocks and small structural breaks are mimicked in simulations (c)-(e). (c) shows that a large shock is detected accurately, with an alarm to select a smaller window size when the time is close to the change in mean.
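The six scenarios can be generated in a few lines. A sketch of our reading of the specifications (the series length T = 300 and the interpretation of $\theta$ as the mean in scenario (f) are our assumptions):

```python
import math
import random

def poisson(rng, lam):
    # Knuth's Poisson sampler (the stdlib `random` module has none)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate(scenario, T=300, seed=42):
    """One path for scenarios (a)-(f); the break points at t = 101, 181,
    200 and 201 follow the specification above."""
    rng, y = random.Random(seed), []
    for t in range(1, T + 1):
        if scenario == "a":                     # piece-wise constant regimes
            y.append(1 if t <= 100 else 10 if t <= 200 else 20)
        elif scenario == "b":                   # Poisson regime shifts
            y.append(poisson(rng, 1 if t <= 100 else 10 if t <= 200 else 20))
        elif scenario == "c":                   # short-term shock at t = 200
            y.append(10 if t == 200 else 1)
        elif scenario == "d":                   # temporary level change
            y.append(7 if 181 <= t <= 200 else 10)
        elif scenario == "e":                   # Poisson structural break
            y.append(poisson(rng, 1 if 181 <= t <= 200 else 5))
        else:                                   # (f) exponential regimes,
            theta = 0.1 if t <= 100 else 1 if t <= 200 else 10
            y.append(rng.expovariate(1.0 / theta))  # theta read as the mean
    return y
```

Scenario (f) deliberately produces non-integer values, which is exactly the misspecification the robustness check targets.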
The simulation in (d) shows that temporary changes in mean are not always recognised by the algorithm, but the MLE after such short-term shocks is still affected. The procedure also detects structural breaks within a Poisson framework accurately, as depicted in (e), but the estimated time windows are inaccurate shortly after the break. With increasing $n$, the accuracy is restored and the approximated MLE remains accurate.

Finally, to check the robustness of the algorithm, values are generated from an exponential distribution. The simulation in (f) shows that the algorithm detects breakpoints correctly even with exponentially distributed data. Accordingly, the assumption of Poisson distributed values can easily be relaxed towards models from any exponential family. This robustness allows the algorithm to be used even with misspecified models.

For each simulation scenario, we also compare the results with fixed window estimates of 12 months (high variance) and 36 months (high bias) and show the results in figure (5). All subplots show that LPA finds a balance between variance and bias by selecting a smaller or bigger window size when necessary. Plot (a) shows that while the estimates of LPA fluctuate (as described above), the shift in regime is detected earlier by LPA than by the fixed window estimates. Plots (b) and (e) show that for the simulated intervals with constant mean, fixed window estimates indicate a highly fluctuating mean, while LPA recognizes intervals of time homogeneity correctly. Moreover, the impact of the short term shock disappears faster in the LPA estimates in (c) and (d). However, LPA underestimates the magnitude of shocks, as visible in (d).

In conclusion, the algorithm accurately detects small shocks, regime shifts and temporary mean changes in time series without strict assumptions on the underlying model.
The results could be improved further by correct selection of the interval lengths and the number of intervals being tested, as these have a significant impact on the precision and accuracy of the results.

Figure 5: Time series of simulated data and one step ahead prediction with estimates from LPA, 1 year fixed window and 3 year fixed window, scenarios (a)-(f).
In this section, we apply LPA to a working example of time series that are relevant to businesses in the financial industry. We use a data set of mergers & acquisitions per month that we acquired from the database Refinitiv Eikon Deals (restricted access through subscription). We consider different industries according to the Thomson Reuters Business Classification scheme in Germany. The dataset consists of a total of 9,969 observations in ten industries in the US and Germany. We put the procedure into action on the following three industries in Germany:

1. Financials
2. Telecommunication
3. Energy

Due to seemingly inconsistent recordings of transactions, we considered only 424 observations from 01-1984 to 04-2020 in Germany for each industry. We do not make any assumptions regarding correlations between these industries or between countries.

Following previous literature, we consider different industries separately. Without insider knowledge, the frequency of M&As could be assumed to be random and i.i.d. over certain time windows, which is based on the intuition that we usually learn about mergers only after they are announced. The i.i.d. assumption states in this context that the observed industries are of a specific, somewhat stable structure in the short run, but can be interrupted by exogenous factors that change industry dynamics. Given these characteristics, we model the number of M&A with a Poisson process and apply LPA to forecast the next period given the MLE of the last observed homogeneous time window. We do not preprocess the data and let the procedure account for structural breaks and non-stationarity.

Similar to the simulation study in the previous section, we fix $n_0$ at 5 months and $c$ at 1.35, compute $K$ as in section (2.5), and run 100 simulations to smoothen the estimated window sizes. We keep $n_0$ and $c$ for the interval selection through equation (12) constant, as an aim of the proposed algorithm is to evaluate unseen data even if no knowledge about it is available. Some example cases are
Some example cases are21llustrated to verify the robustness of the algorithm with respect to differenttypes of time series in figure (6). The plot on the right hand side shows thehomogeneous windows for each point in time while the corresponding LPA es-timate, 12 month estimate and 36 month estimate as one period ahead forecastare shown in the left plot. (1)(2)(3) Figure 6: Time series of original data and one step ahead prediction withestimates from LPA, 1 year fixed window and 3 year fixed window (left)Homogeneous windows (right) for (1) Financials, (2) Telecommunication and(3) Energy. LPA Empiricalstudy22he first plot in figure (6) shows the German financial industry, where anupward trend in the number of mergers can be observed over time. The plotshows that LPA estimates closely mimic this trend, and is much more respon-sive to shocks as in the mid-90s. Moreover, contrary to the green 1-year movingaverage curve, which shows high variance in the MLE estimate, LPA recognizesthe regions within time series where the average number of M&A is approxi-mately constant, such as from 2010 to 2020. The LPA estimates on the leftsuggest that the selected window size differs over time (sometimes up to 200,sometimes only a few observations) and using a fixed time window for generat-ing forecasts is not recommended according to the technique. Comparing thewindow plot of the same industry with the simulation plots (a) and (b) fromthe previous section, we identify six regimes in the German financial industry(1984-1988, 1988-1990, 1990-1994, 1994-1999, 1999-2006 and 2006-2020). 
These regimes could be associated with external events, such as a merger wave in 1984, globalization in the late 1980s, the German bank merger wave in the 1990s, German market liberalization in the mid-90s, and the technological innovations of the late 20th century.

The second part of figure (6) shows the German telecommunication industry, where the number of M&As per month seems stable and stationary, except for a short merger wave in the mid-90s due to the liberalization of the market and another short wave around 2001 (possibly related to the dotcom bubble). LPA suggests a stable time series, as the algorithm recommends selecting large windows, and sometimes the whole time series. Only during the interval where a shock can be seen on the left does LPA's choice of homogeneous window on the right restrict the time window to a few years (for example between 1994 and 2001). We see that LPA can handle time-varying parameters while facing a trade-off between parameter variability and modelling bias. A similar conclusion can be drawn from the energy market plots in (3) of figure (6).

Next, we forecast the number of M&As for 02-2020 to 04-2020 using the LPA estimate and the fixed 1-year and 3-year window estimates of 01-2020. The estimates and mean squared errors (MSE) of estimation are summarized in table (1). Moreover, apart from the selection of time-varying window lengths, we propose that LPA can also provide guidance on the a priori selection of fixed windows. Looking at the distribution plot of window lengths (as a proportion of data points in the time series before the time of evaluation) in figure (7) and selecting the window with the highest frequency ($w$) as the fixed time window for each time series, we also make a forecast and report the MSE in the same table.
EnergyLPA estimate 52.99 3.43 4.2312 month MA estimate 55.58 2.42 3.0836 month MA estimate 55.50 2.94 4.17Most recurring window(w) 160 184 309w month MA estimate 55.25 2.90 4.62LPA MSE 1008.24 3.11 8.6112 month MSE 1159.29 7.34 3.2936 month MSE 1154.25 4.89 8.25w month MSE 1139.23 5.06 11.03Table 1: Forecast results for 02-2020 to 04-2020 based on the adaptively se-lected MLE, fixed windows and most recurring window proportionally ( w ) inthe distribution plots in figure (7) (1) (2) (3) Figure 7: Distribution of estimated interval length as a proportion of datapoints for (1) Financials, (2) Telecommunication and (3) Energy.LPA Empiricalstudy24he table shows that LPA produces the smallest MSE in the financials andtelecommunication industry. However, in the energy sector the 1 year fixedwindow outperforms LPA, perhaps due to its high fluctuations. Similarly, thehomogeneous time window recommended by LPA for the financial industry wasroughly 160 months. Choosing this time window resulted in a lower MSE thanusing 12 and 36 months fixed window estimates. For the energy sector, however,LPA recommends the selection of a very long window, for which the estimatedeviates significantly from the true value, but still captures the stationary as-pect of the time series. As a whole, the results show that wave-like patternsand level shifts are accurately detected, and that a plausible fit is achieved un-der the assumption that M&A follow a Poisson distribution with time-varyingparameter.The analysis shows that the proposed technique can be applied to real-worldfinancial data and gives additional insights based merely on an automatic proce-dure. It can be used as a baseline for forecasting or simulation approaches. TheMLE series could be forecasted using e.g. autoregressive models that generatedifferent scenarios. Business experts could then make a judgement about thelikeliness of these scenarios. 
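The comparison in table (1) reduces to a mean squared error over the three hold-out months, and the fixed window w is the mode of the adaptively selected window lengths; both steps can be sketched as follows (the numbers are illustrative, not the paper's data):

```python
from collections import Counter

def mse(forecasts, actuals):
    """Mean squared error between point forecasts and realized counts."""
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)

def most_recurring_window(selected_windows):
    """Mode of the adaptively selected window lengths, used as an
    a priori fixed window for future forecasts."""
    return Counter(selected_windows).most_common(1)[0][0]

# Illustrative numbers only:
actuals = [3, 5, 4]              # realized counts, e.g. over three hold-out months
lpa_forecast = [4.2, 4.2, 4.2]   # constant intensity estimate carried forward
print(mse(lpa_forecast, actuals))
print(most_recurring_window([12, 160, 160, 36, 160]))  # -> 160
```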
Moreover, the availability of a time-varying parameter allows analysts to use more sophisticated approaches, e.g. forecasting densities of various time series that are assumed to follow a Poisson distribution, such as the examples mentioned in the introduction (call centers, sales, supply chains). Businesses are often not interested in exact values, but in having an overview of the range of expected figures, and forecasts of low quality can lead to costly miscalculations. The proposed procedure could help increase forecast quality and availability and thus generate business value if employed.
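As a sketch of the density-forecast idea: once an intensity estimate is available, a range of expected counts follows directly from the Poisson probability mass function. The interval construction below (starting at zero and extending upward until the desired coverage is reached) is a deliberately simple stand-in for the density-forecast methods cited above:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson(lam) variable."""
    return lam ** k * exp(-lam) / factorial(k)

def prediction_interval(lam, coverage=0.9):
    """Smallest interval [0, k] of Poisson outcomes whose total probability
    reaches the desired coverage -- a simple way to turn a point intensity
    estimate into a range of expected figures."""
    cum, k = 0.0, 0
    while cum < coverage:
        cum += poisson_pmf(k, lam)
        k += 1
    return 0, k - 1

# With an estimated intensity of 4.23 M&As per month (the energy-sector LPA
# estimate in table (1)), at least 90% of months should see 0 to 7 events.
print(prediction_interval(4.23, 0.9))  # -> (0, 7)
```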
Conclusion
A new algorithm for automatically finding homogeneous time intervals in a nonstationary context is proposed. Trends, cyclical components, and even black swan events can be analysed with this technique. The algorithm is based on a fruitful combination of a local parametric model (here a Poisson process) and MBS. We conduct a simulation study that indicates that the procedure is indeed robust, even if the input data is not Poisson distributed. These results are then used to evaluate an industry-based data set of M&A in Germany. The results show that in this non-parametric, model-free context we obtain interpretable structural breaks and changes in trend. In conclusion, we provide an easy-to-use solution for analyzing large data collections automatically that addresses the most common problems in time series modeling and can be adapted to diverse applications, including call center arrivals, sales forecasting or supply chain decisions.
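In outline, the interval selection can be sketched as follows. This is a simplified stand-in, not the paper's implementation: in particular, `crit` is a fixed constant here, whereas the actual procedure derives its critical values from the multiplier bootstrap (MBS):

```python
from math import log

def pois_loglik(counts):
    """Poisson log-likelihood at the MLE (constant terms dropped)."""
    n, s = len(counts), sum(counts)
    lam = s / n
    return (s * log(lam) if s > 0 else 0.0) - n * lam

def adaptive_window(counts, t, candidates, crit):
    """Largest trailing window at time t over which a constant Poisson
    intensity is accepted. Each candidate window is compared with the
    previously accepted one via a likelihood-ratio statistic; the first
    rejection stops the search. `crit` stands in for the bootstrap
    critical values of the actual procedure (an assumption here)."""
    accepted = candidates[0]
    for k in candidates[1:]:
        if k > t:
            break
        recent = counts[t - accepted:t]      # previously accepted window
        older = counts[t - k:t - accepted]   # newly added observations
        pooled = counts[t - k:t]
        lr = pois_loglik(recent) + pois_loglik(older) - pois_loglik(pooled)
        if lr > crit:
            break                            # homogeneity rejected
        accepted = k
    return accepted

# A level shift after 60 twos: the search stops before the window would
# reach back across the break, while a homogeneous series yields the
# whole sample.
print(adaptive_window([2] * 60 + [8] * 24, 84, [6, 12, 24, 48, 84], 3.0))  # -> 24
print(adaptive_window([5] * 84, 84, [6, 12, 24, 48, 84], 3.0))             # -> 84
```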
This work is subject to several limitations that are caused either by computational cost or by additional complexity that exceeds the scope of this work. As the proposed algorithm is computationally expensive, we evaluated only three German industries and admit that this choice is, to an extent, arbitrary. However, data for all industries in the US and Germany is available and could be used. The simulations showed that the proposed method had difficulties with detecting small changes. We also saw that the method is not in all cases better than a simple approach using moving averages. An evaluation based on all 20 datasets could give further insights on the robustness of the method. Furthermore, the evaluation using a pseudo-out-of-sample point forecast is fairly limited, and we justify the superiority of our approach based on the simple measure of a mean squared error. However, as Diebold (2015) outlines, such comparisons provide no insurance against over-fitting, are costly as they discard data, and it is questionable whether they provide any benefit. Another questionable point is whether the occurrence of M&A per month strictly follows a time-varying Poisson process, as assumed in this paper. Answering this question using a sophisticated evaluation approach, e.g. by generating (multi-period) density forecasts, would give more insights and could be an interesting area for future research. Future research could also evaluate the fit to a Poisson distribution with time-varying MLE, but with respect to different values of n and c. Alternatively, other methods for increasing the tested homogeneous interval could be compared to the geometric approach, e.g. random increases that follow a Poisson process or arithmetic increases.
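The alternatives just mentioned differ only in how the candidate interval lengths are generated; a geometric grid versus an arithmetic one can be sketched as follows (parameter values are illustrative):

```python
def geometric_grid(start, factor, max_len):
    """Candidate window lengths that grow multiplicatively: few
    candidates, coarse at long horizons, hence cheaper to test."""
    grid, k = [], start
    while k <= max_len:
        grid.append(round(k))
        k *= factor
    return grid

def arithmetic_grid(start, step, max_len):
    """Candidate window lengths that grow additively: a denser grid,
    hence more homogeneity tests and higher computational cost."""
    return list(range(start, max_len + 1, step))

print(geometric_grid(6, 2, 96))    # -> [6, 12, 24, 48, 96]
print(arithmetic_grid(6, 6, 96))   # 16 evenly spaced candidates from 6 to 96
```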
Furthermore, the simulation study indicated that the procedure could be generalized to work with any data that is assumed to follow a distribution of the exponential family. Finally, a possible extension could include a more general version of the current test statistic that looks into both directions.

References
K. R. Ahern and J. Harford. The Importance of Industry Links in Merger Waves. The Journal of Finance, 69(2):527–576, 2014.

O. Akkus, J. A. Cookson, and A. Hortaçsu. The Determinants of Bank Mergers: A Revealed Preference Analysis. Management Science, 62(8):2241–2258, Nov. 2015.

S. Aminikhanghahi and D. J. Cook. A Survey of Methods for Time Series Change Point Detection. Knowledge and Information Systems, 51(2):339–367, May 2017.

D. W. K. Andrews and W. Ploberger. Optimal Tests when a Nuisance Parameter is Present Only Under the Alternative. Econometrica, 62(6):1383–1414, 1994.

S. Arlot, G. Blanchard, and E. Roquain. Some nonasymptotic results on resampling in high dimension, I: Confidence regions. Annals of Statistics, 38(1):51–82, Feb. 2010.

R. Beran. Discussion: Jackknife, Bootstrap and Other Resampling Methods in Regression Analysis. Annals of Statistics, 14(4):1295–1298, Dec. 1986.

A. Bücher and H. Dette. Multiplier bootstrap of tail copulas with applications. Bernoulli, 19(5A):1655–1687, Nov. 2013.

S. Chatterjee and A. Bose. Generalized bootstrap for estimating equations. Annals of Statistics, 33(1):414–436, Feb. 2005.

J. Chen and A. K. Gupta. Change point analysis of a Gaussian model. Statistical Papers, 40(3):323–333, Sept. 1999.

J. Chen and A. K. Gupta. Parametric Statistical Change Point Analysis: With Applications to Genetics, Medicine, and Finance. Springer Science & Business Media, Nov. 2011.

V. Chernozhukov, D. Chetverikov, and K. Kato. Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Annals of Statistics, 41(6):2786–2819, Dec. 2013.

F. X. Diebold. Comparing Predictive Accuracy, Twenty Years Later: A Personal Perspective on the Use and Abuse of Diebold–Mariano Tests. Journal of Business & Economic Statistics, 33(1):1–1, Jan. 2015.

F. X. Diebold, T. A. Gunther, and A. S. Tay. Evaluating Density Forecasts with Applications to Financial Risk Management. International Economic Review.

Journal of Econometrics, 177(2):153–170, Dec. 2013.

G. Gonzalez-Rivera and Y. Sun. Density Forecast Evaluation in Unstable Environments. Working Paper 201428, University of California at Riverside, Department of Economics, Aug. 2014.

P. Haccou, E. Meelis, and S. van de Geer. The likelihood ratio test for the change point problem for exponentially distributed random variables. Stochastic Processes and their Applications, 27:121–139, Jan. 1987.

W. K. Härdle and E. Mammen. Comparing Nonparametric Versus Parametric Regression Fits. Annals of Statistics, 21(4):1926–1947, Dec. 1993.

W. K. Härdle, A. Mihoci, and C. Hian-Ann Ting. Adaptive Order Flow Forecasting with Multiplicative Error Models. SSRN Scholarly Paper ID 2892620, Social Science Research Network, Rochester, NY, July 2014.

W. K. Härdle, N. Hautsch, and A. Mihoci. Local Adaptive Multiplicative Error Models for High-Frequency Forecasts. Journal of Applied Econometrics, 30(4):529–550, 2015.

J. Harford. What drives merger waves? Journal of Financial Economics, 77(3):529–560, Sept. 2005.

D. V. Hinkley and E. A. Hinkley. Inference About the Change-Point in a Sequence of Binomial Variables. Biometrika, 57(3):477–488, 1970.

D. A. Hsu. Detecting Shifts of Parameter in Gamma Sequences with Applications to Stock Price and Air Traffic Flow Analysis. Journal of the American Statistical Association.

Journal of Econometrics, 196(1):55–67, Jan. 2017.

Y. Klochkov, W. K. Härdle, and X. Xu. Localizing Multivariate CAViaR. IRTG 1792 Discussion Paper, 2019.

C. Kolsarici and D. Vakratsas. Correcting for Misspecification in Parameter Dynamics to Improve Forecast Accuracy with Adaptively Estimated Models. Management Science, 61(10):2495–2513, Jan. 2015.

Y. A. Kutoyants and V. Spokoiny. Optimal choice of observation window for Poisson observations. Statistics & Probability Letters, 44(3):291–298, Sept. 1999.

V. Maksimovic, G. Phillips, and L. Yang. Private and Public Merger Waves. The Journal of Finance, 68(5):2177–2217, 2013.

E. Mammen. Bootstrap and Wild Bootstrap for High Dimensional Linear Models. Annals of Statistics, 21(1):255–285, Mar. 1993.

M. Martynova and L. Renneboog. A Century of Corporate Takeovers: What Have We Learned and Where Do We Stand? SSRN Scholarly Paper ID 820984, Social Science Research Network, Rochester, NY, Oct. 2005.

D. Mercurio and V. Spokoiny. Statistical inference for time-inhomogeneous volatility models. Annals of Statistics, 32(2):577–602, Apr. 2004.

B. N. Oreshkin, N. Régnard, and P. L'Ecuyer. Rate-Based Daily Arrival Process Models with Application to Call Centers. Operations Research, 64(2):510–527, Mar. 2016.

M. H. Pesaran, A. Pick, and M. Pranovich. Optimal forecasts in the presence of structural breaks. Journal of Econometrics, 177(2):134–152, Dec. 2013.

V. Spokoiny. Multiscale local change point detection with applications to value-at-risk. The Annals of Statistics, 37(3):1405–1436, June 2009.

V. Spokoiny and M. Zhilova. Bootstrap confidence sets under model misspecification. Annals of Statistics, 43(6):2653–2675, Dec. 2015.

V. G. Spokoiny. Estimation of a function with discontinuities via local polynomial fit with an adaptive window choice. The Annals of Statistics, 26(4):1356–1378, Aug. 1998.

J. W. Taylor. A Comparison of Univariate Time Series Methods for Forecasting Intraday Arrivals at a Call Center. Management Science, 54(2):253–265, Dec. 2007.

J. W. Taylor. Density Forecasting of Intraday Call Center Arrivals Using Models Based on Exponential Smoothing. Management Science, 58(3):534–549, Oct. 2011.

P. Very, E. Metais, S. Lo, and P.-G. Hourquet. Can We Predict M&A Activity? In S. Finkelstein and C. L. Cooper, editors, Advances in Mergers and Acquisitions, volume 11, pages 1–32. Emerald Group Publishing Limited, Jan. 2012.

C. F. J. Wu. Jackknife, Bootstrap and Other Resampling Methods in Regression Analysis. Annals of Statistics, 14(4):1261–1295, Dec. 1986.

P. M. Yelland, S. Kim, and R. Stratulate. A Bayesian Model for Sales Forecasting at Sun Microsystems.