[PDF] Inferring serial correlation with dynamic backgrounds

Abstract

Sequential data with serial correlation and an unknown, unstructured, and dynamic background is ubiquitous in neuroscience, psychology, and econometrics. Inferring serial correlation for such data is a fundamental challenge in statistics. We propose a total variation constrained least square estimator coupled with hypothesis tests to infer the serial correlation in the presence of unknown and unstructured dynamic background. The total variation constraint on the dynamic background encourages a piece-wise constant structure, which can approximate a wide range of dynamic backgrounds. The tuning parameter is selected via the Ljung-Box test to control the bias-variance trade-off. We establish a non-asymptotic upper bound for the estimation error through variational inequalities. We also derive a lower error bound via Fano's method and show the proposed method is near-optimal. Numerical simulation and a real study in psychology demonstrate the excellent performance of our proposed method compared with the state-of-the-art.

Full PDF

IInferring serial correlation with dynamic backgrounds

Song Wei a Yao Xie a Dobromir Rahnev ba School of Industrial and Systems Engineering, b School of Psychology,Georgia Institute of Technology, Atlanta, Georgia,30332-0205, U.S.A.February 2, 2021

Abstract

Sequential data with serial correlation and an unknown, unstructured, and dynamicbackground is ubiquitous in neuroscience, psychology, and econometrics. Inferring serialcorrelation for such data is a fundamental challenge in statistics. We propose a totalvariation constrained least square estimator coupled with hypothesis tests to infer the serialcorrelation in the presence of unknown and unstructured dynamic background. The totalvariation constraint on the dynamic background encourages a piece-wise constant structure,which can approximate a wide range of dynamic backgrounds. The tuning parameteris selected via the Ljung-Box test to control the bias-variance trade-oﬀ. We establisha non-asymptotic upper bound for the estimation error through variational inequalities.We also derive a lower error bound via Fano’s method and show the proposed methodis near-optimal. Numerical simulation and a real study in psychology demonstrate theexcellent performance of our proposed method compared with the state-of-the-art.

Keywords:

Autoregressive time series; High-dimensional lasso; Non-stationarity; Total variationconstraint; Variational inequality.

Serial correlation and serial dependence have been central to time series analysis [Hong, 2010].Modern time-series data from neuroscience, psychology, and economics usually contain botha substantial serial dependence and a non-stationary drift [Akrami et al., 2018, Wexler et al.,2015, Moskowitz et al., 2012, Fischer and Whitney, 2014, Cicchini et al., 2018, McIlhagga,2008, Rahnev et al., 2015]. A well-known example comes from human reaction times, whichare thought to be autocorrelated but also drift throughout an experiment [Laming, 1968].The drift can be due to many factors such as becoming better on the task, increased tiredness,and attention or arousal ﬂuctuations. None of these inﬂuences take a speciﬁc parametric form.While some (e.g., learning or fatigue) are likely to be monotonic, others (e.g., ﬂuctuations inattention) can be expected to waver unpredictably. This non-stationary background drift isthus typically considered a nuisance variable.1 a r X i v : . [ m a t h . S T ] J a n t is typically of strong scientiﬁc interest to infer the presence and/or assess serial correla-tion’s strength with an unknown and unstructured dynamic background. The magnitude ofautocorrelation has direct implications for many scientiﬁc theories. For example, Fischer andWhitney [2014] proposed that the human brain creates a “perceptual continuity ﬁeld” wherethe subjective percept at one point of time directly inﬂuences the percept within a subsequent15-second window. Such eﬀects are known as “serial dependence” and are an active area ofresearch within psychology and neuroscience. Progress in this and related endeavors dependson one’s ability to estimate the magnitude of autocorrelation in certain time series, even inthe presence of substantial unstructured drift.The most popular tool to handle the serial correlation is the autoregressive time series.However, the presence of even a small drift can induce strong biases in the autoregressivecoeﬃcients. For example, unmodeled background drift can masquerade as autocorrelation,as illustrated in the ﬁrst panel in Figure 1. This issue has been pointed out before byDutilh et al. [2012], but no solution exists to date. Techniques have been developed fortracking the unknown dynamic background with minimum structural assumptions [Hodrickand Prescott, 1997, Kim et al., 2009, Harchaoui and Lévy-Leduc, 2010] but these approachesdo not estimate the serial correlation. Thus, we currently lack an eﬃcient method to capturethe autocorrelation strength in a time series in the presence of highly unstructured dynamicdrifts.

Fitted AR(1) modelFitted backgroundObservationTrue background = 0.11351, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.100.10.20.30.40.50.6 0 200 400 600 800 1000 1200 1400 1600 1800 2000-1-0.500.511.52 = 0.33307, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.100.10.20.30.40.50.6 0 200 400 600 800 1000 1200 1400 1600 1800 2000-1-0.500.511.52 = 0.059363, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.200.20.40.60.8 TT TT TT = 0 , ˆ ↵ = 0 . AAACGnicbVDLSgNBEJz1GeMr6tHLYBA8SNjNCnoRAl48RjAPyIbQOztJhsw+mOkVwpLv8OKvePGgiDfx4t84m+SgiQXDFFXddHf5iRQabfvbWlldW9/YLGwVt3d29/ZLB4dNHaeK8QaLZazaPmguRcQbKFDydqI4hL7kLX90k/utB660iKN7HCe8G8IgEn3BAI3UKzmeH8tAj0PzZV7AJQK9pvY59ag3BKQeyGQIPScXK67rOpNeqWxX7CnoMnHmpEzmqPdKn14QszTkETIJWnccO8FuBgoFk3xS9FLNE2AjGPCOoRGEXHez6WkTemqUgPZjZV6EdKr+7sgg1Pn2pjIEHOpFLxf/8zop9q+6mYiSFHnEZoP6qaQY0zwnGgjFGcqxIcCUMLtSNgQFDE2aRROCs3jyMmlWK45bqd5dlGvVeRwFckxOyBlxyCWpkVtSJw3CyCN5Jq/kzXqyXqx362NWumLNe47IH1hfPw3Enj8= = 6 . , ˆ ↵ = 0 . AAACHHicbVDJSgNBEO1xjXGLevTSGAQPEmaSuB2EgBePEcwCmRBqejpJk56F7hohDPkQL/6KFw+KePEg+Dd2loMmPmj68V4VVfW8WAqNtv1tLS2vrK6tZzaym1vbO7u5vf26jhLFeI1FMlJNDzSXIuQ1FCh5M1YcAk/yhje4GfuNB660iMJ7HMa8HUAvFF3BAI3UyZVcL5K+HgbmS12fSwR6Tc8L9il1qdsHpC7IuA8dx8h2wT67Ko86ubxhE9BF4sxInsxQ7eQ+XT9iScBDZBK0bjl2jO0UFAom+SjrJprHwAbQ4y1DQwi4bqeT40b02Cg+7UbKvBDpRP3dkUKgx/ubygCwr+e9sfif10qwe9lORRgnyEM2HdRNJMWIjpOivlCcoRwaAkwJsytlfVDA0OSZNSE48ycvknqx4JQKxbtyvlKcxZEhh+SInBCHXJAKuSVVUiOMPJJn8krerCfrxXq3PqalS9as54D8gfX1Axdcnr8= = 0 . , ˆ ↵ = 0 . AAACHXicbVDLSgMxFM3UV62vqks3wSK4kDLTB7oRCm5cVrAP6JRyJ5O2oZkHyR2hDP0RN/6KGxeKuHAj/o3pY6GtB0IO59xLco4XS6HRtr+tzNr6xuZWdju3s7u3f5A/PGrqKFGMN1gkI9X2QHMpQt5AgZK3Y8Uh8CRveaObqd964EqLKLzHccy7AQxC0RcM0Ei9fMX1IunrcWCu1PW5RKDX1C5WKxfUpe4QkLog4yH0nJnuOOXqpJcv2EV7BrpKnAUpkAXqvfyn60csCXiITILWHceOsZuCQsEkn+TcRPMY2AgGvGNoCAHX3XSWbkLPjOLTfqTMCZHO1N8bKQR6GsBMBoBDvexNxf+8ToL9q24qwjhBHrL5Q/1EUozotCrqC8UZyrEhwJQwf6VsCAoYmkJzpgRnOfIqaZaKTrlYuqsUaqVFHVlyQk7JOXHIJamRW1InDcLII3kmr+TNerJerHfrYz6asRY7x+QPrK8fi7We9A== Fitted AR(1) modelFitted backgroundObservationTrue background = 0.11351, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.100.10.20.30.40.50.6 0 200 400 600 800 1000 1200 1400 1600 1800 2000-1-0.500.511.52 = 0.33307, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.100.10.20.30.40.50.6 0 200 400 600 800 1000 1200 1400 1600 1800 2000-1-0.500.511.52 = 0.059363, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.200.20.40.60.8 TT TT TT = 0 , ˆ ↵ = 0 . AAACGnicbVDLSgNBEJz1GeMr6tHLYBA8SNjNCnoRAl48RjAPyIbQOztJhsw+mOkVwpLv8OKvePGgiDfx4t84m+SgiQXDFFXddHf5iRQabfvbWlldW9/YLGwVt3d29/ZLB4dNHaeK8QaLZazaPmguRcQbKFDydqI4hL7kLX90k/utB660iKN7HCe8G8IgEn3BAI3UKzmeH8tAj0PzZV7AJQK9pvY59ag3BKQeyGQIPScXK67rOpNeqWxX7CnoMnHmpEzmqPdKn14QszTkETIJWnccO8FuBgoFk3xS9FLNE2AjGPCOoRGEXHez6WkTemqUgPZjZV6EdKr+7sgg1Pn2pjIEHOpFLxf/8zop9q+6mYiSFHnEZoP6qaQY0zwnGgjFGcqxIcCUMLtSNgQFDE2aRROCs3jyMmlWK45bqd5dlGvVeRwFckxOyBlxyCWpkVtSJw3CyCN5Jq/kzXqyXqx362NWumLNe47IH1hfPw3Enj8= = 6 . , ˆ ↵ = 0 . AAACHHicbVDJSgNBEO1xjXGLevTSGAQPEmaSuB2EgBePEcwCmRBqejpJk56F7hohDPkQL/6KFw+KePEg+Dd2loMmPmj68V4VVfW8WAqNtv1tLS2vrK6tZzaym1vbO7u5vf26jhLFeI1FMlJNDzSXIuQ1FCh5M1YcAk/yhje4GfuNB660iMJ7HMa8HUAvFF3BAI3UyZVcL5K+HgbmS12fSwR6Tc8L9il1qdsHpC7IuA8dx8h2wT67Ko86ubxhE9BF4sxInsxQ7eQ+XT9iScBDZBK0bjl2jO0UFAom+SjrJprHwAbQ4y1DQwi4bqeT40b02Cg+7UbKvBDpRP3dkUKgx/ubygCwr+e9sfif10qwe9lORRgnyEM2HdRNJMWIjpOivlCcoRwaAkwJsytlfVDA0OSZNSE48ycvknqx4JQKxbtyvlKcxZEhh+SInBCHXJAKuSVVUiOMPJJn8krerCfrxXq3PqalS9as54D8gfX1Axdcnr8= = 0 . , ˆ ↵ = 0 . AAACHXicbVDLSgMxFM3UV62vqks3wSK4kDLTB7oRCm5cVrAP6JRyJ5O2oZkHyR2hDP0RN/6KGxeKuHAj/o3pY6GtB0IO59xLco4XS6HRtr+tzNr6xuZWdju3s7u3f5A/PGrqKFGMN1gkI9X2QHMpQt5AgZK3Y8Uh8CRveaObqd964EqLKLzHccy7AQxC0RcM0Ei9fMX1IunrcWCu1PW5RKDX1C5WKxfUpe4QkLog4yH0nJnuOOXqpJcv2EV7BrpKnAUpkAXqvfyn60csCXiITILWHceOsZuCQsEkn+TcRPMY2AgGvGNoCAHX3XSWbkLPjOLTfqTMCZHO1N8bKQR6GsBMBoBDvexNxf+8ToL9q24qwjhBHrL5Q/1EUozotCrqC8UZyrEhwJQwf6VsCAoYmkJzpgRnOfIqaZaKTrlYuqsUaqVFHVlyQk7JOXHIJamRW1InDcLII3kmr+TNerJerHfrYz6asRY7x+QPrK8fi7We9A== Fitted AR(1) modelFitted backgroundObservationTrue background = 0.11351, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.100.10.20.30.40.50.6 0 200 400 600 800 1000 1200 1400 1600 1800 2000-1-0.500.511.52 = 0.33307, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.100.10.20.30.40.50.6 0 200 400 600 800 1000 1200 1400 1600 1800 2000-1-0.500.511.52 = 0.059363, = 3.5881, p = 0.77083, = 0.000270510 200 400 600 800 1000 1200 1400 1600 1800 2000-0.200.20.40.60.8 TT TT TT = 0 , ˆ ↵ = 0 . AAACGnicbVDLSgNBEJz1GeMr6tHLYBA8SNjNCnoRAl48RjAPyIbQOztJhsw+mOkVwpLv8OKvePGgiDfx4t84m+SgiQXDFFXddHf5iRQabfvbWlldW9/YLGwVt3d29/ZLB4dNHaeK8QaLZazaPmguRcQbKFDydqI4hL7kLX90k/utB660iKN7HCe8G8IgEn3BAI3UKzmeH8tAj0PzZV7AJQK9pvY59ag3BKQeyGQIPScXK67rOpNeqWxX7CnoMnHmpEzmqPdKn14QszTkETIJWnccO8FuBgoFk3xS9FLNE2AjGPCOoRGEXHez6WkTemqUgPZjZV6EdKr+7sgg1Pn2pjIEHOpFLxf/8zop9q+6mYiSFHnEZoP6qaQY0zwnGgjFGcqxIcCUMLtSNgQFDE2aRROCs3jyMmlWK45bqd5dlGvVeRwFckxOyBlxyCWpkVtSJw3CyCN5Jq/kzXqyXqx362NWumLNe47IH1hfPw3Enj8= = 6 . , ˆ ↵ = 0 . AAACHHicbVDJSgNBEO1xjXGLevTSGAQPEmaSuB2EgBePEcwCmRBqejpJk56F7hohDPkQL/6KFw+KePEg+Dd2loMmPmj68V4VVfW8WAqNtv1tLS2vrK6tZzaym1vbO7u5vf26jhLFeI1FMlJNDzSXIuQ1FCh5M1YcAk/yhje4GfuNB660iMJ7HMa8HUAvFF3BAI3UyZVcL5K+HgbmS12fSwR6Tc8L9il1qdsHpC7IuA8dx8h2wT67Ko86ubxhE9BF4sxInsxQ7eQ+XT9iScBDZBK0bjl2jO0UFAom+SjrJprHwAbQ4y1DQwi4bqeT40b02Cg+7UbKvBDpRP3dkUKgx/ubygCwr+e9sfif10qwe9lORRgnyEM2HdRNJMWIjpOivlCcoRwaAkwJsytlfVDA0OSZNSE48ycvknqx4JQKxbtyvlKcxZEhh+SInBCHXJAKuSVVUiOMPJJn8krerCfrxXq3PqalS9as54D8gfX1Axdcnr8= = 0 . , ˆ ↵ = 0 . AAACHXicbVDLSgMxFM3UV62vqks3wSK4kDLTB7oRCm5cVrAP6JRyJ5O2oZkHyR2hDP0RN/6KGxeKuHAj/o3pY6GtB0IO59xLco4XS6HRtr+tzNr6xuZWdju3s7u3f5A/PGrqKFGMN1gkI9X2QHMpQt5AgZK3Y8Uh8CRveaObqd964EqLKLzHccy7AQxC0RcM0Ei9fMX1IunrcWCu1PW5RKDX1C5WKxfUpe4QkLog4yH0nJnuOOXqpJcv2EV7BrpKnAUpkAXqvfyn60csCXiITILWHceOsZuCQsEkn+TcRPMY2AgGvGNoCAHX3XSWbkLPjOLTfqTMCZHO1N8bKQR6GsBMBoBDvexNxf+8ToL9q24qwjhBHrL5Q/1EUozotCrqC8UZyrEhwJQwf6VsCAoYmkJzpgRnOfIqaZaKTrlYuqsUaqVFHVlyQk7JOXHIJamRW1InDcLII3kmr+TNerJerHfrYz6asRY7x+QPrK8fi7We9A== Fitted AR(1) modelFitted backgroundObservationTrue background

Figure 1: An example showing that proper modeling of dynamic background is important incapturing serial correlation. The estimate (cid:98) α is speciﬁed on the top of each column, with theground truth α = 0 . ; δ is a hyperparameter that controls the data ﬁt and model complexity.In the ﬁrst panel, we directly ﬁt an ar (1) model while ignoring the dynamic background,leading to over-estimating (cid:98) α . In the third panel, the result overﬁts the dynamic background,leading to underestimating the autoregressive coeﬃcient. The second panel is the desiredresult obtained by our method.Motivated by this, we consider the following problem. Assume a sequence of observations x , . . . , x T over time horizon T , which are generated from the underlying non-stationary ar ( p ) time series model: x i = f i + p (cid:88) j =1 α j x i − j + ε i , i = 1 , . . . , T, (1)2here ε , . . . , ε T are i.i.d. sub-Gaussian random noise with zero mean and variance σ , α , . . . , α p are autoregressive coeﬃcients, f , . . . , f T are deterministic dynamic backgroundand x − p +1 , . . . , x are the known history. The goal is to infer the presence and/or estimatethe unknown autoregressive coeﬃcients and dynamic background simultaneously from data. To ensure our model is general, we do not impose parametric or distributional assumptionson the dynamic background f i ’s. In this paper, we present a new convex optimization based method to estimate theautoregressive coeﬃcients for sequential data in the presence of unknown dynamic background,coupled with the Ljung-Box test for model diagnosis. We cast the problem as minimizingthe least square error with a total-variation constraint on the dynamic background, whichencourages a piecewise constant structure and can approximate a wide range of unstructureddrifts with good precision. We establish performance guarantees for the (cid:96) recovery error ofthe coeﬃcients. To eﬃciently tune hyperparameters to control the bias-variance trade-oﬀ,we adopt the Ljung-Box test [Ljung and Box, 1978]. Extensive numerical experiments areperformed to validate the eﬀectiveness of the proposed method. We also test our method ona real psychology dataset to demonstrate it can infer whether or not there is a statisticallysigniﬁcant correlation.The rest of the paper is organized as follows. In the remainder of this section, wediscuss related works. Section 2 presents the proposed method. Section 3 contains the maintheoretical results, including a non-asymptotic bounds on the (cid:96) estimation error for the ar (1) model, for the ease of presentation. We discuss how to extend the result to ar ( p ) modelsin Section 4. Section 5 contains simulation results to demonstrate the good performance ofour method and validates theoretical results. Section 6 presents a real-data study from apsychology experiment. Finally, Section 7 summaries the paper. Standard time series models [Brockwell et al., 1991] such as autoregressive and moving averagemodels do not include dynamic backgrounds. On the other hand, the classical approach tocapture dynamic background usually makes strong structural assumptions such as the lineartrend, periodical trend Clark [1987] or hidden Markov model Hamilton [1989]. Our probleminvolves a highly unstructured background. This requires new solution approach; moreover,existing theory does not apply because the unstructured dynamic background leads to anon-stationary time series, which does not satisfy the strong-mixing condition. This disablesus from using asymptotic results in the classic time series literature.Recent works for similar problems also use convex optimization to ﬁt the dynamicbackground while making few structural assumptions. This line of work typically considerssolving a least square problem with various penalties or constraints to encourage desiredstructures on the ﬁtted background, which can approximate the unknown ground-truth. Forinstance, H-P ﬁlter [Hodrick and Prescott, 1997] imposed (cid:96) penalty on the second-orderdiﬀerence to encourage a smooth background; Kim et al. [2009] considered a variant of H-Pﬁlter with an (cid:96) regularization function to capture a piece-wise linear background. Anotherrelated work [Harchaoui and Lévy-Leduc, 2010] considered change-point detection in themeans using least square estimation with total variation penalty; since the number of changepoints is unknown, the work essentially estimates a piece-wise constant background. While3any advances have been achieved, these existing works have not considered serial correlationtogether with the dynamic background.Our proposed method is related to variable fusion [Land and Friedman, 1997] and fusedlasso [Tibshirani et al., 2005]. Here the unstructured, dynamic background leads to a high-dimensional problem: we have T equations and T + p variables; the optimal solution isnot unique. Thus, we borrow the analytical technique in analyzing high-dimensional lasso,particularly the restricted eigenvalue conditions for the design matrix [Bickel et al., 2009,Meinshausen and Yu, 2009, Van De Geer and Bühlmann, 2009] to derive the theoreticalresults, while further exploiting the special structure of our design matrix.There are two closely related recent works: Xu [2008] used polynomials to approximatethe dynamic background, and Zhang et al. [2020] developed an online forecasting algorithmbased on least square estimation with (cid:96) variable fusion constraint. These works do notexplicitly consider highly unstructured backgrounds. We compare with both methods vianumerical simulations in Section 5 and show the advantage of our approach when there aredynamic, unstructured backgrounds; moreover, we also present a method for hyperparameterselection based on the Ljung-Box test. Consider a total variation constrained least square estimator to estimate the autoregressivecoeﬃcients and the dynamic background simultaneously, which is obtained by solving thefollowing convex optimization problem:minimize α ...,α p ,f ,...,f T T (cid:80) Ti =1 (cid:16) x i − (cid:80) pj =1 α j x i − j − f i (cid:17) subject to (cid:80) T − i =1 | f i +1 − f i | < δ, (2)where δ is a user-speciﬁed hyperparameter (the selection of δ is discussed in Section 2.2).As discussed for the problem formulation (1), diﬀerent from the conventional autoregressivemodel, here we consider an unknown and time-varying background . Since the number ofobservations and the number of parameters both grow at the same rate as the time horizon T increases, we cannot uniquely recover the parameters using the available observations. Thus,we impose a total variation constraint on the dynamic background, essentially choosing onesolution with the smallest variations. Such an approach can serve as a good approximationto a broad class of unstructured, dynamic backgrounds. We will show that the choice of the hyper-parameter δ in (2) will critically impact its solution(the recovered dynamic background and the AR coeﬃcient). As illustrated in the ﬁrst panelin Figure 1, setting δ = 0 will result in a very simple ar (1) model, but a very biased estimate (cid:98) α . Clearly, this model under-ﬁts data. On the other hand, the third panel in Figure 1 showsthat when δ is too large, the ﬁtted model will have a small empirical loss but overﬁtted4ackground, which still results in very biased (cid:98) α . From the second panel in Figure 1, we cansee the ﬁtted piecewise constant background faithfully captures the dynamics, and this modelyields a very accurate estimate (cid:98) α .Figure 1 illustrates that δ controls the bias-variance trade-oﬀ: a larger δ leads to a smallerﬁtting error, but an overﬁtted background, and thus the estimated AR coeﬃcients are biased.Therefore, we cannot use the ﬁtting error to tune δ . Instead, we choose δ by the Ljung-Boxtest, which can test the model’s goodness-of-ﬁt by checking the remaining serial correlationin the residual sequence. This test provides a p -value to quantify the goodness-of-ﬁt. Sincelarger p -value indicates less remaining serial correlation in the residuals, we select δ with themaximum p -value. Details of parameter tuning procedure can be found in Appendix A.Here we want to comment that we cannot use the popular cross-validation technique tochoose δ . The cross-validation splits the data into training data and testing data. Typically,the model for the training data and the test data are identical; thus, cross-validation errorcan be used to estimate the actual test error. However, here since our background is dynamicand diﬀerent on the test and the training data, we cannot apply model ﬁtted on trainingdata to the test data to tune hyperparameter. Finally, we present two bootstrap methods to construct conﬁdence intervals for the autore-gressive coeﬃcients. For serially correlated data, we cannot use the conventional bootstrapfor i.i.d. data [Efron, 1992] but instead using the following techniques: (i) Wild bootstrap[Wu, 1986], which resamples from the ﬁtted residuals and (ii) a variant [Künsch, 1989] oflocal moving block bootstrap, which is designed for non-stationary time series. Details ofboth methods can be found in Appendix A. ar (1) sequences Now we present the main theoretical results, including the upper and lower bounds for theparameter (cid:96) recovery errors using (2). We start with some necessary deﬁnitions. Deﬁne a sequence as being ε -recoverable , if the (cid:96) recovery error using (2) is smaller than ε . The collection of recoverable sequences forms the ε -recoverable region . We make the following assumptions to ensure the dynamic backgrounddoes not change too drastically. (i) The background contains at most s changes over the timehorizon T , i.e., it consists of at most s + 1 pieces. In other words, the rate-of-change for thedynamic background is on the order of s/T , and the change does not happen very often. (ii)The magnitude of the each change is upper-bounded by δ : | ∆ i | ≤ δ , i = 2 , . . . , T, (3)where ∆ i = f i − f i − , i = 2 , . . . , T are one-step changes of the dynamic background.5t is known that the least square estimator is consistent for any stationary autoregressivetime-series. Intuitively, the “smaller” the dynamic background, the “closer” the sequence is toits stationary counterpart. A fundamental question is What ranges of dynamic backgroundsand autoregressive coeﬃcient can be estimated accurately?

We answer this question viaTheorems 1 and 2, which establish the upper and lower bounds of the recovery error thatdepend on the number of changes s and the size of the change δ . However, in our setting, s ( T ) is non-decreasing with respect to the time horizon T . Thus, we cannot expect therecovery error to shrink to zero with increasing T as the usual asymptotic analysis.Theorem 1 establishes the suﬃcient condition to ensure the (cid:96) recovery error does notexceed ε : min (cid:110) (cid:101) C s / δ , (cid:112) vol( S ) /π (cid:111) + s / δ ≤ ε − δ := ε δ , (4)where (cid:101) C is a positive constant, S is a user-speciﬁed set that the true parameter resides inand vol( · ) denotes the volume of a set. Therefore, (4) is a suﬃcient condition that s and δ ofa ε -recoverable sequence needs to satisfy, and thus it deﬁnes the boundary for ε -recoverableregion as illustrated in Figure 2. The recoverable region is the union of the blue and the greenregions in Figure 2: the green region does not vary but the blue region shrinks with increasing vol( S ) and eventually vanishes once (cid:112) vol( S ) /π exceeds ε δ . This can be explained by thatwhen the unknown coeﬃcients reside in a larger S , it is more diﬃcult to recover the trueparameters for the same accuracy ε , which leads to a smaller recoverable region. Moreover,as illustrated in Figure 2, in the ε -recoverable region, there exists a collection of instances(the region shaded in red dashed lines) where the best achievable performance (lower bound)meets the upper bound of our proposed estimator over the ﬁnite time horizon.

20 40 60 8000.010.020.030.0420 40 60 8000.010.020.030.0420 40 60 8000.010.020.030.04 ✏ recoverable region AAACGXicbVDLSgMxFM34tr6qLt0Ei+DGMlMFXQpuXFawrdAZSia9U4OZZEjuiGXob7jxV9y4UMSlrvwb03YWvi6EHM69J7nnxJkUFn3/05uZnZtfWFxarqysrq1vVDe32lbnhkOLa6nNVcwsSKGghQIlXGUGWBpL6MQ3Z+N+5xaMFVpd4jCDKGUDJRLBGTqqV/XDWMu+HabuoiFkVkitDkKEO4yTwgDXTs3ca9TAwClGvWrNr/uTon9BUIIaKavZq76Hfc3zFBRyyaztBn6GUcEMCi5hVAlzCxnjN2wAXQcVS8FGxcTZiO45pk8TbdxRSCfsd0XBUjte3k2mDK/t796Y/K/XzTE5iQqhshxB8elHSS4pajqOifaF845y6ADjRrhdKb9mhnF0YVZcCMFvy39Bu1EPDuuNi6PaaaOMY4nskF2yTwJyTE7JOWmSFuHknjySZ/LiPXhP3qv3Nh2d8UrNNvlR3scXcFGhxA== ✏ recoverable region AAACGXicbVDLSgMxFM34tr6qLt0Ei+DGMlMFXQpuXFawrdAZSia9U4OZZEjuiGXob7jxV9y4UMSlrvwb03YWvi6EHM69J7nnxJkUFn3/05uZnZtfWFxarqysrq1vVDe32lbnhkOLa6nNVcwsSKGghQIlXGUGWBpL6MQ3Z+N+5xaMFVpd4jCDKGUDJRLBGTqqV/XDWMu+HabuoiFkVkitDkKEO4yTwgDXTs3ca9TAwClGvWrNr/uTon9BUIIaKavZq76Hfc3zFBRyyaztBn6GUcEMCi5hVAlzCxnjN2wAXQcVS8FGxcTZiO45pk8TbdxRSCfsd0XBUjte3k2mDK/t796Y/K/XzTE5iQqhshxB8elHSS4pajqOifaF845y6ADjRrhdKb9mhnF0YVZcCMFvy39Bu1EPDuuNi6PaaaOMY4nskF2yTwJyTE7JOWmSFuHknjySZ/LiPXhP3qv3Nh2d8UrNNvlR3scXcFGhxA== ✏ recoverable region AAACGXicbVDLSgMxFM34tr6qLt0Ei+DGMlMFXQpuXFawrdAZSia9U4OZZEjuiGXob7jxV9y4UMSlrvwb03YWvi6EHM69J7nnxJkUFn3/05uZnZtfWFxarqysrq1vVDe32lbnhkOLa6nNVcwsSKGghQIlXGUGWBpL6MQ3Z+N+5xaMFVpd4jCDKGUDJRLBGTqqV/XDWMu+HabuoiFkVkitDkKEO4yTwgDXTs3ca9TAwClGvWrNr/uTon9BUIIaKavZq76Hfc3zFBRyyaztBn6GUcEMCi5hVAlzCxnjN2wAXQcVS8FGxcTZiO45pk8TbdxRSCfsd0XBUjte3k2mDK/t796Y/K/XzTE5iQqhshxB8elHSS4pajqOifaF845y6ADjRrhdKb9mhnF0YVZcCMFvy39Bu1EPDuuNi6PaaaOMY4nskF2yTwJyTE7JOWmSFuHknjySZ/LiPXhP3qv3Nh2d8UrNNvlR3scXcFGhxA== = ✏ ( ˜ C s / + s / ) AAACPXicbZBLSwMxFIUzvq2vqks3wSIoQp1pBd0IQjcuK1gVOnXIZG5rMPMguSOUMH/Mjf/BnTs3LhRx69a0zsLXhZCPc+8lOSfMpNDouo/OxOTU9Mzs3HxlYXFpeaW6unau01xx6PBUpuoyZBqkSKCDAiVcZgpYHEq4CG9ao/7FLSgt0uQMhxn0YjZIRF9whlYKqmd+mMpID2N7UeNHIJEFxi2OfMi0kHakFAvqh2JA97Z9FDIC0yoC4xVUX5km3aONYteSN6adIqjW3Lo7LvoXvBJqpKx2UH3wo5TnMSTIJdO667kZ9gxTKLiEouLnGjLGb9gAuhYTFoPumbH7gm5ZJaL9VNmTIB2r3zcMi/XIoJ2MGV7r372R+F+vm2P/sGdEkuUICf96qJ9LiikdRUkjoYCjHFpgXAn7V8qvmWIcbeAVG4L32/JfOG/UvWa9cbpfO26UccyRDbJJtolHDsgxOSFt0iGc3JEn8kJenXvn2Xlz3r9GJ5xyZ538KOfjE0QQrdE= = ˜ C s ( s mT ) AAACKXicbVDLSgMxFM34tr6qLt0Ei1A3dWYUdCMU3Lis0FqhU4ZM5rYNZh5N7ghlmN9x46+4UVDUrT9iWrvQ6oGQwzn3kpwTpFJotO13a25+YXFpeWW1tLa+sblV3t651kmmOLR4IhN1EzANUsTQQoESblIFLAoktIPbi7HfvgOlRRI3cZRCN2L9WPQEZ2gkv1z3gkSGehSZi+ZeCBKZn9vFuYdChpBfFH7uFtQLRJ8eaeoNMxbSqiF9GNKoeVj45Ypdsyegf4kzJRUyRcMvP3thwrMIYuSSad1x7BS7OVMouISi5GUaUsZvWR86hsYsAt3NJ0kLemCUkPYSZU6MdKL+3MhZpMdhzGTEcKBnvbH4n9fJsHfWzUWcZggx/36ol0mKCR3XRkOhgKMcGcK4EuavlA+YYhxNuSVTgjMb+S+5dmvOcc29OqnU3WkdK2SP7JMqccgpqZNL0iAtwsk9eSQv5NV6sJ6sN+vje3TOmu7skl+wPr8A076mOw== = ✏ ( ˜ C s / + s / ) AAACPXicbZBLSwMxFIUzvq2vqks3wSIoQp1pBd0IQjcuK1gVOnXIZG5rMPMguSOUMH/Mjf/BnTs3LhRx69a0zsLXhZCPc+8lOSfMpNDouo/OxOTU9Mzs3HxlYXFpeaW6unau01xx6PBUpuoyZBqkSKCDAiVcZgpYHEq4CG9ao/7FLSgt0uQMhxn0YjZIRF9whlYKqmd+mMpID2N7UeNHIJEFxi2OfMi0kHakFAvqh2JA97Z9FDIC0yoC4xVUX5km3aONYteSN6adIqjW3Lo7LvoXvBJqpKx2UH3wo5TnMSTIJdO667kZ9gxTKLiEouLnGjLGb9gAuhYTFoPumbH7gm5ZJaL9VNmTIB2r3zcMi/XIoJ2MGV7r372R+F+vm2P/sGdEkuUICf96qJ9LiikdRUkjoYCjHFpgXAn7V8qvmWIcbeAVG4L32/JfOG/UvWa9cbpfO26UccyRDbJJtolHDsgxOSFt0iGc3JEn8kJenXvn2Xlz3r9GJ5xyZ538KOfjE0QQrdE= s ( T ) AAAB63icbVBNSwMxEJ2tX7V+VT16CRahXspuK+ix4MVjhX5Bu5Rsmm1Dk+ySZIWy9C948aCIV/+QN/+N2XYP2vpg4PHeDDPzgpgzbVz32ylsbe/s7hX3SweHR8cn5dOzro4SRWiHRDxS/QBrypmkHcMMp/1YUSwCTnvB7D7ze09UaRbJtpnH1Bd4IlnICDaZpKvt61G54tbcJdAm8XJSgRytUflrOI5IIqg0hGOtB54bGz/FyjDC6aI0TDSNMZnhCR1YKrGg2k+Xty7QlVXGKIyULWnQUv09kWKh9VwEtlNgM9XrXib+5w0SE975KZNxYqgkq0VhwpGJUPY4GjNFieFzSzBRzN6KyBQrTIyNp2RD8NZf3iTdes1r1OqPN5VmPY+jCBdwCVXw4Baa8AAt6ACBKTzDK7w5wnlx3p2PVWvByWfO4Q+czx9GBI2u s ( T ) AAAB63icbVBNSwMxEJ2tX7V+VT16CRahXspuK+ix4MVjhX5Bu5Rsmm1Dk+ySZIWy9C948aCIV/+QN/+N2XYP2vpg4PHeDDPzgpgzbVz32ylsbe/s7hX3SweHR8cn5dOzro4SRWiHRDxS/QBrypmkHcMMp/1YUSwCTnvB7D7ze09UaRbJtpnH1Bd4IlnICDaZpKvt61G54tbcJdAm8XJSgRytUflrOI5IIqg0hGOtB54bGz/FyjDC6aI0TDSNMZnhCR1YKrGg2k+Xty7QlVXGKIyULWnQUv09kWKh9VwEtlNgM9XrXib+5w0SE975KZNxYqgkq0VhwpGJUPY4GjNFieFzSzBRzN6KyBQrTIyNp2RD8NZf3iTdes1r1OqPN5VmPY+jCBdwCVXw4Baa8AAt6ACBKTzDK7w5wnlx3p2PVWvByWfO4Q+czx9GBI2u mT AAAB6XicbVBNSwMxEJ3Ur1q/qh69BIvgqexWQY8FLx6r9AvapWTTbBuaZJckK5Sl/8CLB0W8+o+8+W9M2z1o64OBx3szzMwLE8GN9bxvVNjY3NreKe6W9vYPDo/KxydtE6eashaNRay7ITFMcMVallvBuolmRIaCdcLJ3dzvPDFteKyadpqwQJKR4hGnxDrpUTYH5YpX9RbA68TPSQVyNAblr/4wpqlkylJBjOn5XmKDjGjLqWCzUj81LCF0Qkas56gikpkgW1w6wxdOGeIo1q6UxQv190RGpDFTGbpOSezYrHpz8T+vl9roNsi4SlLLFF0uilKBbYznb+Mh14xaMXWEUM3drZiOiSbUunBKLgR/9eV10q5V/atq7eG6Uq/lcRThDM7hEny4gTrcQwNaQCGCZ3iFNzRBL+gdfSxbCyifOYU/QJ8/druNQw== mT AAAB6XicbVBNSwMxEJ3Ur1q/qh69BIvgqexWQY8FLx6r9AvapWTTbBuaZJckK5Sl/8CLB0W8+o+8+W9M2z1o64OBx3szzMwLE8GN9bxvVNjY3NreKe6W9vYPDo/KxydtE6eashaNRay7ITFMcMVallvBuolmRIaCdcLJ3dzvPDFteKyadpqwQJKR4hGnxDrpUTYH5YpX9RbA68TPSQVyNAblr/4wpqlkylJBjOn5XmKDjGjLqWCzUj81LCF0Qkas56gikpkgW1w6wxdOGeIo1q6UxQv190RGpDFTGbpOSezYrHpz8T+vl9roNsi4SlLLFF0uilKBbYznb+Mh14xaMXWEUM3drZiOiSbUunBKLgR/9eV10q5V/atq7eG6Uq/lcRThDM7hEny4gTrcQwNaQCGCZ3iFNzRBL+gdfSxbCyifOYU/QJ8/druNQw== AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0mqoMeCF48t2A9oQ9lsJ+3azSbsboQS+gu8eFDEqz/Jm//GbZuDtj4YeLw3w8y8IBFcG9f9dgobm1vbO8Xd0t7+weFR+fikreNUMWyxWMSqG1CNgktsGW4EdhOFNAoEdoLJ3dzvPKHSPJYPZpqgH9GR5CFn1Fip6Q3KFbfqLkDWiZeTCuRoDMpf/WHM0gilYYJq3fPcxPgZVYYzgbNSP9WYUDahI+xZKmmE2s8Wh87IhVWGJIyVLWnIQv09kdFI62kU2M6ImrFe9ebif14vNeGtn3GZpAYlWy4KU0FMTOZfkyFXyIyYWkKZ4vZWwsZUUWZsNiUbgrf68jpp16reVbXWvK7Ua3kcRTiDc7gED26gDvfQgBYwQHiGV3hzHp0X5935WLYWnHzmFP7A+fwBdxuMqQ== AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0mqoMeCF48t2A9oQ9lsJ+3azSbsboQS+gu8eFDEqz/Jm//GbZuDtj4YeLw3w8y8IBFcG9f9dgobm1vbO8Xd0t7+weFR+fikreNUMWyxWMSqG1CNgktsGW4EdhOFNAoEdoLJ3dzvPKHSPJYPZpqgH9GR5CFn1Fip6Q3KFbfqLkDWiZeTCuRoDMpf/WHM0gilYYJq3fPcxPgZVYYzgbNSP9WYUDahI+xZKmmE2s8Wh87IhVWGJIyVLWnIQv09kdFI62kU2M6ImrFe9ebif14vNeGtn3GZpAYlWy4KU0FMTOZfkyFXyIyYWkKZ4vZWwsZUUWZsNiUbgrf68jpp16reVbXWvK7Ua3kcRTiDc7gED26gDvfQgBYwQHiGV3hzHp0X5935WLYWnHzmFP7A+fwBdxuMqQ== s ( T ) AAAB63icbVBNSwMxEJ2tX7V+VT16CRahXspuK+ix4MVjhX5Bu5Rsmm1Dk+ySZIWy9C948aCIV/+QN/+N2XYP2vpg4PHeDDPzgpgzbVz32ylsbe/s7hX3SweHR8cn5dOzro4SRWiHRDxS/QBrypmkHcMMp/1YUSwCTnvB7D7ze09UaRbJtpnH1Bd4IlnICDaZpKvt61G54tbcJdAm8XJSgRytUflrOI5IIqg0hGOtB54bGz/FyjDC6aI0TDSNMZnhCR1YKrGg2k+Xty7QlVXGKIyULWnQUv09kWKh9VwEtlNgM9XrXib+5w0SE975KZNxYqgkq0VhwpGJUPY4GjNFieFzSzBRzN6KyBQrTIyNp2RD8NZf3iTdes1r1OqPN5VmPY+jCBdwCVXw4Baa8AAt6ACBKTzDK7w5wnlx3p2PVWvByWfO4Q+czx9GBI2u mT AAAB6XicbVBNSwMxEJ3Ur1q/qh69BIvgqexWQY8FLx6r9AvapWTTbBuaZJckK5Sl/8CLB0W8+o+8+W9M2z1o64OBx3szzMwLE8GN9bxvVNjY3NreKe6W9vYPDo/KxydtE6eashaNRay7ITFMcMVallvBuolmRIaCdcLJ3dzvPDFteKyadpqwQJKR4hGnxDrpUTYH5YpX9RbA68TPSQVyNAblr/4wpqlkylJBjOn5XmKDjGjLqWCzUj81LCF0Qkas56gikpkgW1w6wxdOGeIo1q6UxQv190RGpDFTGbpOSezYrHpz8T+vl9roNsi4SlLLFF0uilKBbYznb+Mh14xaMXWEUM3drZiOiSbUunBKLgR/9eV10q5V/atq7eG6Uq/lcRThDM7hEny4gTrcQwNaQCGCZ3iFNzRBL+gdfSxbCyifOYU/QJ8/druNQw== AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0mqoMeCF48t2A9oQ9lsJ+3azSbsboQS+gu8eFDEqz/Jm//GbZuDtj4YeLw3w8y8IBFcG9f9dgobm1vbO8Xd0t7+weFR+fikreNUMWyxWMSqG1CNgktsGW4EdhOFNAoEdoLJ3dzvPKHSPJYPZpqgH9GR5CFn1Fip6Q3KFbfqLkDWiZeTCuRoDMpf/WHM0gilYYJq3fPcxPgZVYYzgbNSP9WYUDahI+xZKmmE2s8Wh87IhVWGJIyVLWnIQv09kdFI62kU2M6ImrFe9ebif14vNeGtn3GZpAYlWy4KU0FMTOZfkyFXyIyYWkKZ4vZWwsZUUWZsNiUbgrf68jpp16reVbXWvK7Ua3kcRTiDc7gED26gDvfQgBYwQHiGV3hzHp0X5935WLYWnHzmFP7A+fwBdxuMqQ== T AAAB6nicbVBNS8NAEJ34WetX1aOXxSJ4sSRV0GPBi8eK/YI2lM120i7dbMLuRiihP8GLB0W8+ou8+W/ctjlo64OBx3szzMwLEsG1cd1vZ219Y3Nru7BT3N3bPzgsHR23dJwqhk0Wi1h1AqpRcIlNw43ATqKQRoHAdjC+m/ntJ1Sax7JhJgn6ER1KHnJGjZUeG5dev1R2K+4cZJV4OSlDjnq/9NUbxCyNUBomqNZdz02Mn1FlOBM4LfZSjQllYzrErqWSRqj9bH7qlJxbZUDCWNmShszV3xMZjbSeRIHtjKgZ6WVvJv7ndVMT3voZl0lqULLFojAVxMRk9jcZcIXMiIkllClubyVsRBVlxqZTtCF4yy+vkla14l1Vqg/X5Vo1j6MAp3AGF+DBDdTgHurQBAZDeIZXeHOE8+K8Ox+L1jUnnzmBP3A+fwCFTY0+ T AAAB6nicbVBNS8NAEJ34WetX1aOXxSJ4sSRV0GPBi8eK/YI2lM120i7dbMLuRiihP8GLB0W8+ou8+W/ctjlo64OBx3szzMwLEsG1cd1vZ219Y3Nru7BT3N3bPzgsHR23dJwqhk0Wi1h1AqpRcIlNw43ATqKQRoHAdjC+m/ntJ1Sax7JhJgn6ER1KHnJGjZUeG5dev1R2K+4cZJV4OSlDjnq/9NUbxCyNUBomqNZdz02Mn1FlOBM4LfZSjQllYzrErqWSRqj9bH7qlJxbZUDCWNmShszV3xMZjbSeRIHtjKgZ6WVvJv7ndVMT3voZl0lqULLFojAVxMRk9jcZcIXMiIkllClubyVsRBVlxqZTtCF4yy+vkla14l1Vqg/X5Vo1j6MAp3AGF+DBDdTgHurQBAZDeIZXeHOE8+K8Ox+L1jUnnzmBP3A+fwCFTY0+ T AAAB6nicbVBNS8NAEJ34WetX1aOXxSJ4sSRV0GPBi8eK/YI2lM120i7dbMLuRiihP8GLB0W8+ou8+W/ctjlo64OBx3szzMwLEsG1cd1vZ219Y3Nru7BT3N3bPzgsHR23dJwqhk0Wi1h1AqpRcIlNw43ATqKQRoHAdjC+m/ntJ1Sax7JhJgn6ER1KHnJGjZUeG5dev1R2K+4cZJV4OSlDjnq/9NUbxCyNUBomqNZdz02Mn1FlOBM4LfZSjQllYzrErqWSRqj9bH7qlJxbZUDCWNmShszV3xMZjbSeRIHtjKgZ6WVvJv7ndVMT3voZl0lqULLFojAVxMRk9jcZcIXMiIkllClubyVsRBVlxqZTtCF4yy+vkla14l1Vqg/X5Vo1j6MAp3AGF+DBDdTgHurQBAZDeIZXeHOE8+K8Ox+L1jUnnzmBP3A+fwCFTY0+ Case 2: ˜ C ✏ ˜ C + 1  s vol( S ) ⇡ < ✏ AAACi3icbVFNaxsxENVuv1y3adz22IuoKaQUzK5TSAg5BEyhx5TWScAyRqudTUS0K0WaDTVCf6Y/qbf+m2qdPTRxB4Qeb2aeRm8Ko6TDLPuTpI8eP3n6bPB8+OLlzqvd0es3Z063VsBcaKXtRcEdKNnAHCUquDAWeF0oOC+uZ13+/Bask7r5gWsDy5pfNrKSgmOkVqNfDOEnFpWfRRE6PaKBskKr0q3reHlWWS48Q6lK8LOw8nnMg3FSxWbPSlDIQ7hf8KmrUXBDp8zdWOw1KNMGLEdtG16Dv9Uq7LGa45Xgyn8PH6OIkSEcb6uvRuNskm2CboO8B2PSx+lq9JuVWrQ1NCgUd26RZwaXnluUQkEYstaB4eKaX8Iiwm4et/QbLwP9EJmSVtrG0yDdsP92eF67zpxY2U3vHuY68n+5RYvV4dLLxrQIjbh7qGoVRU27xdBSWhCo1hFwYWWclYorHq3DuL5hNCF/+OVtcDad5PuT6bfP45Ost2NA3pH3ZI/k5ICckK/klMyJSAbJJDlIDtOddD89So/vStOk73lL7kX65S+C+8nF Case 1: 2 s vol( S ) ⇡ < ˜ C ✏ ˜ C + 1 AAACd3icbVFLbxMxEPYurxJeAW5wwCKAgipFuylSK8ShUi4ciyBtpTiKvN7Z1qp37dqzVSPLf6E/jhv/gws3vEkOtGUky59mvnl9UxglHWbZryS9c/fe/QdbD3uPHj95+qz//MWh060VMBVaaXtccAdKNjBFiQqOjQVeFwqOirNJFz+6AOukbn7g0sC85ieNrKTgGF2L/hVDuMSi8pNYhOafaaCs0Kp0yzp+fszcuUXPKsuFp0wbsBy1bXgN/kKrMGQ1x1PBlf8ePgbPjAyBfqFrPkOpSvCTsPB5LAvGSRV7elaCQh7CdcJ25IRFf5CNspXR2yDfgAHZ2MGi/5OVWrQ1NCgUd26WZwbnnluUQkHosdaB4eKMn8Aswm5wN/cr3QJ9Hz0lrbSNr0G68v6b4XntOiEis1vT3Yx1zv/FZi1We3MvG9MiNGLdqGoVRU27I9BSWhColhFwYWWclYpTHjXDeKpeFCG/ufJtcDge5Tuj8bdPg/1sI8cWeU3ekiHJyS7ZJ1/JAZkSQX4nr5JB8i75k75JP6TDNTVNNjkvyTVL879GncJp Case 3: 2 s vol( S ) ⇡ ✏ AAACUXicbVFNTxsxEHUW2tLQj5Qee7EaIdFLtBsqteoJiUuPVBBAiqPI651NLLxrY8+iRpb/Igd64n9w4UBVb8iBj45k+enNjGfec26UdJim151kbf3Fy1cbr7ubb96+e9/7sHXsdGMFjIRW2p7m3IGSNYxQooJTY4FXuYKT/Gy/zZ9cgHVS10e4MDCp+KyWpRQcIzXtzRnCb8xLvx8fobs/aKAs16pwiypefsjcuUXPSsuFp0wbsBy1rXkF/kKrsMMqjnPBlT8MX4JnRobAZnDOwDip4gDPClDIQ5j2+ukgXQZ9DrIV6JNVHEx7V6zQoqmgRqG4c+MsNTjx3KIUCkKXNQ4MF2d8BuMI25XcxC8dCXQ7MgUttY2nRrpkH3Z4XrlWYqxsBbinuZb8X27cYPl94mVtGoRa3A8qG0VR09ZeWkgLAtUiAi6sjLtSMefRPYyf0I0mZE8lPwfHw0G2Oxj++trfS1d2bJBP5DPZIRn5RvbIT3JARkSQS3JD7sjfzp/ObUKS5L406ax6PpJHkWz+A16xtyI= = ( ✏ p vol( S ) /⇡ ) /s / AAACT3icbVFNbxMxEPWGrzZ8NJQjF4sIKRwIuwEJLkiVeuFYBGkrxSGa9c62Vr1rY89Wiqz9h73QG3+DCwcQwhv2AC0jWfP03lh+85xbrTyl6ddkcOPmrdt3traHd+/df7Azerh76E3jJM6l0cYd5+BRqxrnpEjjsXUIVa7xKD/b7/Sjc3RemfojrS0uKzipVakkUKRWo1LkRhd+XcUWRIGaYBXS9u1EoPVKx5GebZ/PhP/sKHBhLDog42qoMJwb3U5EBXQqQYcP7TP+ggurNt1/Cllss7ZdjcbpNN0Uvw6yHoxZXwer0aUojGwqrElq8H6RpZaWARwpqbEdisajBXkGJ7iIsLPil2GTR8ufRqbgpXHx1MQ37N83AlS+WzlOdsb9Va0j/6ctGirfLIOqbUNYyz8PlY3mZHgXLi+UQ0l6HQFIp6JXLk/BgaT4BcMYQnZ15evgcDbNXk5n71+N9yZ9HFvsMXvCJixjr9kee8cO2JxJdsG+sR/sZ/Il+Z78GvSjg6QHj9g/Ndj+DZjktHs= AAAB8XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lqQY8FLx4r2A9sQ9lspu3SzSbsboQS+i+8eFDEq//Gm//GbZuDtj4YeLw3w8y8IBFcG9f9dgobm1vbO8Xd0t7+weFR+fikreNUMWyxWMSqG1CNgktsGW4EdhOFNAoEdoLJ7dzvPKHSPJYPZpqgH9GR5EPOqLHSY9YPURg6cGeDcsWtuguQdeLlpAI5moPyVz+MWRqhNExQrXuemxg/o8pwJnBW6qcaE8omdIQ9SyWNUPvZ4uIZubBKSIaxsiUNWai/JzIaaT2NAtsZUTPWq95c/M/rpWZ442dcJqlByZaLhqkgJibz90nIFTIjppZQpri9lbAxVZQZG1LJhuCtvrxO2rWqd1Wt3dcrjXoeRxHO4BwuwYNraMAdNKEFDCQ8wyu8Odp5cd6dj2VrwclnTuEPnM8ff6SQwQ== AAAB8XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lqQY8FLx4r2A9sQ9lspu3SzSbsboQS+i+8eFDEq//Gm//GbZuDtj4YeLw3w8y8IBFcG9f9dgobm1vbO8Xd0t7+weFR+fikreNUMWyxWMSqG1CNgktsGW4EdhOFNAoEdoLJ7dzvPKHSPJYPZpqgH9GR5EPOqLHSY9YPURg6cGeDcsWtuguQdeLlpAI5moPyVz+MWRqhNExQrXuemxg/o8pwJnBW6qcaE8omdIQ9SyWNUPvZ4uIZubBKSIaxsiUNWai/JzIaaT2NAtsZUTPWq95c/M/rpWZ442dcJqlByZaLhqkgJibz90nIFTIjppZQpri9lbAxVZQZG1LJhuCtvrxO2rWqd1Wt3dcrjXoeRxHO4BwuwYNraMAdNKEFDCQ8wyu8Odp5cd6dj2VrwclnTuEPnM8ff6SQwQ== AAAB8XicbVBNS8NAEJ3Ur1q/qh69LBbBU0lqQY8FLx4r2A9sQ9lspu3SzSbsboQS+i+8eFDEq//Gm//GbZuDtj4YeLw3w8y8IBFcG9f9dgobm1vbO8Xd0t7+weFR+fikreNUMWyxWMSqG1CNgktsGW4EdhOFNAoEdoLJ7dzvPKHSPJYPZpqgH9GR5EPOqLHSY9YPURg6cGeDcsWtuguQdeLlpAI5moPyVz+MWRqhNExQrXuemxg/o8pwJnBW6qcaE8omdIQ9SyWNUPvZ4uIZubBKSIaxsiUNWai/JzIaaT2NAtsZUTPWq95c/M/rpWZ442dcJqlByZaLhqkgJibz90nIFTIjppZQpri9lbAxVZQZG1LJhuCtvrxO2rWqd1Wt3dcrjXoeRxHO4BwuwYNraMAdNKEFDCQ8wyu8Odp5cd6dj2VrwclnTuEPnM8ff6SQwQ== Figure 2: Illustration of ε -recoverable region for ﬁxed T . This region becomes smaller whenhypothesis class S grows larger. The expressions for the curves in case 1 are the same withcase 2. The upper bound (8) will be nearly tight in the region shaded in red dashed lines. Denote the observation x i : j = ( x i , . . . , x j ) T , where the superscript T denotes vector/matrixtranspose. Given x T and known history x , we aim to estimate coeﬃcient vector β = α , µ, ∆ , . . . , ∆ T ) T ∈ R T +1 , where µ = f , ∆ i = f i − f i − , i = 2 , . . . , T . Further denotethe random noise vector by ε T = ( ε , . . . , ε T ) T , and the random design matrix by X =( x T − , L ) ∈ R T × ( T +1) , where L ∈ R T × T is the lower triangular matrix with non-zeros entriesall being ones. For notational simplicity, we rewrite (1) as x T = X β + ε T . (5)Denote ∆ = (0 , , ∆ , . . . , ∆ T ) T and (cid:107) · (cid:107) q to be (cid:96) q vector norm. Here, we slightly abusethe notation and assume β (instead of ∆ ) has at most s non-zero entries. Since the totalvariation constraint only encourages a sparse structure on ∆ , ∆ will have at most s − non-zero entries: B = { ∆ : (cid:107) ∆ (cid:107) = s − ∈ { , . . . , T − } , (cid:107) ∆ (cid:107) ∞ ≤ δ } . The space for the unknown true coeﬃcient vector is deﬁned as Θ T = { β : ( α , µ ) ∈ S (cid:48) , ∆ ∈ B} . (6)The hypothesis class X , i.e. the set where we want to estimate the coeﬃcients, is deﬁned as X = { β : ( α , µ ) ∈ S , (cid:107) ∆ (cid:107) < δ } , where S = { ( α , µ ) : α + µ ≤ δ s } . Here, δ s is a user-speciﬁed parameter, which speciﬁes the size of hypothesis class X . Weassume X contain the ground truth, i.e., S (cid:48) ⊂ S .Our goal is to estimate the unknown β ∈ Θ T by (cid:98) β T ∈ X by solving the following convexoptimization problem: (cid:98) β T = arg min β ∈X T (cid:107) x T − X β (cid:107) . (7)Note that this is slightly diﬀerent from (2) (which only has constraint (cid:107) ∆ (cid:107) < δ ). However,since δ s is typically set to be suﬃciently large such that the solution will not occur on theboundary of S , (7) leads to the same estimate as (2). Here, we start with a non-asymptotic upper bound for the (cid:96) recovery error for the modelcoeﬃcients and derive the condition for recoverable sequences in (4). Theorem 1 (Upper Bound on (cid:96) estimation error) . For (cid:98) β T deﬁned by (7), for any A > , A > √ A and A > , and for any selected hyperparameter δ , with probability at least − (2 T ) − A − (2 T ) − A /A − T ) − A /A , we have (cid:107) (cid:98) β T − β (cid:107) ≤ min (cid:110) (cid:101) C √ s max { sδ , δ } , (cid:112) vol( S ) /π (cid:111) + δ + √ sδ , (8)where (cid:101) C is a positive constant dependent on A , A and A .Note that (8) implies smaller s and δ will lead to smaller error. To ensure the upperbound is less than a pre-speciﬁed ε > , we choose δ at most ε/ (1 + (cid:101) C √ T ) , which leads tothe condition for ε -recoverable sequences in (4).7 .4 Lower bound Now we present a lower bound for the (cid:96) recovery error using triangle inequality and thenimprove it via Fano’s method.A naive lower bound can be derived by triangle inequality: (cid:107) (cid:98) β T − β (cid:107) ≥ (cid:107) (cid:98) ∆ − ∆ (cid:107) ≥ (cid:107) ∆ (cid:107) − (cid:107) (cid:98) ∆ (cid:107) ≥ (cid:107) ∆ (cid:107) − δ, where (cid:107) (cid:98) ∆ (cid:107) ≤ δ due to the total variation constraint. However, since we usually do not know (cid:107) ∆ (cid:107) = O ( √ sδ ) a priori , we cannot set δ close to (cid:107) ∆ (cid:107) to make sure the best achievableperformance. Besides, to control the worst performance (i.e. upper bound), δ is O ( ε/ √ T ) .Therefore, for series x T with a large (cid:107) ∆ (cid:107) , we should expect that the recovery error is Θ( √ sδ ) . However, this type of series is typically outside the recoverable region. We aremore interested in the lower bound for those instances with much smaller (cid:107) ∆ (cid:107) . For a certaintype of series (which satisﬁes assumptions (9) and (10) below), we can obtain a tighter lowerbound by Fano’s method as follows: Theorem 2 (Lower bound on (cid:96) estimation error) . If there exist (cid:101) C > and < m < M ≤ such that s ( t ) ∈ [ mt, M t ] , t = 1 , . . . , T , (9) s ( t ) δ ( t ) ≤ (cid:101) C , t = 1 , . . . , T , (10)then for any C ∈ (0 , and for any estimator (cid:101) β T , we have sup β ∈ Θ T pr (cid:16) (cid:107) (cid:101) β T − β (cid:107) ≥ C (cid:17) ≥ − C , (11)and C ≥

12 exp (cid:40) − C + C (cid:101) C M + C (cid:101) C M + (log 2) /TC m (cid:41) , (12)where C , C and C are some positive constants only dependent on δ s .Under assumptions (9) and (10), the naive lower bound will be of order O (1 / √ T ) andtherefore the lower bound C (constant order) will be tighter. We denote it by C lower . Besides,(8) ensures the upper bound will be at most (cid:112) vol( S ) /π plus a O (1 / √ T ) term. Thus, wecan also obtain a constant order upper bound C upper . In this special case, the (cid:96) estimationerror will stay within a constant order interval (cid:107) (cid:98) β T − β (cid:107) ∈ [ C lower , C upper ] with constantprobability for t = 1 , . . . , T . We illustrate this constant order interval in the region shadedin green in Figure 3.Figure 3 shows that the upper bound stays close to Fano’s lower bound on a ﬁnite timehorizon, which demonstrates the near-optimality of the proposed method. Besides, the greenregion in Figure 3 illustrates the constant probability estimation error trajectories for thoseinstances within the red dashed region in Figure 2.8 Time horizon t

Lower bound (Naive)Upper boundLower bound (Fano’s) T AAAB+HicbVDLSsNAFL3xWeujUZduBovgqiRV0GXBjcsKfUEbwmQyaYdOJmFmItTQL3HjQhG3foo7/8ZJm4W2HhjmcM69zJkTpJwp7Tjf1sbm1vbObmWvun9weFSzj096KskkoV2S8EQOAqwoZ4J2NdOcDlJJcRxw2g+md4Xff6RSsUR09CylXozHgkWMYG0k366NgoSHahabC3V8x7frTsNZAK0TtyR1KNH27a9RmJAspkITjpUauk6qvRxLzQin8+ooUzTFZIrHdGiowDFVXr4IPkcXRglRlEhzhEYL9fdGjmNVZDOTMdYTteoV4n/eMNPRrZczkWaaCrJ8KMo40gkqWkAhk5RoPjMEE8lMVkQmWGKiTVdVU4K7+uV10ms23KtG8+G63mqWdVTgDM7hEly4gRbcQxu6QCCDZ3iFN+vJerHerY/l6IZV7pzCH1ifP0vVkss= l og k ˆ T k AAACFXicbVDLSgMxFM3UV62vqks3wSK40DJTBbssuHFZoS/oDEMmTdvQzGRI7ghl2p9w46+4caGIW8Gdf2PazkJbD4QczrmXe+8JYsE12Pa3lVtb39jcym8Xdnb39g+Kh0ctLRNFWZNKIVUnIJoJHrEmcBCsEytGwkCwdjC6nfntB6Y0l1EDxjHzQjKIeJ9TAkbyixduIEVPj0Pzpa6QA+xO3CEB7AYMiN/AlwvmTvzK1C+W7LI9B14lTkZKKEPdL365PUmTkEVABdG669gxeClRwKlg04KbaBYTOiID1jU0IiHTXjq/aorPjNLDfanMiwDP1d8dKQn1bHFTGRIY6mVvJv7ndRPoV72UR3ECLKKLQf1EYJB4FhHuccUoiLEhhCpudsV0SBShYIIsmBCc5ZNXSatSdq7KlfvrUq2axZFHJ+gUnSMH3aAaukN11EQUPaJn9IrerCfrxXq3PhalOSvrOUZ/YH3+AA96nrA= log C upper AAACAnicbVDLSsNAFJ3UV62vqCtxM1gEVyWpgi4L3bisYB/QhDCZTNqhM5kwMxFKCG78FTcuFHHrV7jzb5y2WWjrgQuHc+7l3nvClFGlHefbqqytb2xuVbdrO7t7+wf24VFPiUxi0sWCCTkIkSKMJqSrqWZkkEqCeMhIP5y0Z37/gUhFRXKvpynxORolNKYYaSMF9okXChblHhMj2A5yT3KYpSmRRRHYdafhzAFXiVuSOijRCewvLxI44yTRmCGlhq6Taj9HUlPMSFHzMkVShCdoRIaGJogT5efzFwp4bpQIxkKaSjScq78ncsSVmvLQdHKkx2rZm4n/ecNMxzd+TpM00yTBi0VxxqAWcJYHjKgkWLOpIQhLam6FeIwkwtqkVjMhuMsvr5Jes+FeNpp3V/VWs4yjCk7BGbgALrgGLXALOqALMHgEz+AVvFlP1ov1bn0sWitWOXMM/sD6/AF3HZdt log C lower AAACAnicbVDLSsNAFJ34rPUVdSVuBovgqiRV0GWhG5cV7AOaECaTSTt0JhNmJkoJwY2/4saFIm79Cnf+jdM2C209cOFwzr3ce0+YMqq043xbK6tr6xubla3q9s7u3r59cNhVIpOYdLBgQvZDpAijCeloqhnpp5IgHjLSC8etqd+7J1JRkdzpSUp8joYJjSlG2kiBfeyFgkW5x8QQtoLckxwy8UBkUQR2zak7M8Bl4pakBkq0A/vLiwTOOEk0Zkipgeuk2s+R1BQzUlS9TJEU4TEakoGhCeJE+fnshQKeGSWCsZCmEg1n6u+JHHGlJjw0nRzpkVr0puJ/3iDT8bWf0yTNNEnwfFGcMagFnOYBIyoJ1mxiCMKSmlshHiGJsDapVU0I7uLLy6TbqLsX9cbtZa3ZKOOogBNwCs6BC65AE9yANugADB7BM3gFb9aT9WK9Wx/z1hWrnDkCf2B9/gBycpdq Figure 3: Illustration of the naive lower bound the bound derived by Fano’s method for (cid:96) error under assumptions (9) and (10). The trajectories of (cid:96) estimation error for thoseinstances covered by the red dashed lines in Figure 2 stay within the region shaded in green. We now present the proof idea for the main theorems. Detailed proofs for Theorems 1 and 2as well as Propositions 1 and 2 are deferred to Appendix C.The proof of Theorem 1 is largely based on

Restricted Eigenvalue condition and

VariationalInequality . Consider the penalized form of (7) (cid:98) β T = arg min β ∈ R T +1 T (cid:107) x T − X β (cid:107) + λ (cid:107) ∆ (cid:107) , (13)where λ is the tuning parameter. By Lagrangian duality, we can show that (13) is equivalentto (7). It is known that [Wainwright, 2019] there is a one-to-one correspondence between δ and λ : if (cid:98) β T = (cid:98) β T ( λ ) minimizes (13), then it also minimizes (7) with δ = (cid:107) (cid:98) ∆ (cid:107) .The formulation (13) links our problem to high-dimensional lasso [Wainwright, 2019].This connection motivates us to invoke restricted eigenvalue condition due to Bickel et al.[2009], Van De Geer and Bühlmann [2009] for the design matrix to bound (cid:96) estimation error,since the restricted eigenvalue condition is the weakest known suﬃcient condition accordingto Raskutti et al. [2010]. Although there have been works verifying restricted eigenvalueconditions for x T − [Loh and Wainwright, 2011, Basu and Michailidis, 2015, Wu and Wu,2016] or L [Harchaoui and Lévy-Leduc, 2010], they cannot be directly applied for our settinghere. Speciﬁcally, expanding x T − with a square matrix L in the design matrix leads tothe rank-deﬁciency of X T X , thus simply exploring the structure of x T − cannot address theproblem.Partition the index set { , . . . , T + 1 } into three disjoint parts I i ( i = 1 , , , where I = { , } , I is the indices for non-zero ∆ i ’s and I is the indices for zeros in β . By using anindex set I as the subscript of a vector, we keep all entries with indices from I intact and zeroout entries with indices from its complement I c . Since the (cid:96) constraint does not encouragesparsity on α and µ , we modify the deﬁnition of the restricted eigenvalues as follows φ min ( u ) = min e ∈ R (cid:107) X e (cid:107) √ T (cid:107) e (cid:107) , φ max ( u ) = max e ∈ R (cid:107) X e (cid:107) √ T (cid:107) e (cid:107) , (14)where R = { e : 2 = (cid:107) e I (cid:107) ≤ (cid:107) e (cid:107) ≤ u } and R = { e : 1 ≤ (cid:107) e (cid:107) ≤ u, (cid:107) e I (cid:107) = 0 } .9 emark 1. The smallest restricted eigenvalue can be understood as follows: take columns of X indexed by 1, 2 and another u − indices from I c to form a new matrix (cid:101) X , then φ min ( u ) isthe smallest one among eigenvalues of all possible (cid:101) X T (cid:101) X /T ’s. Similarly, φ max ( u ) is the largesteigenvalue of (cid:101) X T (cid:101) X /T , where (cid:101) X is composed of u columns of X with indices chosen from I c .In the following analysis, we use a recently developed technique based on variationalinequality [Juditsky and Nemirovski, 2019, Juditsky et al., 2020] to establish the upper bound.Consider the gradient ﬁeld of the objective function in (7): F x T ( z ) = (cid:0) A [ x T ] z − a [ x T ] (cid:1) /T, where A [ x T ] = X T X and a [ x T ] = X T ( (cid:80) Ti =1 x i x i − , x T T ) T . This vector ﬁled is aﬃne andmonotone, since we can verify the symmetric matrix A [ x T ] is positive semi-deﬁnite. Theminimizer of (7), i.e. (cid:98) β T , is in fact the solution to the following variational inequality:ﬁnd z ∈ X : (cid:104) F x T ( w ) , w − z (cid:105) ≥ , ∀ w ∈ X . VI [ F x T , X ] Moreover, β is zero of (cid:101) F x T ( z ) and solution to the following variational inequality:ﬁnd z ∈ X : (cid:104) (cid:101) F x T ( w ) , w − z (cid:105) ≥ , ∀ w ∈ X , VI [ (cid:101) F x T , X ] where (cid:101) F x T ( z ) = (cid:0) A [ x T ] z − A [ x T ] β (cid:1) /T. We can see that F x T ( z ) and (cid:101) F x T ( z ) only diﬀer in the following constant term: η = F x T ( β ) − (cid:101) F x T ( β ) = F x T ( β ) = ( A [ x T ] β − a [ x T ]) /T. Intuitively, the diﬀerence between VI [ F x T , X ] and VI [ (cid:101) F x T , X ] should reﬂect the diﬀerencebetween the solutions to those two variational inequalities, i.e. our estimator (cid:98) β T and theground truth β . We will show how to bound the (cid:96) estimation error using (cid:107) η (cid:107) ∞ in thefollowing theorem.The following proposition establishes the error bound for the auto-correlation coeﬃcientand the initial dynamic coeﬃcient, combined. Proposition 1 (Upper Bound on (cid:96) estimation error for α and µ ) . For (cid:98) β T deﬁned by(7), for A , A , A and tuning parameter δ in Theorem 1, there exists k ∈ (0 , such that kλ = O (cid:0) (log T ) / /T / (cid:1) , with probability at least − (2 T ) − A − (2 T ) − A /A − T ) − A /A ,we have (cid:112) ( (cid:98) α − α ) + ( (cid:98) µ − µ ) ≤ max (cid:40) √ κ (cid:18) (cid:107) η (cid:107) ∞ − k + κC δ,δ ,s (cid:19) , δ + sδ (cid:41) , (15)where δ is the magnitude for one-step changes of dynamic background in (3) and κ = (cid:112) φ min (2) − k (cid:112) T − − k ) √ T , C δ,δ ,s = 2 sδ √ T − − k ) √ T + (cid:114) ( s − T − s + 1) T ( sδ + δ ) . Here, φ min ( · ) is deﬁned in (14). Moreover, we have (cid:107) η (cid:107) ∞ ≤ C (log T ) / /T / , where C = C ( A , A , A ) is a positive constant.10ext, we establish the lower bound of the (cid:96) estimation error using Fano’s method. Proposition 2 (Lower bound on (cid:96) estimation error) . For any estimator (cid:101) β T and constant C > , we have sup β ∈ Θ T pr (cid:16) (cid:107) (cid:101) β T − β (cid:107) ≥ C (cid:17) ≥ − C T + C δ ( T ) (cid:80) Tt =2 s ( t ) + C δ ( T ) (cid:80) Tt =2 s ( t ) + log 2 s ( T ) log(1 / C ) , where C , C and C are positive constants only dependent on δ s .We can show that Theorem 2 follows from the above proposition. The key steps in provingthis above proposition are to (i) ﬁnd a large enough ε -packing of Θ T and (ii) upper boundthe Kullback–Leibler (KL) divergence over this packing. ar ( p ) sequences So far we have been focused on analysis for ar (1) sequences; now we discuss how to extendto general cases. For the ar ( p ) case, we need to change several terms in (5) (deﬁnedby ar (1) ): the design matrix becomes X = ( x T − , . . . , x − p +1: T − p , L ) ∈ R T × ( T + p ) , where L ∈ R T × T remains the lower triangular matrix of ones; the coeﬃcient vector becomes β = ( α T p , µ, ∆ , . . . , ∆ T ) T , where α p = ( α , . . . , α p ) T . We can solve a similar convexoptimization problem as that deﬁned in (7) to estimate the parameters, except that thehypothesis class X is deﬁned diﬀerently X = { β : ( α T p , µ ) ∈ S p , (cid:107) ∆ (cid:107) < δ } , where S p = { ( α T p , µ ) : (cid:107) α p (cid:107) + µ ≤ δ p +1 s } . Moreover, we will redeﬁne I = { , . . . , p + 1 } , whilethe deﬁnitions for I and I remain the same as deﬁned in Section 3.5. The restrictedeigenvalues are also deﬁned as (14), except that the error e are restricted to be in R = { e : p + 1 = (cid:107) e I (cid:107) ≤ (cid:107) e (cid:107) ≤ u } when calculating φ min ( u ) . With these deﬁnitions, we can showthe following upper bound for the (cid:96) recovery error: Theorem 3 (Upper Bound on (cid:96) estimation error for ar ( p ) case) . For (cid:98) β T deﬁned by (7) andfor all A > , A > √ A and A > , for any selected tuning parameter δ , with probabilityat least − (2 T ) − A − (2 T ) − A /A − p (2 T ) − A /A , we have (cid:107) (cid:98) β T − β (cid:107) ≤ min (cid:40) (cid:101) C √ s max { sδ , δ } , (cid:115) Γ (cid:18) p + 32 (cid:19) vol( S p ) /π p +12 (cid:41) + δ + √ sδ , (16)where (cid:101) C is a positive constant dependent on A , A and A and Γ( · ) is the gamma function.Since the expression (3) for the upper bound for ar ( p ) case is similar to that in Theorem1, the discussion on ε -recoverable region, which is solely determined by the upper bound ofestimation error, will be similar too. For lower bounding the estimation error via Fano’smethod, we can use similar proof strategy as that in Proposition 2 or Lemma 6 (althoughthe details are more tedious to specify): (i) express x t with respect to β, ε t ; (ii) derive thejoint distribution of x T based on that expression and (iii) bound the KL divergence.11 Numerical experiments

In this section, we perform comprehensive numerical simulations to (i) show that our proposedmethod works well in practice; (ii) validate our theoretical ﬁndings regarding algorithmperformance; (iii) compare with existing methods; (iv) demonstrate the good performance ofthe two proposed bootstrap methods. Recall that our work’s primary focus is to estimate theautoregressive coeﬃcients. Thus, we will focus on this in the following four experiments.

Experiment 1.

First, we show that our proposed estimation method can accuratelyrecover α from non-stationary ar (1) time series under various settings: α ∈ { . , . } , σ ∈ { . , . } δ ∈ { . , . } and T = 5000 . The dynamic background is generated by f i = (cid:80) ik =1 δ ( U k − . , i = 1 , . . . , T , where U k ∈ [0 , , k = 1 , . . . , T is a sequence of i.i.d.uniform random numbers. As discussed above, the accuracy depends on both s and δ . Here,we consider an extreme case: s = T + 1 (which is supposed to be the most challengingcase). Moreover, we also present the results with δ selected by Durbin-Watson test [Durbinand Watson, 1992] as an alternative. (Details on the Durbin-Watson test can be found inSection B.1 in the Appendix B.) The convex program (2) is solved by the cvx package [Grantand Boyd, 2014] and we tune the hyperparameter δ by Ljung-Box test and Durbin-Watsontest, respectively. We repeat the experiment 20 times for each setting, and plot the meansquare error of (cid:98) α , p -value of Ljung-Box test and Durbin-Watson test with diﬀerent δ ’s inFigure 4.The results in Figure 4 show that (cid:98) α decreases when δ increases, and there is a speciﬁcvalue of δ that leads to the smallest mean square error for estimating α . In the ﬁgure, thered dots in the ﬁrst two rows correspond to the best achievable δ ’s in mean square error. Inthe last two rows, those red dots indicate the δ ’s selected by our proposed tuning procedure.Thus, we can observe that (i) the best achievable δ ’s regarding the accuracy and mean squareerror are roughly the same; (ii) our proposed tuning procedure based on both the Ljung-Boxtest and Durbin-Watson test can select the best δ .Table 1: Summary of the information of red dots in Figure 4. Average (standard deviation) mean square error

Setting ( α , δ , σ ) ε -optimal Ljung-Box Durbin-Watson ε -optimal Ljung-Box Durbin-Watson(0.05, 0.05, 0.10) 4.13 × − (2.33 × − ) 4.13 × − (2.33 × − ) 4.13 × − (2.33 × − ) 6.19 × − × − × − (0.05, 0.05, 0.20) 3.12 × − (1.60 × − ) 3.12 × − (1.60 × − ) 3.12 × − (1.60 × − ) 6.09 × − × − × − (0.05, 0.10, 0.10) 3.91 × − (2.03 × − ) 3.91 × − (2.03 × − ) 5.04 × − (2.60 × − ) 5.33 × − × − × − (0.05, 0.10, 0.20) 3.69 × − (2.02 × − ) 3.69 × − (2.02 × − ) 4.64 × − (2.44 × − ) 5.81 × − × − × − (0.10, 0.05, 0.10) 8.47 × − (2.02 × − ) 8.47 × − (2.02 × − ) 8.47 × − (2.02 × − ) 6.42 × − × − × − (0.10, 0.05, 0.20) 8.01 × − (1.65 × − ) 8.01 × − (1.65 × − ) 8.01 × − (1.65 × − ) 6.68 × − × − × − (0.10, 0.10, 0.10) 9.21 × − (2.66 × − ) 8.14 × − (2.41 × − ) 8.14 × − (2.41 × − ) 7.68 × − × − × − (0.10, 0.10, 0.20) 7.83 × − (2.64 × − ) 8.64 × − (3.21 × − ) 8.64 × − (3.21 × − ) 1.17 × − × − × − Table 1 summarizes the optimal and the selected δ ’s (corresponding to the red dots) inFigure 4: (i) the average and the standard deviation of (cid:98) α obtained by ε -optimal (in thesense of accuracy) δ , δ selected by Ljung-Box test and Durbin-Watson test and (ii) meansquare error of (cid:98) α obtained by ε -optimal (in the sense of mean square error) δ , δ selected byLjung-Box test and Durbin-Watson test. 12 = 0.1, = 0.10 0.005 0.01 0.015 0.02-8-7-6-5-4 l og M SE p va l u e p va l u e = 0.05, = 0.050 0.005 0.01 0.015 0.02-8-7-6-5-4 l og M SE p va l u e p va l u e = 0.1, = 0.050 0.005 0.01 0.015 0.02-8-7-6-5-4 l og M SE p va l u e p va l u e = 0.05, = 0.10 0.005 0.01 0.015 0.02-8-7-6-5-4 l og M SE p va l u e p va l u e ˆ ↵ AAACA3icbVDLSsNAFJ34rPUVdaebwSK4KkkVdFlw47KCfUATws1k0gydPJiZCCUU3Pgrblwo4tafcOffOGmz0NYDwxzOuZd77/EzzqSyrG9jZXVtfWOztlXf3tnd2zcPDnsyzQWhXZLyVAx8kJSzhHYVU5wOMkEh9jnt++Ob0u8/UCFZmtyrSUbdGEYJCxkBpSXPPHb8lAdyEuuvcCJQ2AGeReDZU89sWE1rBrxM7Io0UIWOZ345QUrymCaKcJByaFuZcgsQihFOp3UnlzQDMoYRHWqaQEylW8xumOIzrQQ4TIV+icIz9XdHAbEs19SVMahILnql+J83zFV47RYsyXJFEzIfFOYcqxSXgeCACUoUn2gCRDC9KyYRCCBKx1bXIdiLJy+TXqtpXzRbd5eNdquKo4ZO0Ck6Rza6Qm10izqoiwh6RM/oFb0ZT8aL8W58zEtXjKrnCP2B8fkDQJOX2Q== ↵ = 0 . , = 0 . AAACGXicbVDLSsNAFJ34rPUVdelmsAgupCRV0Y1QcOOygn1AE8LNZNIOnTyYmQgl9Dfc+CtuXCjiUlf+jZM2C209MNzDOfcy9x4/5Uwqy/o2lpZXVtfWKxvVza3tnV1zb78jk0wQ2iYJT0TPB0k5i2lbMcVpLxUUIp/Trj+6KfzuAxWSJfG9GqfUjWAQs5ARUFryTMvxEx7IcaRL7gBPh+DZ+BpbdeviFDsYOwHlCjyr1CaeWdN1CrxI7JLUUImWZ346QUKyiMaKcJCyb1upcnMQihFOJ1UnkzQFMoIB7WsaQ0Slm08vm+BjrQQ4TIR+scJT9fdEDpEsltedEaihnPcK8T+vn6nwys1ZnGaKxmT2UZhxrBJcxIQDJihRfKwJEMH0rpgMQQBROsyqDsGeP3mRdBp1+6zeuDuvNa0yjgo6REfoBNnoEjXRLWqhNiLoET2jV/RmPBkvxrvxMWtdMsqZA/QHxtcPfpGdVw== ↵ = 0 . , = 0 . AAACGHicbVDLSsNAFJ3UV62vqks3g0VwITWpim6EghuXFewDmhBuJpN26OTBzEQooZ/hxl9x40IRt935N07TLLT1wHAP59zL3Hu8hDOpTPPbKK2srq1vlDcrW9s7u3vV/YOOjFNBaJvEPBY9DyTlLKJtxRSnvURQCD1Ou97obuZ3n6iQLI4e1TihTgiDiAWMgNKSWz23vZj7chzqktnAkyG4Fr7FZt06wzbGtk+5AtfMJfNq4lZruubAy8QqSA0VaLnVqe3HJA1ppAgHKfuWmSgnA6EY4XRSsVNJEyAjGNC+phGEVDpZftgEn2jFx0Es9IsUztXfExmEcra77gxBDeWiNxP/8/qpCm6cjEVJqmhE5h8FKccqxrOUsM8EJYqPNQEimN4VkyEIIEpnWdEhWIsnL5NOo25d1BsPl7WmWcRRRkfoGJ0iC12jJrpHLdRGBD2jV/SOPowX4834NL7mrSWjmDlEf2BMfwD9BJ0Z ↵ = 0 . , = 0 . AAACF3icbVDLSsNAFJ3UV62vqEs3g0VwISGpgm6EghuXFewDmhBuJtN26OTBzEQooX/hxl9x40IRt7rzb5y0WWjrgWEO59zLvfcEKWdS2fa3UVlZXVvfqG7WtrZ3dvfM/YOOTDJBaJskPBG9ACTlLKZtxRSnvVRQiAJOu8H4pvC7D1RIlsT3apJSL4JhzAaMgNKSb1pukPBQTiL95S7wdAS+g6+xbTln2MXYDSlX4NtzaeqbdduyZ8DLxClJHZVo+eaXGyYki2isCAcp+46dKi8HoRjhdFpzM0lTIGMY0r6mMURUevnsrik+0UqIB4nQL1Z4pv7uyCGSxeq6MgI1koteIf7n9TM1uPJyFqeZojGZDxpkHKsEFyHhkAlKFJ9oAkQwvSsmIxBAlI6ypkNwFk9eJp2G5ZxbjbuLetMu46iiI3SMTpGDLlET3aIWaiOCHtEzekVvxpPxYrwbH/PSilH2HKI/MD5/AHqvnNs= ↵ = 0 . , = 0 . AAACGHicbVDLSsNAFJ3UV62vqks3g0VwITWpim6EghuXFewDmhBuJpN26OTBzEQooZ/hxl9x40IRt935N07TLLT1wDCHc+7l3nu8hDOpTPPbKK2srq1vlDcrW9s7u3vV/YOOjFNBaJvEPBY9DyTlLKJtxRSnvURQCD1Ou97obuZ3n6iQLI4e1TihTgiDiAWMgNKSWz23vZj7chzqL7OBJ0NwLXyLzbp5dYZtjG2fcgWumWvWxK3WtJUDLxOrIDVUoOVWp7YfkzSkkSIcpOxbZqKcDIRihNNJxU4lTYCMYED7mkYQUulk+WETfKIVHwex0C9SOFd/d2QQytnuujIENZSL3kz8z+unKrhxMhYlqaIRmQ8KUo5VjGcpYZ8JShQfawJEML0rJkMQQJTOsqJDsBZPXiadRt26qDceLmtNs4ijjI7QMTpFFrpGTXSPWqiNCHpGr+gdfRgvxpvxaXzNS0tG0XOI/sCY/gD7950Z l og M S E AAACCHicbVDLSsNAFJ34rPVVdenCwSK4KkkVdFkQwY1Q0T6gCWUymbRDJ5kwcyOW0KUbf8WNC0Xc+gnu/BsnbRfaemCYwzn3cu89fiK4Btv+thYWl5ZXVgtrxfWNza3t0s5uU8tUUdagUkjV9olmgsesARwEayeKkcgXrOUPLnK/dc+U5jK+g2HCvIj0Yh5ySsBI3dKB60sR6GFkvswVsjfCLrAH8MMMX99ejrqlsl2xx8DzxJmSMpqi3i19uYGkacRioIJo3XHsBLyMKOBUsFHRTTVLCB2QHusYGpOIaS8bHzLCR0YJcCiVeTHgsfq7IyORznc1lRGBvp71cvE/r5NCeO5lPE5SYDGdDApTgUHiPBUccMUoiKEhhCpudsW0TxShYLIrmhCc2ZPnSbNacU4q1ZvTcs2exlFA++gQHSMHnaEaukJ11EAUPaJn9IrerCfrxXq3PialC9a0Zw/9gfX5AzGgmgI= AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwVZIq6LLgxmUF+4AmlMlk0g6dTMLMjVBD8VfcuFDErf/hzr9x0mahrQeGOZxzL3PmBKngGhzn21pZXVvf2KxsVbd3dvf27YPDjk4yRVmbJiJRvYBoJrhkbeAgWC9VjMSBYN1gfFP43QemNE/kPUxS5sdkKHnEKQEjDexjL0hEqCexuXIvZALIFA/smlN3ZsDLxC1JDZVoDewvL0xoFjMJVBCt+66Tgp8TBZwKNq16mWYpoWMyZH1DJYmZ9vNZ+ik+M0qIo0SZIwHP1N8bOYl1EdBMxgRGetErxP+8fgbRtZ9zmWbAJJ0/FGUCQ4KLKnDIFaMgJoYQqrjJiumIKELBFFY1JbiLX14mnUbdvag37i5rTaeso4JO0Ck6Ry66Qk10i1qojSh6RM/oFb1ZT9aL9W59zEdXrHLnCP2B9fkD7AKVdg== p - v a l u e ( LB ) AAACFXicbVDLSgMxFM3Ud31VXboJFqGClhkVdCm6ceFCwT6gLSWT3mlDMw+SO2IZ5ifc+CtuXCjiVnDn35gZu9DWAyEn59xL7j1uJIVG2/6yCjOzc/MLi0vF5ZXVtfXSxmZdh7HiUOOhDFXTZRqkCKCGAiU0IwXMdyU03OFF5jfuQGkRBrc4iqDjs34gPMEZGqlb2m+7oezpkW+uJEppG+EeXS85uGMyBlrJ34jJ1Xm6l3ZLZbtq56DTxBmTMhnjulv6bPdCHvsQIJdM65ZjR9hJmELBJaTFdqwhYnzI+tAyNGA+6E6Sb5XSXaP0qBcqcwKkufq7I2G+zgY3lT7DgZ70MvE/rxWjd9pJRBDFCAH/+ciLJcWQZhHRnlDAUY4MYVwJMyvlA6YYRxNk0YTgTK48TeqHVeeoenhzXD6zx3Eskm2yQyrEISfkjFySa1IjnDyQJ/JCXq1H69l6s95/SgvWuGeL/IH18Q2MCZ+b p - v a l u e ( DW ) AAACFXicbVDLSgMxFM3UV62vqks3wSJU0DJTBV0WdOGygn1AW0omvdOGZh4kd4plmJ9w46+4caGIW8Gdf2P6WGj1QMjJOfeSe48bSaHRtr+szNLyyupadj23sbm1vZPf3avrMFYcajyUoWq6TIMUAdRQoIRmpID5roSGO7ya+I0RKC3C4A7HEXR81g+EJzhDI3XzJ203lD099s2VRCltI9yj6yWnIyZjoMXpGzG5bqTHaTdfsEv2FPQvceakQOaodvOf7V7IYx8C5JJp3XLsCDsJUyi4hDTXjjVEjA9ZH1qGBswH3UmmW6X0yCg96oXKnADpVP3ZkTBfTwY3lT7DgV70JuJ/XitG77KTiCCKEQI++8iLJcWQTiKiPaGAoxwbwrgSZlbKB0wxjibInAnBWVz5L6mXS85ZqXx7XqjY8ziy5IAckiJxyAWpkBtSJTXCyQN5Ii/k1Xq0nq03631WmrHmPfvkF6yPb5/cn6g= = 0 . AAACBnicbVDLSsNAFJ3UV62vqEsRBovQVUiqoBuh4MZlBfuAJobJZNIOnWTCzEQooSs3/oobF4q49Rvc+TdO2iy09cAwh3Pu5d57gpRRqWz726isrK6tb1Q3a1vbO7t75v5BV/JMYNLBnHHRD5AkjCako6hipJ8KguKAkV4wvi783gMRkvLkTk1S4sVomNCIYqS05JvHbsBZKCex/nJX0mGMfPu+Ca+gbTlT36zblj0DXCZOSeqgRNs3v9yQ4ywmicIMSTlw7FR5ORKKYkamNTeTJEV4jIZkoGmCYiK9fHbGFJ5qJYQRF/olCs7U3x05imWxqa6MkRrJRa8Q//MGmYouvZwmaaZIgueDooxBxWGRCQypIFixiSYIC6p3hXiEBMJKJ1fTITiLJy+TbtNyzqzm7Xm91SjjqIIjcAIawAEXoAVuQBt0AAaP4Bm8gjfjyXgx3o2PeWnFKHsOwR8Ynz+sQ5fa = 0 . AAACBnicbVDLSsNAFJ3UV62vqEsRBovQVUiqoBuh4MZlBfuAJobJZNIOnWTCzEQooSs3/oobF4q49Rvc+TdO2iy09cAwh3Pu5d57gpRRqWz726isrK6tb1Q3a1vbO7t75v5BV/JMYNLBnHHRD5AkjCako6hipJ8KguKAkV4wvi783gMRkvLkTk1S4sVomNCIYqS05JvHbsBZKCex/nJX0mGMfPu+Ca+gbTWnvlm3LXsGuEycktRBibZvfrkhx1lMEoUZknLg2KnyciQUxYxMa24mSYrwGA3JQNMExUR6+eyMKTzVSggjLvRLFJypvztyFMtiU10ZIzWSi14h/ucNMhVdejlN0kyRBM8HRRmDisMiExhSQbBiE00QFlTvCvEICYSVTq6mQ3AWT14m3ablnFnN2/N6q1HGUQVH4AQ0gAMuQAvcgDboAAwewTN4BW/Gk/FivBsf89KKUfYcgj8wPn8ArciX2w== Figure 4: Performance of our proposed method when δ increases. The experimental settingis on the top of each column. The red dashed line denotes the ground truth α = 0 . . Notethat the δ selected by our proposed tuning procedure (which leads to the maximum p -valueof Ljung-Box or Durbin-Watson test) gives the best estimate (cid:98) α .13 xperiment 2. Next, we validate our theoretical ﬁndings for ar (1) case. The dynamicbackground is generated in the same way as the previous example. Besides, Figure 4 showsthat the p -value with respect to δ is unimodal, which enables us to use the Golden-sectionsearch (tolerance ε = 0 . ) to tune δ eﬃciently. Details on the Golden-section search and thismodiﬁed tuning procedure can be found in Appendix B.2. We also show how the estimate (cid:98) α behaves with changing s , by setting α = 0 . , σ = 0 . , δ = 0 . and repeating thesame estimation procedure 20 times for each s ∈ { , , . . . , } . The mean and standarddeviation of (cid:98) α over 20 trials with respect to s in an errorbar plot are plotted Figure 5. = 0.1, = 0.1, = 0.1, = 1e-05 M e a n ( S t d ) o f o v e r t r i a l s ˆ ↵ AAACA3icbVDLSsNAFJ34rPUVdaebwSK4KkkVdFlw47KCfUATws1k0gydPJiZCCUU3Pgrblwo4tafcOffOGmz0NYDwxzOuZd77/EzzqSyrG9jZXVtfWOztlXf3tnd2zcPDnsyzQWhXZLyVAx8kJSzhHYVU5wOMkEh9jnt++Ob0u8/UCFZmtyrSUbdGEYJCxkBpSXPPHb8lAdyEuuvcCJQ2AGeReDZU89sWE1rBrxM7Io0UIWOZ345QUrymCaKcJByaFuZcgsQihFOp3UnlzQDMoYRHWqaQEylW8xumOIzrQQ4TIV+icIz9XdHAbEs19SVMahILnql+J83zFV47RYsyXJFEzIfFOYcqxSXgeCACUoUn2gCRDC9KyYRCCBKx1bXIdiLJy+TXqtpXzRbd5eNdquKo4ZO0Ck6Rza6Qm10izqoiwh6RM/oFb0ZT8aL8W58zEtXjKrnCP2B8fkDQJOX2Q== s AAAB9XicbVDNSgMxGPy2/tX6V/XoJVgET2W3CvZY8OKxgm2Fdi3ZbLYNzSZLklXK0vfw4kERr76LN9/GbLsHbR0IGWa+j0wmSDjTxnW/ndLa+sbmVnm7srO7t39QPTzqapkqQjtEcqnuA6wpZ4J2DDOc3ieK4jjgtBdMrnO/90iVZlLcmWlC/RiPBIsYwcZKD4NA8lBPY3tlejas1ty6OwdaJV5BalCgPax+DUJJ0pgKQzjWuu+5ifEzrAwjnM4qg1TTBJMJHtG+pQLHVPvZPPUMnVklRJFU9giD5urvjQzHOo9mJ2NsxnrZy8X/vH5qoqafMZGkhgqyeChKOTIS5RWgkClKDJ9agoliNisiY6wwMbaoii3BW/7yKuk26t5FvXF7WWs1izrKcAKncA4eXEELbqANHSCg4Ble4c15cl6cd+djMVpyip1j+APn8wdJ/5L+ s AAAB+XicbVDLSgMxFL3js9bXqEs3wSK4KjNV0GXBjcsK9gHtMGQyaRuaSYYkUyhD/8SNC0Xc+ifu/Bsz7Sy09UDI4Zx7ycmJUs608bxvZ2Nza3tnt7JX3T84PDp2T047WmaK0DaRXKpehDXlTNC2YYbTXqooTiJOu9HkvvC7U6o0k+LJzFIaJHgk2JARbKwUuu4gkjzWs8ReuQ69eejWvLq3AFonfklqUKIVul+DWJIsocIQjrXu+15qghwrwwin8+og0zTFZIJHtG+pwAnVQb5IPkeXVonRUCp7hEEL9fdGjhNdhLOTCTZjveoV4n9ePzPDuyBnIs0MFWT50DDjyEhU1IBipigxfGYJJorZrIiMscLE2LKqtgR/9cvrpNOo+9f1xuNNrdko66jAOVzAFfhwC014gBa0gcAUnuEV3pzceXHenY/l6IZT7pzBHzifP+6+k8w= Figure 5: Algorithmic behavior with respect to s . The red dashed horizontal line is theground truth α = 0 . . The estimate starts to deteriorate when s exceeds s ≈ andbecomes worse with larger s .The results in Figure 5 show that indeed when s and δ are inside the recoverable region,the estimation error is small, and it will grow with an increasing s . Moreover, the errorremains small for relatively small s , but once s exceeds s the error starts to increase; that’swhen the non-stationary series is not in the ε -recoverable region. This observation agrees withour non-asymptotic bounds on estimation error. Moreover, we conduct similar experimentsfor ar (2) case to validate these ﬁndings for a more general case; the results can be found inAppendix D. Experiment 3.

We compare our method with the method in Zhang et al. [2020]. Inthe following, we refer to their method as the “ (cid:96) variant,” since it is obtained by solvingthe convex program with same objective function as (2) except for a diﬀerent constraint: (cid:80) T − i =1 ( f i +1 − f i ) < δ. Again, δ ≥ is the tuning parameter. We should mention Zhang et al.[2020] did not have a systematic way to tune δ and here we enhance their method by addingour statistical test based hyperparameter tuning as well.The piecewise linear dynamic background is generated by f i = (cid:80) ik =1 δ ( U k − . , i =1 , . . . , T . Here U k = u i for all k ∈ { k i , . . . , k i +1 } , i = 0 , . . . , s − , where k < k < · · ·

200 400 600 800 1000 1200 1400 1600 1800 2000-10-8-6-4-2024 = 0.86595, = 46.9253, p = 0, = 0.000965560 200 400 600 800 1000 1200 1400 1600 1800 2000-10-8-6-4-202 0 200 400 600 800 1000 1200 1400 1600 1800 2000-10-8-6-4-2024 = 0.11667, = 46.9253, p = 6.34e-11, = 0.000965560 200 400 600 800 1000 1200 1400 1600 1800 2000-10-8-6-4-202 T TT T = 1 . , ˆ ↵ = 0 . AAACHXicbVDLSgMxFM34rPVVdekmWAQXMszYUutCKLhxWcFaoTOUO5m0DWYeJHeEMvRH3Pgrblwo4sKN+DemdRa+DoQczrmX5JwglUKj43xYc/MLi0vLpZXy6tr6xmZla/tKJ5livMMSmajrADSXIuYdFCj5dao4RIHk3eDmbOp3b7nSIokvcZxyP4JhLAaCARqpX6l7QSJDPY7MlXshlwj0lLr2Se2QepR6I0DqgUxH0HeN4djNRmPSr1Qd25mB/iVuQaqkQLtfefPChGURj5FJ0LrnOin6OSgUTPJJ2cs0T4HdwJD3DI0h4trPZ+kmdN8oIR0kypwY6Uz9vpFDpKcBzGQEONK/van4n9fLcND0cxGnGfKYfT00yCTFhE6roqFQnKEcGwJMCfNXykaggKEptGxKcH9H/kuujmy3Zh9d1Kutg6KOEtkle+SAuOSYtMg5aZMOYeSOPJAn8mzdW4/Wi/X6NTpnFTs75Aes90+CsJ7o = 1 . , ˆ ↵ = 0 . AAACHXicbVDLSgMxFM34rPVVdekmWAQXMkzagroQBDcuK1gVOqXcyaRtaOZBckcoQ3/Ejb/ixoUiLtyIf2OmdqHWAyGHc+4lOSdIlTToeZ/O3PzC4tJyaaW8ura+sVnZ2r42Saa5aPFEJfo2ACOUjEULJSpxm2oBUaDETTA8L/ybO6GNTOIrHKWiE0E/lj3JAa3UrTT8IFGhGUX2yv1QKAR6Spl7Uj+kPqX+AJD6oNIBdJk1PJexo3G3UvVcbwI6S9iUVMkUzW7l3Q8TnkUiRq7AmDbzUuzkoFFyJcZlPzMiBT6EvmhbGkMkTCefpBvTfauEtJdoe2KkE/XnRg6RKQLYyQhwYP56hfif186wd9zJZZxmKGL+/VAvUxQTWlRFQ6kFRzWyBLiW9q+UD0ADR1to2ZbA/kaeJdc1l9Xd2mWjenYwraNEdskeOSCMHJEzckGapEU4uSeP5Jm8OA/Ok/PqvH2PzjnTnR3yC87HF3Hmnt0= = 0.097746, = 25.1619, p = 0.4161, = 0.000965560 200 400 600 800 1000 1200 1400 1600 1800 2000-0.6-0.4-0.200.20.40.6 0 200 400 600 800 1000 1200 1400 1600 1800 2000-1.5-1-0.500.511.5 = 0.054483, = 25.1619, p = 0.23279, = 3.4442e-050 200 400 600 800 1000 1200 1400 1600 1800 2000-0.6-0.4-0.200.20.40.6 T TT T = 6 . ⇥ , ˆ ↵ = 0 . AAACLHicbVDJSgNBEO1xjXGLevTSGAQPGmai0VyEQC4eFcwCmTHU9HRMY89Cd40QhnyQF39FEA+KePU77ElycHvQ9OPVK6rq+YkUGm37zZqbX1hcWi6sFFfX1jc2S1vbbR2nivEWi2Wsuj5oLkXEWyhQ8m6iOIS+5B3/rpnXO/dcaRFH1zhKuBfCbSQGggEaqV9qun4sAz0KzZe5AZcI9JyeVup16qIIuaaOfZMdVceH1KXuEJC6IJMh9B1jsyt27aQ27pfKhk1A/xJnRspkhst+6dkNYpaGPEImQeueYyfoZaBQMMnHRTfVPAF2B7e8Z2gEZg8vmxw7pvtGCeggVuZFSCfq944MQp3fY5wh4FD/ruXif7VeioO6l4koSZFHbDpokEqKMc2To4FQnKEcGQJMCbMrZUNQwNDkWzQhOL9P/kva1YpzXKlenZQbB7M4CmSX7JED4pAz0iAX5JK0CCMP5Im8kjfr0Xqx3q2PqXXOmvXskB+wPr8AT/ukZQ== = 1 . , ˆ ↵ = 0 . AAACHnicbVDLSgMxFM3UV62vUZdugkXoQspMq9QuhIIblwrWFjql3MmkbWjmQXJHKEO/xI2/4saFIoIr/RvT2oVaD4QczrmX5Bw/kUKj43xauaXlldW1/HphY3Nre8fe3bvVcaoYb7JYxqrtg+ZSRLyJAiVvJ4pD6Eve8kcXU791x5UWcXSD44R3QxhEoi8YoJF69qnnxzLQ49BcmRdwiUDPqVuuV+kx9ag3BKQeyGQIPdcYTtmp12qTnl00bAa6SNw5KZI5rnr2uxfELA15hEyC1h3XSbCbgULBJJ8UvFTzBNgIBrxjaAQh191sFm9Cj4wS0H6szImQztSfGxmEeprATIaAQ/3Xm4r/eZ0U+2fdTERJijxi3w/1U0kxptOuaCAUZyjHhgBTwvyVsiEoYGgaLZgS3L+RF8ltpexWy5Xrk2KjNK8jTw7IISkRl9RIg1ySK9IkjNyTR/JMXqwH68l6td6+R3PWfGef/IL18QUFHJ8l Fitted AR(1) modelFitted backgroundObservationTrue background

TV-LSE in Case 2 AAACC3icbVDLSgMxFM3UV62vqks3oUVwY5mpgi4LRXDhomJf0A4lk2ba0ExmSO6IZejejb/ixoUibv0Bd/6NaTsLbT0Qcjjn3pvc40WCa7Dtbyuzsrq2vpHdzG1t7+zu5fcPmjqMFWUNGopQtT2imeCSNYCDYO1IMRJ4grW8UXXqt+6Z0jyUdRhHzA3IQHKfUwJG6uULXWAP4PnJ7AZI6s3Tm7urCeYSV81cXJ708kW7ZM+Al4mTkiJKUevlv7r9kMYBk0AF0brj2BG4CVHAqWCTXDfWLCJ0RAasY6gkAdNuMttlgo+N0sd+qMyRgGfq746EBFqPA89UBgSGetGbiv95nRj8SzfhMoqBSTp/yI8FhhBPg8F9rhgFMTaEUMXNXzEdEkUomPhyJgRnceVl0iyXnLNS+fa8WLHTOLLoCBXQCXLQBaqga1RDDUTRI3pGr+jNerJerHfrY16asdKeQ/QH1ucPP9Cacw==

TV-LSE in Case 1 AAACC3icbVDLSgMxFM3UV62vqks3oUVwY5mpgi4LRXDhomJf0A4lk2ba0ExmSO6IZejejb/ixoUibv0Bd/6NaTsLbT0Qcjjn3pvc40WCa7Dtbyuzsrq2vpHdzG1t7+zu5fcPmjqMFWUNGopQtT2imeCSNYCDYO1IMRJ4grW8UXXqt+6Z0jyUdRhHzA3IQHKfUwJG6uULXWAP4PnJ7AZI6s3Tm7urCeYSV81c7Ex6+aJdsmfAy8RJSRGlqPXyX91+SOOASaCCaN1x7AjchCjgVLBJrhtrFhE6IgPWMVSSgGk3me0ywcdG6WM/VOZIwDP1d0dCAq3HgWcqAwJDvehNxf+8Tgz+pZtwGcXAJJ0/5McCQ4inweA+V4yCGBtCqOLmr5gOiSIUTHw5E4KzuPIyaZZLzlmpfHterNhpHFl0hAroBDnoAlXQNaqhBqLoET2jV/RmPVkv1rv1MS/NWGnPIfoD6/MHPkuacg== ` variant in Case 1 AAACF3icbVDLSgNBEJz1GeMr6tHLYBA8hd0o6DGQi8cI5gHZEGYnvcmQ2dllpjcYlvyFF3/FiwdFvOrNv3HyOGhiwzBFVTddXUEihUHX/XbW1jc2t7ZzO/ndvf2Dw8LRccPEqeZQ57GMdStgBqRQUEeBElqJBhYFEprBsDrVmyPQRsTqHscJdCLWVyIUnKGluoWSH8SyZ8aR/TIfpOyWJz7CAwZhRkdMC6aQCkWrdgf1Jt1C0S25s6KrwFuAIllUrVv48nsxTyNQyCUzpu25CXYyplFwCZO8nxpIGB+yPrQtVCwC08lmd03ouWV6NIy1fdbGjP09kbHITK3bzojhwCxrU/I/rZ1ieNPJhEpSBMXni8JUUozpNCTaExo4yrEFjGthvVI+YJpxtFHmbQje8smroFEueZel8t1VseIu4siRU3JGLohHrkmF3JIaqRNOHskzeSVvzpPz4rw7H/PWNWcxc0L+lPP5A2iWn/g= ` variant in Case 2 AAACF3icbVDLSgNBEJz1GeMr6tHLYBA8hd0o6DGQi8cI5gHZEGYnvcmQ2dllpjcYlvyFF3/FiwdFvOrNv3HyOGhiwzBFVTddXUEihUHX/XbW1jc2t7ZzO/ndvf2Dw8LRccPEqeZQ57GMdStgBqRQUEeBElqJBhYFEprBsDrVmyPQRsTqHscJdCLWVyIUnKGluoWSH8SyZ8aR/TIfpOyWJz7CAwZhRkdMC6aQCkWrdgctT7qFoltyZ0VXgbcARbKoWrfw5fdinkagkEtmTNtzE+xkTKPgEiZ5PzWQMD5kfWhbqFgEppPN7prQc8v0aBhr+6yNGft7ImORmVq3nRHDgVnWpuR/WjvF8KaTCZWkCIrPF4WppBjTaUi0JzRwlGMLGNfCeqV8wDTjaKPM2xC85ZNXQaNc8i5L5burYsVdxJEjp+SMXBCPXJMKuSU1UiecPJJn8krenCfnxXl3Puata85i5oT8KefzB2obn/k= Figure 6: Comparison of proposed method and the (cid:96) variant in Zhang et al. [2020]. Weinvestigated two cases. In Case 1, the dynamic background oscillates more but a with smallermagnitude, whereas in Case 2, the dynamic background is smoother but has larger one-stepdiﬀerence. Our piecewise constant ﬁtted background better captures the dynamics in Case 1,whereas the (cid:96) variant can better approximate the dynamics in Case 2.region. The (cid:96) variant yields a biased estimate for α , which is probably the reason thatZhang et al. [2020] focus on relatively smooth and structured dynamics.In Case 2, even though the dynamic background is smoother than the previous example,the dynamic background changes drastically (large (cid:107) ∆ (cid:107) ). Thus, in this case, the piecewiseconstant function is a poorer approximation to the dynamic background. This type of sequenceis outside the recoverable region, and our proposed method may not work well for thosesequences. Nevertheless, the (cid:96) variant, together with our proposed hyperparameter tuningprocedure, performs well in recovering the serial dependence and serves as an alternative to ourproposed estimator. This result agrees with Zhang et al. [2020], where they demonstrated thegood performance of this (cid:96) variant when dealing with relatively structured dynamics, since (cid:96) constraint can lead to a smooth background. In addition, we should mention a polynomialapproximation method used in Xu [2008] does not perform well in ﬁtting unstructureddynamics and can hardly compete with these two aforementioned non-parametric methods.The numerical comparison with this polynomial method can be found at Appendix D. Experiment 4.

Finally, we compare the conﬁdence intervals obtained via two bootstrapmethods. We adopt the following experimental setting: α = 0 . , σ = 0 . , T = 1000 . Thedynamic drift is piecewise constant with δ = 0 . , s = 100 . The bootstrap replication is N = 100 ; we use standard normal random numbers as v t ’s in residual-based wild bootstrap;for local block bootstrap, we choose block size b = 20 and local neighborhood length B = 50 .We illustrate one replication result by plotting the histogram of (cid:98) α ’s from bootstrap samplesin Figure 7.From 50 repetition of the above procedure, we ﬁnd that: (i) the coverage accuracy of 90%and 95% conﬁdence intervals: 0.84 and 0.90 for residual-based wild bootstrap and 0.84 and0.88 for local block bootstrap; (ii) the average lengths of 90% and 95% conﬁdence intervals:0.10 and 0.12 for residual-based wild bootstrap and 0.095 and 0.114 for local block bootstrap.The coverage accuracy is slightly lower than the theoretical value since T = 1000 is relativelysmall. The comparison indicates that local block bootstrap tends to yield smaller conﬁdenceintervals but has slightly lower coverage accuracy.15 esidual-based wild bootstrap Local Block Bootstrap90% CI = [0.0281, 0.134]95% CI = [0.0248, 0.15] 90% CI = [0.0439, 0.139]95% CI = [0.0115, 0.154] Estimate from actual sampleGround-truthBoundaries for 90% CIBoundaries for 95% CI ↵ = 1 AAAB83icbVBNS8NAEJ34WetX1aOXxSJ4KkkV9CIUvHisYD+gCWWy3bRLN5uwuxFK6N/w4kERr/4Zb/4bt20O2vpg4PHeDDPzwlRwbVz321lb39jc2i7tlHf39g8OK0fHbZ1kirIWTUSiuiFqJrhkLcONYN1UMYxDwTrh+G7md56Y0jyRj2aSsiDGoeQRp2is5Pso0hH2PXJLvH6l6tbcOcgq8QpShQLNfuXLHyQ0i5k0VKDWPc9NTZCjMpwKNi37mWYp0jEOWc9SiTHTQT6/eUrOrTIgUaJsSUPm6u+JHGOtJ3FoO2M0I73szcT/vF5mopsg5zLNDJN0sSjKBDEJmQVABlwxasTEEqSK21sJHaFCamxMZRuCt/zyKmnXa95lrf5wVW24RRwlOIUzuAAPrqEB99CEFlBI4Rle4c3JnBfn3flYtK45xcwJ/IHz+QNZIZCE Figure 7: Histogram of (cid:98) α ’s from residual-based wild bootstrap samples (left) and local blockbootstrap samples (right). We can see that the local block bootstrap method yields a smallerconﬁdence interval but lower coverage accuracy. To validate its performance, we apply our proposed method to real data from a psychologicalexperiment. Consider a reaction time (RT) dataset collected from human subjects. The dataare taken from a publicly available database introduced by Rahnev et al. [2020] with 149individual datasets with human data on diﬀerent tasks. Here we only analyze a single datasetnamed

Maniscalco_2017_expt1 chosen based on the fact that it has RT data included andfeatures a large number of trials per subject.The data come from an experiment where human subjects made a series of 1000 perceptualjudgments over a period of about one hour. Participants were seated in front of a computerand made their responses using a standard keyboard. The task, which is standard in the ﬁeld,consisted of deciding whether a brieﬂy presented (33 ms) noisy sinusoidal grating was orientedclockwise or counterclockwise from vertical. Subjects responded as quickly as possible butwithout sacriﬁcing accuracy. The experimenters recorded each judgment’s reaction time (thatis, the time from the onset of the visual stimulus to the button press used to indicate thesubject’s response), thus creating a time series of 1000 values for each subject. Data wereobtained from 28 subjects.We ﬁrst pre-process the raw data by dealing with missing values and obvious outliers.To be precise, we treat RTs that exceed 10 times the interquartile range (i.e., the diﬀerencebetween 75th and 25th percentiles) as outliers and the rest as normal observations. Sincenaively omitting missing data in time series data will break the serial correlation, we usethe median of the normal observations to impute those missing values. The same median isused to replace all outliers. We propose a data-adaptive procedure to tune δ by applying theLjung-Box test on the logarithm of original residuals since they are strongly right-skewed.We plot the results for all 28 subjects in Figure 8. More details on why we choose logarithm16

200 400 600 800 10000246 = 8.29, = 0.057WB: 90% CI = [-0.0118, 0.16], 95% CI = [-0.019, 0.267]LBB: 90% CI = [0.0116, 0.102], 95% CI = [0.00446, 0.104] 0 200 400 600 800 100000.511.52 = 15.1, = -0.00989WB: 90% CI = [-0.069, 0.0341], 95% CI = [-0.0765, 0.0435]LBB: 90% CI = [-0.0748, 0.0438], 95% CI = [-0.0947, 0.0606] 0 200 400 600 800 100000.511.52 = 0.0999, = 0.0177WB: 90% CI = [-0.027, 0.0528], 95% CI = [-0.0339, 0.0627]LBB: 90% CI = [-0.0499, 0.0622], 95% CI = [-0.0678, 0.0738] 0 200 400 600 800 10000123 = 0.599, = 0.262WB: 90% CI = [0.22, 0.315], 95% CI = [0.212, 0.326]LBB: 90% CI = [0.188, 0.309], 95% CI = [0.171, 0.32]0 200 400 600 800 100000.511.52 = 11.6, = 0.207WB: 90% CI = [0.126, 0.222], 95% CI = [0.116, 0.231]LBB: 90% CI = [0.142, 0.262], 95% CI = [0.123, 0.272] 0 200 400 600 800 100000.511.52 = 0.2, = 0.058WB: 90% CI = [0.0343, 0.0765], 95% CI = [0.0304, 0.0801]LBB: 90% CI = [-0.00283, 0.09], 95% CI = [-0.0103, 0.0984] 0 200 400 600 800 100000.511.52 = 0.2, = 0.109WB: 90% CI = [0.0676, 0.148], 95% CI = [0.0613, 0.176]LBB: 90% CI = [0.038, 0.14], 95% CI = [0.0203, 0.148] 0 200 400 600 800 10000123 = 0.4, = 0.175WB: 90% CI = [0.128, 0.255], 95% CI = [0.116, 0.268]LBB: 90% CI = [0.105, 0.219], 95% CI = [0.0904, 0.243]0 200 400 600 800 100001234 = 15.1, = -0.0344WB: 90% CI = [-0.0747, 0.0357], 95% CI = [-0.0838, 0.0471]LBB: 90% CI = [-0.085, 0.0194], 95% CI = [-0.0923, 0.0232] 0 200 400 600 800 100000.511.5 = 1.3, = 0.166WB: 90% CI = [0.12, 0.217], 95% CI = [0.107, 0.225]LBB: 90% CI = [0.107, 0.217], 95% CI = [0.0761, 0.218] 0 200 400 600 800 10000.511.5 = 0.2, = 0.163WB: 90% CI = [0.127, 0.192], 95% CI = [0.123, 0.203]LBB: 90% CI = [0.0721, 0.189], 95% CI = [0.0661, 0.198] 0 200 400 600 800 10000123 = 1.4, = 0.0673WB: 90% CI = [0.022, 0.106], 95% CI = [0.0159, 0.13]LBB: 90% CI = [0.0305, 0.122], 95% CI = [0.0264, 0.126]0 200 400 600 800 100001234 = 30.1, = 1.96e-5WB: 90% CI = [-0.0803, 0.0699], 95% CI = [-0.0955, 0.0841]LBB: 90% CI = [-0.0708, 0.0605], 95% CI = [-0.0786, 0.0731] 0 200 400 600 800 100000.511.52 = 2.5, = -0.0627WB: 90% CI = [-0.114, -0.00908], 95% CI = [-0.121, 0.0101]LBB: 90% CI = [-0.119, -0.0274], 95% CI = [-0.129, -0.0172] 0 200 400 600 800 100000.511.52 = 0.3, = 0.157WB: 90% CI = [0.124, 0.204], 95% CI = [0.107, 0.225]LBB: 90% CI = [0.0862, 0.175], 95% CI = [0.0757, 0.18] 0 200 400 600 800 10000123 = 0.3, = 0.0727WB: 90% CI = [0.0411, 0.122], 95% CI = [0.0356, 0.146]LBB: 90% CI = [0.0212, 0.105], 95% CI = [0.0141, 0.114]0 200 400 600 800 10000246 = 13.0, = -0.0573WB: 90% CI = [-0.106, 0.00226], 95% CI = [-0.138, 0.0107]LBB: 90% CI = [-0.104, -0.0139], 95% CI = [-0.122, -0.00502] 0 200 400 600 800 100000.511.52 = 15.1, = -0.141WB: 90% CI = [-0.19, -0.0911], 95% CI = [-0.191, -0.0701]LBB: 90% CI = [-0.185, -0.0895], 95% CI = [-0.194, -0.0752] 0 200 400 600 800 100001234 = 0.0999, = 0.167WB: 90% CI = [0.131, 0.201], 95% CI = [0.12, 0.211]LBB: 90% CI = [0.0932, 0.224], 95% CI = [0.0836, 0.239] 0 200 400 600 800 100000.51 = 0.699, = 0.219WB: 90% CI = [0.169, 0.255], 95% CI = [0.157, 0.304]LBB: 90% CI = [0.135, 0.289], 95% CI = [0.126, 0.299]0 200 400 600 800 100001234 = 15.1, = -0.0348WB: 90% CI = [-0.115, 0.0322], 95% CI = [-0.124, 0.0367]LBB: 90% CI = [-0.0937, 0.00807], 95% CI = [-0.108, 0.0199] 0 200 400 600 800 100001234 = 1.1, = 0.0521WB: 90% CI = [1.34e-4, 0.0892], 95% CI = [-0.00231, 0.117]LBB: 90% CI = [0.00312, 0.0894], 95% CI = [-0.00743, 0.101] 0 200 400 600 800 100000.511.52 = 2.3, = 0.156WB: 90% CI = [0.0926, 0.208], 95% CI = [0.0778, 0.216]LBB: 90% CI = [0.0958, 0.199], 95% CI = [0.0862, 0.201] 0 200 400 600 800 10000123 = 9.69, = 0.109WB: 90% CI = [0.0242, 0.13], 95% CI = [0.02, 0.135]LBB: 90% CI = [0.0524, 0.149], 95% CI = [0.0468, 0.157]0 200 400 600 800 100000.511.52 = 8.19, = 0.0391WB: 90% CI = [-0.0172, 0.0759], 95% CI = [-0.0353, 0.105]LBB: 90% CI = [-0.0396, 0.0981], 95% CI = [-0.0497, 0.11] 0 200 400 600 800 100001234 = 0.999, = 0.0485WB: 90% CI = [0.02, 0.114], 95% CI = [0.0124, 0.15]LBB: 90% CI = [-0.0182, 0.0852], 95% CI = [-0.0216, 0.0932] 0 200 400 600 800 10000123 = 6.39, = -0.0215WB: 90% CI = [-0.0773, 0.0258], 95% CI = [-0.0891, 0.0365]LBB: 90% CI = [-0.0685, 0.0107], 95% CI = [-0.078, 0.0131] 0 200 400 600 800 10000123 = 2.1, = 0.092WB: 90% CI = [0.0258, 0.128], 95% CI = [0.0175, 0.153]LBB: 90% CI = [0.0377, 0.142], 95% CI = [0.0234, 0.148] Figure 8: Experimental results on reaction times for all 28 subjects. Subjects 1 to 28are organized in the order of left to right and top to bottom. The blue, red and yellowlines correspond to the raw RT values, ﬁtted ar (1) model and ﬁtted dynamic background,respectively. On the top of each ﬁgure, we report δ selected by Ljung-Box test on thelogarithm of residuals, estimated ar (1) coeﬃcient and 90% and 95% conﬁdence intervalsbased on residual-based wild bootstrap and local block bootstrap samples. Overall, weobserve the presence of substantial drift that varies signiﬁcantly between subjects but isrecovered very well by our proposed method.17ransform is in Appendix D. The conﬁdence intervals are constructed via bootstrapping withthe same bootstrapping parameters in our simulation. Regular AR(1) modelProposed method (90% CI) = 0 Subject ˆ ↵ AAAB+HicbVBNS8NAEN34WetHox69LBbBU0lqQY8FLx4r2A9oQphsN83SzSbsboQa+ku8eFDEqz/Fm//GbZuDtj4YeLw3w8y8MONMacf5tjY2t7Z3dit71f2Dw6OafXzSU2kuCe2SlKdyEIKinAna1UxzOsgkhSTktB9Obud+/5FKxVLxoKcZ9RMYCxYxAtpIgV0rvBg09oBnMQTuLLDrTsNZAK8TtyR1VKIT2F/eKCV5QoUmHJQauk6m/QKkZoTTWdXLFc2ATGBMh4YKSKjyi8XhM3xhlBGOUmlKaLxQf08UkCg1TULTmYCO1ao3F//zhrmObvyCiSzXVJDloijnWKd4ngIeMUmJ5lNDgEhmbsUkBglEm6yqJgR39eV10ms23KtG875Vb7fKOCroDJ2jS+Sia9RGd6iDuoigHD2jV/RmPVkv1rv1sWzdsMqZU/QH1ucPXg+S2g== Figure 9: Comparison between our proposed method and regular ar (1) model for all 28subjects. The errorbar is 90% conﬁdence interval computed using the local block bootstrapmethod. Regular ar (1) model typically yields larger (cid:98) α , since it confuses the dynamic driftsas autoregressive eﬀect.Overall, Figure 8 shows that our method can faithfully capture the underlying dynamics.More speciﬁcally, we make four distinct observations:• It is clear that there is a substantial background drift. Even in the raw data withoutany modeling, the drift can often be observed but is even more apparent after applyingour method of recovering it. Further, the drift is relatively smooth without big one-stepchanges, which is exactly the type of dynamical drift that our method can capture well.• The background drift has a complex shape and varies signiﬁcantly from person to person.While for some subjects, the RT series appears to be monotonically decreasing (e.g.,subjects 4, 6, 7, 8, 11, 12, 15, 16, 26, and 28) or even close to stationary (e.g., subjects3 and 19), the remaining subjects exhibit complex trends without any obvious pattern.These diﬀerences between subjects demonstrate that the trends need to be identiﬁed onthe individual time series level and cannot make strong structural assumptions aboutthe dynamic drift. Instead, to be able to capture real data, the dynamic backgroundhas to be modeled with minimal structural assumptions.• Our method of ﬁtting the background drift recovers reasonable estimate of the autore-gressive coeﬃcient α . Speciﬁcally, α is positive or close to 0 for most subjects, whichis expected given the extensive previous literature on RT [Laming, 1968]. Nevertheless,our method recovers a negative α for subject 18, which could indicate that the RTseries is not universally positively autocorrelated as assumed before and suggests theneed for more detailed investigations on this issue. Further, the proposed methodappears to provide a good ﬁt for the empirically observed RT data across individualsubjects, and the size of the hyperparameter δ tends to be larger for time series thatvisually appear to be less stationary. Thus, our method recovers both α and δ well,and provides a useful description of the data dynamics.• Our method provides a substantial improvement over the ar (1) model that is typicallyused to recover the autocorrelation coeﬃcient in psychology and neuroscience. Asshown in Figure 9, the ar (1) model leads to very high and clearly inﬂated estimates18f α because the model confuses the dynamical drift for an autocorrelation. Overall,our method performs very well on real data from experiments where it is likely to beapplied in the future and is a major advance over the standard ar (1) model. In this paper, we develop a total variation constrained least square estimator to estimate serialcorrelation in the presence of unknown and unstructured dynamic backgrounds. The methodapproximates the dynamic background via a piece-wise constant function and can approximatea wide range of highly unstructured dynamics. We also developed a statistically principledapproach based on the Ljung-Box test to select the tuning parameter. We establish theoreticalperformance guarantees of our method via upper error bound and develop performance lowerbound to characterize the condition for near-optimality (in (cid:96) estimation error sense) withinthe set of (cid:15) -recoverable sequences. Extensive numerical experiments validate our theory anddemonstrate the good performance of our method compared with the state-of-the-art. Weapply our method to a psychological study on human reaction times and ﬁnd that there isindeed substantial and complex-shaped drift in these data. The recovered autocorrelationvalues are generally positive, conﬁrming the long-hypothesized presence of serial dependencein human reaction times [Laming, 1968]. The proposed method is thus general and is likelyto receive wide adoption in both psychology and neuroscience in studying human and animaldecision making. Acknowledgement

The ﬁrst two authors are supported by NSF CCF-1650913, NSF DMS-1938106, and NSFDMS-1830210. The last author is supported by NIH R01MH119189, NIH R21MH122825,and the Oﬃce of Naval Research N00014-20-1-2622.

References

Athena Akrami, Charles D. Kopec, Mathew E. Diamond, and Carlos D. Brody. Posteriorparietal cortex represents sensory history and mediates its eﬀects on behaviour.

Nature ,554(7692):368–372, feb 2018. doi: 10.1038/nature25510. URL .Sumanta Basu and George Michailidis. Regularized estimation in sparse high-dimensionaltime series models.

The Annals of Statistics , 43(4):1535–1567, 2015.Peter J Bickel, Ya’acov Ritov, and Alexandre B Tsybakov. Simultaneous analysis of lassoand dantzig selector.

The Annals of Statistics , 37(4):1705–1732, 2009.Peter J Brockwell, Richard A Davis, and Stephen E Fienberg.

Time series: theory andmethods: theory and methods . Springer Science & Business Media, 1991.19uido Marco Cicchini, Kyriaki Mikellidou, and David C Burr. The func-tional role of serial dependence.

Proceedings of the Royal Society B: Bio-logical Sciences , 285(1890):20181722, nov 2018. ISSN 0962-8452. doi: 10.1098/rspb.2018.1722. URL .Peter K Clark. The cyclical component of us economic activity.

The Quarterly Journal ofEconomics , 102(4):797–814, 1987.James Durbin and Geoﬀrey S Watson. Testing for serial correlation in least squares regression.I. In

Breakthroughs in Statistics , pages 237–259. Springer, 1992.Gilles Dutilh, Don Van Ravenzwaaij, Sander Nieuwenhuis, Han L.J. Van der Maas, Birte U.Forstmann, and Eric Jan Wagenmakers. How to measure post-error slowing: A confound anda simple solution.

Journal of Mathematical Psychology , 56(3):208–216, 2012. ISSN 00222496.doi: 10.1016/j.jmp.2012.04.001. URL http://dx.doi.org/10.1016/j.jmp.2012.04.001 .Bradley Efron. Bootstrap methods: another look at the jackknife. In

Breakthroughs inStatistics , pages 569–593. Springer, 1992.Jason Fischer and David Whitney. Serial dependence in visual perception.

Nature Neu-roscience , 17(5):738–43, mar 2014. ISSN 1097-6256. doi: 10.1038/nn.3689. URL http://dx.doi.org/10.1038/nn.3689 .Michael Grant and Stephen Boyd. CVX: Matlab software for disciplined convex programming,version 2.1, 2014.James D Hamilton. A new approach to the economic analysis of nonstationary time seriesand the business cycle.

Econometrica: Journal of the Econometric Society , pages 357–384,1989.Zaıd Harchaoui and Céline Lévy-Leduc. Multiple change-point estimation with a totalvariation penalty.

Journal of the American Statistical Association , 105(492):1480–1493,2010.Robert J Hodrick and Edward C Prescott. Postwar US business cycles: an empiricalinvestigation.

Journal of Money, Credit, and Banking , pages 1–16, 1997.Yongmiao Hong. Serial correlation and serial dependence. In

Macroeconometrics and TimeSeries Analysis , pages 227–244. Springer, 2010.Anatoli Juditsky, Arkadi Nemirovski, Liyan Xie, and Yao Xie. Convex recovery of markedspatio-temporal point processes. arXiv preprint arXiv:2003.12935 , 2020.Anatoli B Juditsky and AS Nemirovski. Signal recovery by stochastic optimization.

Automa-tion and Remote Control , 80(10):1878–1893, 2019.Seung-Jean Kim, Kwangmoo Koh, Stephen Boyd, and Dimitry Gorinevsky. (cid:96) trend ﬁltering. SIAM review , 51(2):339–360, 2009. 20ans R Künsch. The jackknife and the bootstrap for general stationary observations.

TheAnnals of Statistics , 17(3):1217–1241, 1989.Donald R J Laming.

Information theory of choice-reaction times . Academic Press, New York,1968.Stephanie R Land and Jerome H Friedman. Variable fusion: A new adaptive signal regressionmethod. Technical report, Department of Statistics, Carnegie Mellon University, 1997.Greta M Ljung and George EP Box. On a measure of lack of ﬁt in time series models.

Biometrika , 65(2):297–303, 1978.Po-Ling Loh and Martin J Wainwright. High-dimensional regression with noisy and miss-ing data: Provable guarantees with non-convexity. In

Advances in Neural InformationProcessing Systems , pages 2726–2734, 2011.W. McIlhagga. Serial correlations and 1/f power spectra in visual search reaction times.

Journal of Vision , 8(9):5–5, jul 2008. ISSN 1534-7362. doi: 10.1167/8.9.5. URL http://jov.arvojournals.org/Article.aspx?doi=10.1167/8.9.5 .Nicolai Meinshausen and Bin Yu. Lasso-type recovery of sparse representations for high-dimensional data.

The Annals of Statistics , 37(1):246–270, 2009.Tobias J. Moskowitz, Yao Hua Ooi, and Lasse Heje Pedersen. Time series momentum.

Journal of Financial Economics , 104(2):228–250, may 2012. ISSN 0304405X. doi:10.1016/j.jﬁneco.2011.11.003. URL https://linkinghub.elsevier.com/retrieve/pii/S0304405X11002613 .Efstathios Paparoditis and Dimitris N Politis. Local block bootstrap.

Comptes RendusMathematique , 335(11):959–962, 2002.Dobromir Rahnev, Ai Koizumi, Li Yan McCurdy, Mark D’Esposito, and Hakwan Lau.Conﬁdence Leak in Perceptual Decision Making.

Psychological Science , 26(11):1664–1680,2015. ISSN 0956-7976. doi: 10.1177/0956797615595037. URL http://pss.sagepub.com/lookup/doi/10.1177/0956797615595037 .Dobromir Rahnev, Kobe Desender, Alan L. F. Lee, William T. Adler, David Aguilar-Lleyda,Başak Akdoğan, Polina Arbuzova, Lauren Y. Atlas, Fuat Balcı, Ji Won Bang, IndritBègue, Damian P. Birney, Timothy F. Brady, Joshua Calder-Travis, Andrey Chetverikov,Torin K. Clark, Karen Davranche, Rachel N. Denison, Troy C. Dildine, Kit S. Double,Yalçın A. Duyan, Nathan Faivre, Kaitlyn Fallow, Elisa Filevich, Thibault Gajdos, Regan M.Gallagher, Vincent de Gardelle, Sabina Gherman, Nadia Haddara, Marine Hainguerlot,Tzu-Yu Hsu, Xiao Hu, Iñaki Iturrate, Matt Jaquiery, Justin Kantner, Marcin Koculak,Mahiko Konishi, Christina Koß, Peter D. Kvam, Sze Chai Kwok, Maël Lebreton, Karolina M.Lempert, Chien Ming Lo, Liang Luo, Brian Maniscalco, Antonio Martin, Sébastien Massoni,Julian Matthews, Audrey Mazancieux, Daniel M. Merfeld, Denis O’Hora, Eleanor R. Palser,Borysław Paulewicz, Michael Pereira, Caroline Peters, Marios G. Philiastides, Gerit Pfuhl,Fernanda Prieto, Manuel Rausch, Samuel Recht, Gabriel Reyes, Marion Rouault, Jérôme21ackur, Saeedeh Sadeghi, Jason Samaha, Tricia X. F. Seow, Medha Shekhar, Maxine T.Sherman, Marta Siedlecka, Zuzanna Skóra, Chen Song, David Soto, Sai Sun, Jeroen J. A.van Boxtel, Shuo Wang, Christoph T. Weidemann, Gabriel Weindel, Michał Wierzchoń,Xinming Xu, Qun Ye, Jiwon Yeon, Futing Zou, and Ariel Zylberberg. The conﬁdencedatabase.

Nature Human Behaviour , 4(3):317–325, mar 2020. ISSN 2397-3374. doi: 10.1038/s41562-019-0813-1. URL .Garvesh Raskutti, Martin J Wainwright, and Bin Yu. Restricted eigenvalue properties forcorrelated gaussian designs.

The Journal of Machine Learning Research , 11:2241–2259,2010.Robert Tibshirani, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. Sparsityand smoothness via the fused lasso.

Journal of the Royal Statistical Society: Series B(Methodological) , 67(1):91–108, 2005.Sara A Van De Geer and Peter Bühlmann. On the conditions used to prove oracle results forthe lasso.

Electronic Journal of Statistics , 3:1360–1392, 2009.Martin J Wainwright.

High-dimensional statistics: A non-asymptotic viewpoint , volume 48.Cambridge University Press, 2019.M. Wexler, M. Duyck, and P. Mamassian. Persistent states in vision break universality andtime invariance.

Proceedings of the National Academy of Sciences , 112(48):14990–5, nov2015. ISSN 0027-8424. doi: 10.1073/pnas.1508847112. URL .Chien-Fu Jeﬀ Wu. Jackknife, bootstrap and other resampling methods in regression analysis.

The Annals of Statistics , 14(4):1261–1295, 1986.Wei-Biao Wu and Ying Nian Wu. Performance bounds for parameter estimates of high-dimensional linear models with correlated errors.

Electronic Journal of Statistics , 10(1):352–379, 2016.Ke-Li Xu. Bootstrapping autoregression under non-stationary volatility.

The EconometricsJournal , 11(1):1–26, 2008.Kaimeng Zhang, Chi Tim Ng, and Myung Hwan Na. Real time prediction of irregular periodictime series data.

Journal of Forecasting , 39(3):501–511, 2020.22

Hyperparameter tuning and Bootstrap conﬁdence in-terval

Our proposed hyperparameter tuning procedure is: (i) Set an interval [ δ (cid:96) , δ u ] where we believethe best δ lies in based on prior knowledge; (ii) For any ε > , to make sure the Euclideandistance between selected δ and the optimal one is less than ε , we divide this interval into n = (cid:98) ( δ u − δ (cid:96) ) /ε (cid:99) parts with same length ε and denote the endpoints by δ (1) , . . . , δ ( n +1) ; (iii)For each δ ( j ) , we ﬁt the proposed estimator as deﬁned by (2) and construct the residualsequence by r i = x i − (cid:80) pj =1 (cid:98) α j x i − j − (cid:98) f i , i = 1 , . . . , T ; (iv) Apply (Lag-p) Ljung-Box test tothe residual sequence to obtain a p -value p j ; (v) The ε -optimal tuning parameter is δ ( j ) with j = argmax j ∈{ ,...,n +1 } p j . Further details on Ljung-Box test can be found in Section B.1 inAppendix B.Next, we present how to construct a bootstrap conﬁdence interval. For our ﬁrst methodresidual-based wild bootstrap: (i) we ﬁrst perform proposed tuning procedure to obtaintuning parameter δ and the corresponding estimates (cid:98) α j ’s and (cid:98) f i ’s; (ii) then we calculate theresiduals (cid:98) r i ’s as suggested in step 2.(i) in proposed tuning procedure; (iii) residual-based wildbootstrap sample is constructed recursively by (1) with (cid:98) α j ’s, (cid:98) f i ’s and (cid:101) r i = (cid:98) r i v i , where v i ’sare i.i.d. random numbers with zero mean and unit variance. As for local block bootstrap,we ﬁrst choose an integer block size b and local neighborhood size B . We partition T samples into M = (cid:100) T /b (cid:101) blocks. Then, for m = 0 , . . . , M − , the local block bootstrapsample is (cid:101) x mb + j = x I m + j − , j = 1 , . . . , b , where I m is a uniform random integer drawn from { max(1 , mb − B ) , . . . , min( T − b + 1 , mb + B ) } . In Paparoditis and Politis [2002], it is requiredthat (i) b/B → as b → ∞ ; (ii) when T → ∞ , T /B → but B → ∞ .After obtaining the bootstrap sample, we apply proposed tuning procedure to this pseudo-series with δ (cid:96) = δ − nε and δ u = δ + nε to obtain estimates (cid:101) α j ’s (we choose n = 2 in numericalsimulation). Then, we repeat this procedure N times to construct a conﬁdence interval bythe empirical distribution of (cid:101) α j ’s. For bootstrap samples, we only need to search around the ε -optimal δ for the optimal tuning parameter of the pseudo-series since it closely resemblesthe actual observation. This helps to reduce the computational cost of bootstrapping. B Background knowledge

B.1 Ljung-Box test and Durbin-Watson test

Ljung-Box test, sometimes known as the Ljung–Box Q test, is designed to test if there stillexhibits serial correlation in the residual sequence. The null hypothesis is H : The data areindependently distributed. The test statistic is Q = T ( T + 2) h (cid:88) k =1 (cid:98) ρ k n − k , T is the sample size, (cid:98) ρ k is the sample autocorrelation at lag k , and h is the number oflags being tested. For sequence { x , . . . , x T } , the sample autocorrelation (cid:98) ρ k is deﬁned as (cid:98) ρ k = (cid:98) γ ( k ) (cid:98) γ (0) , where (cid:98) γ ( k ) = 1 T T −| k | (cid:88) t =1 (cid:0) x t + | k | − ¯ x (cid:1) ( x t − ¯ x ) . Here, { x , . . . , x T } is residual sequence if one wants to implement Ljung-Box test. Under H ,the test statistic asymptotically follows a χ h ) distribution. The p -value of Ljung-Box test is pr( χ h ) > Q ) .Durbin-Watson test serves the same purpose. For residual e t = ρe t − + ν t , the test statisticis d = (cid:80) Tt =2 ( e t − e t − ) (cid:80) Tt =1 e t . It tests null hypothesis: H : ρ = 0 against alternative hypothesis H : ρ (cid:54) = 0 . B.2 Golden-section search

Golden-section search is a eﬃcient and robust technique for ﬁnding an extremum (minimumor maximum) of a function inside a speciﬁed interval. For any given δ , if we solve the convexprogram (2), calculate the residual sequence and perform the hypothesis test on it as wementioned in Section2.2, we will obtain a p -value. That is, we have a mapping that maps δ to p , which we denote as p = f ( δ ) . In our numerical experiment, we show that f is unimodalby Figure 4. Therefore, we can speed up the parameter tuning procedure by Golden-sectionsearch. The detailed steps are provided below in Algorithm 1.Compared to (cid:98) ( δ u − δ (cid:96) ) /ε (cid:99) + 1 searches in proposed tuning procedure, Golden-sectionsearch can achieve ε -optimality with just (cid:98) log( ε/ ( δ u − δ (cid:96) )) / log(0 . (cid:99) + 1 searches. C Proofs

C.1 Proof of Theorem 1

To begin with, we prove Theorem 1 by using Proposition 1:

Proof of Theorem 1.

Denote estimation error by e = (cid:98) β T − β . By triangle inequality, we have (cid:112) ( (cid:98) α − α ) + ( (cid:98) µ − µ ) = (cid:107) e I (cid:107) ≤ (cid:113) (cid:98) α + α + (cid:98) µ + µ ) ≤ δ s = 2 (cid:112) vol( S ) /π. By deﬁnition (14), φ min (2) is the smallest eigenvalue of (cid:101) X T (cid:101) X /T , where (cid:101) X = ( x T − , ) and is vector of all ones. Since φ min (2) = 0 if and only if x T − = a for some a ∈ R , φ min (2) will be of constant order with overwhelming probability. Since k can be chosenarbitrarily small, κ can be lower bounded by a positive constant with high probability. Since (cid:107) η (cid:107) ∞ = O (cid:0) (log T ) / /T / (cid:1) , for large enough T , we can simplify Proposition 1 into (cid:107) e I (cid:107) ≤ (cid:101) C √ s max { sδ , δ } , lgorithm 1 Hyperparameter tuning procedure: a Golden-section search variant.

Input:

Observations x , . . . , x T , given history x − p +1 , . . . , x , a pre-speciﬁed interval [ δ (cid:96) , δ u ] to search the best δ and tolerance ε > . Output: ε -optimal hyperparameter δ . Determine two intermediate points δ = δ (cid:96) + d and δ = δ u − d , where d = √ − ( δ u − δ (cid:96) ) For k = 1 , : ﬁt the proposed estimator as deﬁned by (2) with hyperparameters δ k ;construct the residual sequence by r ( k ) i = x i − (cid:80) pj =1 (cid:98) α j ( k ) x i − j − (cid:98) f i ( k ) , i = 1 , . . . , T ;apply (Lag-p) Ljung-Box test to the residual sequence { r i ( k ) } Ti =1 to obtain a p -value p k = f ( δ k ) .If f ( δ ) > f ( δ ) , update δ (cid:96) , δ , δ , δ u as follows δ (cid:96) = δ , δ = δ , δ u = δ u , δ = δ + √ − ( δ u − δ (cid:96) ) ; Otherwise, update δ (cid:96) , δ , δ , δ u as follows δ (cid:96) = δ (cid:96) , δ u = δ , δ = δ , δ = δ u − √ − ( δ u − δ (cid:96) ) . If δ u − δ (cid:96) < ε , set δ max = ( δ u + δ (cid:96) ) / and stop iterating; otherwise, go back to step .where (cid:101) C > is a constant. Together with the naive upper bound by triangle inequality, weobtain (cid:107) e I (cid:107) ≤ min (cid:110) (cid:101) C √ s max { sδ , δ } , (cid:112) vol( S ) /π (cid:111) . Since (cid:107) (cid:98) ∆ (cid:107) ≤ (cid:107) (cid:98) ∆ (cid:107) ≤ δ and (cid:107) ∆ (cid:107) ≤ √ sδ , by triangle inequality, we have (cid:107) e I ∪ I (cid:107) = (cid:107) (cid:98) ∆ − ∆ (cid:107) ≤ (cid:107) (cid:98) ∆ (cid:107) + (cid:107) ∆ (cid:107) ≤ δ + √ sδ . Again, by triangle inequality, (cid:107) (cid:98) β T − β (cid:107) ≤ (cid:107) e I (cid:107) + (cid:107) e I ∪ I (cid:107) . We complete the proof.The proof of Proposition 1 is highly involved. We sketch its proof as follows: Proof of Proposition 1.

We ﬁrst state four very useful lemmas.

Lemma 1 (High probability bounds for sub-Gaussian noise) . For sub-Gaussian randomnoise ε , . . . , ε T i.i.d. ∼ subG( σ ) and x , . . . , x T generated by (5) (given x ), for all A > , A > √ A and A > , deﬁne events A = (cid:26) | ε i | ≤ (cid:113) A σ log(2 T ) , i = 1 , . . . , T (cid:27) , and A = (cid:40)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i = j ε i (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ A σ √ T log(2 T ) , j = 1 , . . . , T (cid:41) ,

25e have pr ( A ) ≥ − (2 T ) − A , pr ( A |A ) ≥ − (2 T ) − A /A . Furthermore, if we assume there exists a constant C > such that | f i | ≤ C (cid:112) log T , i = 1 , . . . , T, (17)deﬁne event A = (cid:40)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i =1 ε i x i − (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ √ A σ ( c + 1) log(2 T ) (cid:112) T log(2 T ) / (1 − α ) (cid:41) , where c > is a constant such that | f i | ≤ c (cid:112) A σ log(2 T ) , i = 1 , . . . , T , then we will have pr ( A |A ) ≥ − T ) − A /A . By Lemma 1, we will have pr( A ∩ A ∩ A ) = pr( A ) (1 − pr( A c ∪ A c |A )) > − (2 T ) − A − (2 T ) − A /A − T ) − A /A . This means event A = A ∩A ∩A holds with probability at least − (2 T ) − A − (2 T ) − A /A − T ) − A /A . Lemma 2 (Restricted (cid:96) estimation error) . Under assumption (17), for our proposed esti-mator (cid:98) β T , as deﬁned in (7) or equivalently (13), if we choose k ∈ (0 , and tuning parameter λ such that kλ = O (cid:0) (log T ) / /T / (cid:1) , then on event A , the estimation error e = (cid:98) β T − β satisﬁes: (cid:107) e I (cid:107) ≤ min (cid:26) k − k (cid:107) e I (cid:107) , sδ − k (cid:27) + k − k (cid:107) e I (cid:107) . Lemma 3.

Under assumption (17), on event A , for any integer m ≤ T + 1 − | J | ,we have √ T (cid:107) X e (cid:107) ≥ (cid:18)(cid:112) φ min (2) − (cid:112) φ max ( m ) k − k (cid:19) (cid:107) e I (cid:107) − (cid:32) (cid:112) φ max ( m )1 − k + (cid:115) ( s − (cid:18) − s − T (cid:19)(cid:33) sδ − (cid:115) ( s − (cid:18) − s − T (cid:19) δ, (18)where φ min ( · ) and φ max ( · ) are deﬁned in (14).For simplicity, in the following we denote κ = (cid:112) φ min (2) − k (cid:112) φ max ( m ) / (1 − k ) and C ( δ, δ , s, m ) = 2 (cid:112) φ max ( m ) sδ − k + (cid:115) ( s − (cid:18) − s − T (cid:19) ( sδ + δ ) . We further denote J = I ∪ I , i.e. the set of indices for all non-zero coeﬃcients.26 emma 4. Under assumption (17), on event A , we have (cid:107) X e (cid:107) /T ≤ (cid:107) η (cid:107) ∞ (cid:107) e J (cid:107) / (1 − k ) , (19)Additionally, we have (cid:107) η (cid:107) ∞ = O (cid:0) (log T ) / /T / (cid:1) .Here, we consider two cases: (i) (cid:107) e I (cid:107) ≤ (cid:107) e I (cid:107) and (ii) (cid:107) e I (cid:107) > (cid:107) e I (cid:107) . In case (i), wehave (cid:107) e I (cid:107) ≤ (cid:107) e I (cid:107) ≤ (cid:107) e I (cid:107) = (cid:107) (cid:98) ∆ I − ∆ I (cid:107) ≤ (cid:107) (cid:98) ∆ I (cid:107) + (cid:107) ∆ (cid:107) ≤ δ + sδ . In case (ii), (cid:107) e J (cid:107) = (cid:107) e I (cid:107) + (cid:107) e I (cid:107) < (cid:107) e I (cid:107) . By (18) and (19), we have − k (cid:107) η (cid:107) ∞ (cid:107) e J (cid:107) ≥ ( κ (cid:107) e I (cid:107) − C ( δ, δ , s, m )) ≥ κ (cid:107) e I (cid:107) (cid:107) e I (cid:107) / √ − κC ( δ, δ , s, m ) (cid:107) e I (cid:107) ≥ κ (cid:107) e I (cid:107) (cid:107) e J (cid:107) / √ − κC ( δ, δ , s, m ) (cid:107) e J (cid:107) . Rearranging the terms in the above inequality and choosing m = 1 , we have (cid:107) e I (cid:107) ≤ √ κ (cid:18) (cid:107) η (cid:107) ∞ − k + κC ( δ, δ , s, (cid:19) . Denote C δ,δ ,s = C ( δ, δ , s, . Since φ max (1) = 1 − /T , combing the results above proves(15). Proofs of Lemmas in the proof of Proposition 1

Proof of Lemma 1.

For sub-Gaussian random noise ε i ∼ subG ( σ ) , we will have: pr( | ε i | ≤ c , i = 1 , . . . , T ) ≥ − T exp (cid:26) − c σ (cid:27) . Setting c = (cid:112) A σ log(2 T ) , we prove the ﬁrst inequality.By the uniform upper bound on the dynamic background (17), we can ﬁnd a constant c such that dynamic background is uniformly bounded by c c . Thus, on event A we will get − ( c + 1) c ≤ x i − α x i − ≤ ( c + 1) c , ( i = 1 , . . . , T ) . By the convergence of geometric series we have | x i | ≤ ( c + 1) c / (1 − α ) and thus wehave | x i − ε i | ≤ ( c + 1) c − α = c , ( i = 1 , . . . , T ) . Since E [ x i − ε i | x i − ] = x i − E [ ε i ] = 0 and Var( x i − ε i | x i − ) = x i − σ ≤ (cid:18) ( c + 1) c − α (cid:19) σ , { x i − ε i } is a bounded martingale diﬀerence sequence with respect to ﬁltration { σ ( x , . . . , x i − ) } .27y Azuma–Hoeﬀding inequality, we have pr (cid:32) T (cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i =1 x i − ε i (cid:12)(cid:12)(cid:12)(cid:12) ≥ c (cid:33) ≤ (cid:26) − T c c (cid:27) . Set c = A √ σ ( c + 1)1 − α log(2 T ) (cid:114) log(2 T ) T , (20)we prove the third inequality.Similarly, on event A , by Azuma–Hoeﬀding inequality, we will obtain pr (cid:32) T (cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i = j ε i (cid:12)(cid:12)(cid:12)(cid:12) ≥ c (cid:33) ≤ (cid:26) − T c T − j ) c (cid:27) < (cid:26) − T c A σ log(2 T ) (cid:27) , j = 1 , . . . , T. Therefore, pr (cid:32) T (cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i = j ε i (cid:12)(cid:12)(cid:12)(cid:12) < c , j = 1 , . . . , T (cid:33) ≥ − T (cid:88) j =2 pr (cid:32) T (cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i = j ε i (cid:12)(cid:12)(cid:12)(cid:12) ≥ c (cid:33) > − T exp (cid:26) − T c A σ log(2 T ) (cid:27) , where the ﬁrst inequality comes from union bound. Again, set c = 2 A σ log(2 T ) √ T , (21)we prove the second inequality.

Proof of Lemma 2.

By deﬁnition (13), we have T (cid:107) x T − X (cid:98) β T (cid:107) + λ (cid:107) (cid:98) ∆ (cid:107) ≤ T (cid:107) x T − X β (cid:107) + λ (cid:107) ∆ (cid:107) . Rearrange terms and we will get T (cid:107) X ( (cid:98) β T − β ) (cid:107) ≤ λ ( (cid:107) ∆ (cid:107) − (cid:107) (cid:98) ∆ (cid:107) ) + 1 T ε T T X ( (cid:98) β T − β ) . If we choose kλ as follows kλ = 2 √ A σ ( c + 1)1 − α log(2 T ) (cid:114) log(2 T ) T = O (cid:18) (log T ) / T / (cid:19) , we have kλ = c > c for T large enough, where c and c are deﬁned in (20) and (21),respectively. Then on event A , we have T | ε T T X | = 1 T (cid:32)(cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i =1 ε i x i − (cid:12)(cid:12)(cid:12)(cid:12) , (cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i =1 ε i (cid:12)(cid:12)(cid:12)(cid:12) , (cid:12)(cid:12)(cid:12)(cid:12) T (cid:88) i =2 ε i (cid:12)(cid:12)(cid:12)(cid:12) , . . . , | ε T | (cid:33) T ≤ kλ , (22)28here ∈ R T is the vector of ones. Thus, we will obtain (cid:107) η (cid:107) ∞ = (cid:107) ε T T X /T (cid:107) ∞ ≤ kλ and T (cid:107) X ( (cid:98) β T − β ) (cid:107) ≤ λ ( (cid:107) ∆ (cid:107) − (cid:107) (cid:98) ∆ (cid:107) ) + kλ (cid:107) (cid:98) β T − β (cid:107) . By pulsing λ (1 − k ) (cid:107) e I ∪ I (cid:107) on both side of this equation, we will get (1 − k ) (cid:107) e I ∪ I (cid:107) ≤ ( (cid:107) ∆ (cid:107) − (cid:107) (cid:98) ∆ (cid:107) + (cid:107) e I ∪ I (cid:107) ) + k (cid:107) e I (cid:107) . (23)Since ∆ = β I ∪ I , e I ∪ I = (cid:98) ∆ − ∆ . By the sparse structure we know (cid:107) ∆ (cid:107) − (cid:107) (cid:98) ∆ (cid:107) + (cid:107) e I ∪ I (cid:107) ≤ (cid:107) ∆ (cid:107) ≤ sδ . (24)Meanwhile, since (cid:107) ∆ (cid:107) takes value zero on index set I , we have (cid:98) ∆ I = e I and thus (cid:107) (cid:98) ∆ I (cid:107) = (cid:107) e I (cid:107) . Therefore, we have (cid:107) ∆ (cid:107) − (cid:107) (cid:98) ∆ (cid:107) + (cid:107) e I ∪ I (cid:107) = (cid:107) ∆ I (cid:107) − (cid:107) (cid:98) ∆ I (cid:107) + (cid:107) e I (cid:107) ≤ (cid:107) e I (cid:107) . (25)Plugging (24) and (25) back into (23), we will get (cid:107) e I (cid:107) ≤ min (cid:26) k − k (cid:107) e I (cid:107) , sδ − k (cid:27) + k (cid:107) e I (cid:107) . We complete the proof.

Proof of Lemma 3.

By (7), we have (cid:107) e I (cid:107) ≤ (cid:107) e I (cid:107) = (cid:107) (cid:98) ∆ I − ∆ I (cid:107) ≤ (cid:107) (cid:98) ∆ I (cid:107) + (cid:107) ∆ (cid:107) ≤ δ + sδ . Partition index set J c into L disjoint sets: J c = ∪ L(cid:96) =1 J (cid:96) , where | J | = · · · = | J L − | = m and | J L | ≤ m , and (cid:80) L(cid:96) =1 (cid:107) e J (cid:96) (cid:107) ≤ (cid:80) L(cid:96) =1 (cid:107) e J (cid:96) (cid:107) = (cid:107) e J c (cid:107) , we get √ T (cid:107) X e (cid:107) ≥ √ T (cid:107) X e J (cid:107) − √ T (cid:107) X e J c (cid:107) ≥ (cid:112) φ min (2) (cid:107) e I (cid:107) − (cid:112) φ max ( s − (cid:107) e I (cid:107) − (cid:112) φ max ( m ) L (cid:88) (cid:96) =1 (cid:107) e J (cid:96) (cid:107) ≥ (cid:112) φ min (2) (cid:107) e I (cid:107) − (cid:115) ( s − (cid:18) − s − T (cid:19) ( δ + sδ ) − (cid:112) φ max ( m ) (cid:107) e J c (cid:107) . Since I = J c , √ (cid:107) e I (cid:107) ≥ (cid:107) e I (cid:107) , by Lemma 2, we have √ T (cid:107) X e (cid:107) ≥ (cid:32)(cid:112) φ min (2) − (cid:112) φ max ( m ) √ k − k (cid:33) (cid:107) e I (cid:107) − (cid:32) (cid:112) φ max ( m )1 − k + (cid:115) ( s − (cid:18) − s − T (cid:19)(cid:33) sδ − (cid:115) ( s − (cid:18) − s − T (cid:19) δ. Denote κ = (cid:112) φ min (2) − (cid:112) φ max ( m ) √ k − k and we complete the proof.29 roof of Lemma 4. Since (cid:98) β T is solution to VI[ F x T , X ] , the weak VI, and the vector ﬁeld F x T ( · ) is continuous, we have (cid:98) β T is also solution to the strong VI. That is, (cid:98) β T also satisﬁes (cid:104) F x T ( (cid:98) β T ) , w − (cid:98) β T (cid:105) ≥ , ∀ w ∈ X . In particular, we have (cid:104) F x T ( (cid:98) β T ) , β − (cid:98) β T (cid:105) ≥ . Meanwhile, we have F x T ( (cid:98) β T ) = F x T ( β ) − A [ x T ]( β − (cid:98) β T ) /T. Therefore, we will have (cid:28) F x T ( β ) − T A [ x T ]( β − (cid:98) β T ) , β − (cid:98) β T (cid:29) ≥ . Rearrange terms and recall that η = F x T ( β ) , we will get ( β − (cid:98) β T ) T ( A [ x T ] /T )( β − (cid:98) β T ) ≤ (cid:104) η, β − (cid:98) β T (cid:105) ≤ (cid:107) η (cid:107) ∞ (cid:107) β − (cid:98) β T (cid:107) , (26)where the last inequality comes from Hölder’s inequality.Notice that A [ x T ] = X T X , we can re-express the inequality above as √ T (cid:107) X e (cid:107) ≤ (cid:107) η (cid:107) ∞ (cid:107) e (cid:107) = (cid:107) η (cid:107) ∞ ( (cid:107) e J (cid:107) + (cid:107) e J c (cid:107) ) ≤ − k (cid:107) η (cid:107) ∞ (cid:107) e J (cid:107) , where the last inequality comes from Lemma 2.By (22) and the choice of kλ , we get (cid:107) η (cid:107) ∞ ≤ √ A σ ( c + 1)1 − α log(2 T ) (cid:114) log(2 T ) T = O (cid:0) (log T ) / /T / (cid:1) . We complete the proof.

C.2 Proof of Theorem 2

Proof of Theorem 2.

By Proposition 2, to make (cid:96) error lower bounded by C with probabilitygreater than − C , we need C = 12 exp (cid:40) − C T + C δ ( T ) (cid:80) Tt =2 s ( t ) + C δ ( T ) (cid:80) Tt =2 s ( t ) + log 2 C s ( T ) (cid:41) . (27)Since s ( t ) ≤ t , we will have a decreasing (w.r.t t ) lower bound at approximately exponentialrate. Thus, without any condition, the naive bound will be tighter compared to the onewe just derive if (cid:112) s ( t ) δ ( t ) goes to inﬁnity. To make sure the lower bound C we derive in(27) is of constant order, we need s ( t ) at least of order t , i.e. condition (9). However, thismakes (cid:80) Tt =2 s ( t ) = Θ( T ) and we further need δ ( t ) small enough when t ∈ { , . . . , T } , i.e.condition (10). Proof of Proposition 2.

First, we ﬁnd a large enough ε -packing by the following Lemma.30 emma 5. Let ( V, (cid:107) · (cid:107) ) be a normed space. For Θ ⊂ V ⊂ R d , we have (cid:18) ε (cid:19) d vol(Θ)vol( B ) ≤ N (Θ , (cid:107) · (cid:107) , ε ) ≤ (cid:18) ε (cid:19) d vol(Θ)vol( B ) , where B is the unit norm ball and N (Θ , (cid:107) · (cid:107) , ε ) = max { m : ∃ ε -packing of Θ of size m } is the packing number.Recall that the coeﬃcient vector space is Θ T = { β : ( α , µ ) ∈ S , ∆ ∈ B} . Since δ s isconstant, Θ T will have a constant order volume, even though δ can be very small. Thus, byLemma 5 we can ﬁnd an ε -packing N = { β , . . . , β N } ⊂ Θ T such that N ≥ C (cid:18) ε (cid:19) s , (28)where C is some positive constant. Lemma 6.

For any ε -packing N = { β , . . . , β N } ⊂ Θ T , if the random noise is normallydistributed, then the upper bound on KL divergence between the joint distributions of x T generated by (5) with coeﬃcient chosen from N is max i,j ∈ [ N ] KL ( p β i || p β j ) ≤ C T + C δ ( T ) T (cid:88) t =2 s ( t ) + C δ ( T ) T (cid:88) t =2 s ( t ) , where p β is joint probability density function (p.d.f.) of x T generated by (5) with coeﬃcient β and C , C and C are some positive constants dependent on δ s . Lemma 7 (Fano’s inequality) . Let P = { P , . . . , P N } . For any random variable Z takingvalues in [ N ] , we have N N (cid:88) i =1 P i ( Z (cid:54) = i ) ≥ − N (cid:80) i,j ∈ [ N ] KL ( P i || P j ) + log 2log N , (29)where KL ( ·||· ) is the Kullback–Leibler (KL) divergenceBy Fano’s inequality (29), we have for any r.v. Z N N (cid:88) i =1 pr β i ( Z (cid:54) = i ) ≥ − max i,j ∈ [ N ] KL ( p i || p j ) + log 2log N . (30)For any estimator (cid:101) β T , deﬁne (cid:98) ψ = ψ ( (cid:101) β T ) = argmin i ∈ [ N ] (cid:107) (cid:101) β T − β i (cid:107) , (31)31hich is the index for the element closest to (cid:101) β T (in (cid:96) norm sense) in the ε -packing N .Therefore, for any (cid:98) ψ (cid:54) = i , we have (cid:107) (cid:101) β T − β i (cid:107) ≥ (cid:107) β (cid:98) ψ − β i (cid:107) − (cid:107) (cid:101) β T − β (cid:98) ψ (cid:107) ≥ (cid:107) β (cid:98) ψ − β i (cid:107) − (cid:107) (cid:101) β T − β i (cid:107) , where the last inequality comes from (31).Re-arrange terms in the inequality above and we will have (cid:107) (cid:101) β T − β i (cid:107) ≥ (cid:107) β (cid:98) ψ − β i (cid:107) ≥ ε , where the last inequality comes from the deﬁnition of ε -packing, i.e. min i (cid:54) = j (cid:107) β i − β j (cid:107) > ε. This means when β = β i , event { (cid:98) ψ (cid:54) = i } is subset of event {(cid:107) (cid:101) β T − β i (cid:107) ≥ ε/ } . Therefore,we have sup β ∈ Θ T pr β (cid:16) (cid:107) (cid:101) β T − β (cid:107) ≥ ε/ (cid:17) ≥ sup β ∈N pr β (cid:16) (cid:107) (cid:101) β T − β (cid:107) ≥ ε/ (cid:17) ≥ max i ∈ [ N ] pr β i (cid:16) (cid:98) ψ (cid:54) = i (cid:17) ≥ N N (cid:88) i =1 pr β i (cid:16) (cid:98) ψ (cid:54) = i (cid:17) . Taking Z = (cid:98) ψ and ε = 2 C in (12), by (28) and (30), we complete the proof. Proof of Lemma 6.

For x T generated by (5) with β = ( α, µ, ∆ , . . . , ∆ T ) T , we can derivethat x t = 1 − α t − α µ + t (cid:88) i =2 − α t +1 − i − α ∆ i + t (cid:88) i =1 α t − i ε i , t = 1 , . . . , T, (32)where for t = 1 the second term is zero. We further denote τ ( t ) = 1 − α t − α µ + t (cid:88) i =2 − α t +1 − i − α ∆ i , and B ( t ) = t (cid:88) i =1 α t − i ε i . Therefore, if the random noise in (5) is Gaussian, then the joint distribution for x T willbe N ( τ, Σ) , where τ = ( τ (1) , τ (2) , . . . , τ ( T ) ) T , Σ = P α P T α and P α =  α α α α α α ... ... . . . α T − α T − . . . . . . . . . α  .

32y some simple algebra, we will obtain det(Σ) = det( P α P T α ) = det( P α ) = 1 and Σ − =  α + 1 − α − α α + 1 − α − α α + 1 − α . . . . . . . . . − α α + 1 − α − α  . Arbitrarily choose two distinct coeﬃcients from N . Without loss of generality, we denotethem to be β i ( = 1 , ). Given x , denote the joint p.d.f. of x T generated by (5) withcoeﬃcient β by p ( x T | x ; β ) . For simplicity, we denote p ( x T | x ; β i ) = p i for β i ∈ N , i = 1 , . . . , N .By the derivation above, p i is joint p.d.f. of N ( τ i , Σ i ) . Then the KL divergence betweenthese two ( T − − dimensional multivariate Gaussian distributions is KL ( N ( τ , Σ ) || N ( τ , Σ )) = (cid:90) log p ( x ) p ( x ) p ( x ) dx = (cid:90) (cid:20) log | Σ || Σ | − ( x − τ ) T Σ − ( x − τ ) + ( x − τ ) T Σ − ( x − τ ) (cid:21) p ( x ) dx = 12 log | Σ || Σ | −

12 tr (cid:8) E (cid:2) ( x − τ ) ( x − τ ) T (cid:3) Σ − (cid:9) + 12 E (cid:2) ( x − τ ) T Σ − ( x − τ ) (cid:3) = 12 log | Σ || Σ | −

12 tr { I T } + 12 ( τ − τ ) T Σ − ( τ − τ ) + 12 tr (cid:8) Σ − Σ (cid:9) = 12 (cid:20) log | Σ || Σ | − T + tr (cid:8) Σ − Σ (cid:9) + ( τ − τ ) T Σ − ( τ − τ ) (cid:21) . (33)Since det(Σ ) = det(Σ ) = 1 , we have KL ( p || p ) = 12 (cid:2) tr (cid:8) Σ − Σ (cid:9) − T + ( τ − τ ) T Σ − ( τ − τ ) (cid:3) . (34)On one hand, by the explicit form of Σ as well as Σ , we can derive that the explicitform of the diagnoal elements of Σ − Σ . For i = 2 , . . . , T − , we have (cid:0) Σ − Σ (cid:1) i,i = T − (cid:88) k =1 (cid:0) Σ − (cid:1) k,i (Σ ) k,i = − α (cid:16) (Σ ) i − ,i + (Σ ) i +1 ,i (cid:17) + ( α + 1) (Σ ) i,i = − α α (cid:18) − α i − − α + α i − (cid:19) + ( α + 1) 1 − α i − α ≤ α + 2 | α α | + 11 − α ≤ δ s + 11 − δ s . Similarly, we can derive the expression for (cid:0) Σ − Σ (cid:1) , and (cid:0) Σ − Σ (cid:1) T,T and upper boundthem by some constant. This means all diagonal elements are bounded uniformly by a33onstant. Thus, we have tr (cid:8) Σ − Σ (cid:9) − T ≤ ( c − T, (35)where constant c is the uniform upper bound the constant and we can show c > .On the other hand, for i = 1 , , we have | τ ( t ) i | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) − α t − i − α i µ i + t (cid:88) j =2 − α t +1 − ji − α i ∆ i,j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ δ s + s ( t ) δ − δ s . Therefore, we have | τ − τ | ≤ − δ s ( δ s + s (1) δ , . . . , δ s + s ( T ) δ ) T = 21 − δ s ( δ s + δ s T ) , where ∈ R T is the vector of ones, s T = ( s (1) , . . . , s ( T )) T and the inequality is pointwise.Denote (cid:101) Σ − =  α + 1 | α || α | α + 1 | α || α | α + 1 | α | . . . . . . . . . | α | α + 1 | α || α |  , we can get (cid:12)(cid:12) ( τ − τ ) T Σ − ( τ − τ ) (cid:12)(cid:12) ≤ | τ − τ | T (cid:101) Σ − | τ − τ | = (cid:18) − δ s (cid:19) (cid:16) δ s T (cid:101) Σ − + 2 δ s δ T (cid:101) Σ − s T + δ s T T (cid:101) Σ − s T (cid:17) . We can upper bound the last three terms above as follows (notice that s (1) = 0 ) : T (cid:101) Σ − ≤ (1 + | α | ) T ; T (cid:101) Σ − s T ≤ (1 + | α | ) ( s (2) + · · · + s ( T )); s T T (cid:101) Σ − s T = T − (cid:88) i =2 (cid:0) | α | ( s ( i ) + s ( i + 2)) + ( α + 1) s ( i + 1) (cid:1) s ( i + 1)+ ( α + 1) s (2) + | α | ( s (2) s (3) + s ( T − s ( T )) + s ( T ) . By (34), (35) and last four inequalities, we prove Lemma 6.

D Additional experiments

D.1 Numerical simulation

We set α = 0 . , σ = 0 . , δ = 0 . and choose s ∈ { , , } . For each s , we plot (cid:98) α and δ selected by Ljung-Box test with respect to time T . The result is in Figure 10.34

500 1000 1500 2000 2500 3000 3500 40000.050.10.150.20.25 = 0.1, s = 24.864, = 0.1, s = 10000 500 1000 1500 2000 2500 3000 3500 4000-12-10-8-6-40 500 1000 1500 2000 2500 3000 3500 4000-2-1012 = 0.1, s = 4.824, = 0.1, s = 2000 500 1000 1500 2000 2500 3000 3500 4000-25-20-15-10-500 500 1000 1500 2000 2500 3000 3500 4000-2-1012 = 0.1, s = 0.60474, = 0.1, s = 200 500 1000 1500 2000 2500 3000 3500 4000-20-15-10-500 500 1000 1500 2000 2500 3000 3500 4000-2-1.5-1-0.500.51 s = 200 , = 0 . , || || = 4 . AAACJXicbZDLSgMxFIYzXmu9VV26CRbBRRkytWAXCgVduKxgL9AZhkwmbUMzF5KMUGb6Mm58FTcuLCK48lXMtF1o64GQn++cQ/L/XsyZVAh9GWvrG5tb24Wd4u7e/sFh6ei4LaNEENoiEY9E18OSchbSlmKK024sKA48Tjve6Dbvd56okCwKH9U4pk6AByHrM4KVRm7p2vYi7stxoK9UwhtYRagCbWj7lCvsIk2QaVVsmGX2XY6yzLU0rJn16sQtlZGJZgVXhbUQZbCoplua2n5EkoCGinAsZc9CsXJSLBQjnE6KdiJpjMkID2hPyxAHVDrpzOUEnmviw34k9AkVnNHfGykOZG5ETwZYDeVyL4f/9XqJ6tedlIVxomhI5g/1Ew5VBPPIoM8EJYqPtcBEMP1XSIZYYKJ0sEUdgrVseVW0q6Z1aVYfauUGWsRRAKfgDFwAC1yBBrgHTdACBDyDV/AOpsaL8WZ8GJ/z0TVjsXMC/pTx/QOVq6GC s = 1000 , = 0 . , || || = 24 . AAACJnicbZDLSgMxFIYzXmu9VV26CRbBRRkytaAuCgVduKxgL9AZhkwmbUMzF5KMUGb6NG58FTcuKiLufBQzbRfaeiDk5zvnkPy/F3MmFUJfxtr6xubWdmGnuLu3f3BYOjpuyygRhLZIxCPR9bCknIW0pZjitBsLigOP0443us37nScqJIvCRzWOqRPgQcj6jGClkVuq217EfTkO9JVKWIcWQqgCbWj7lCvsIo2QaVVsmGX2XY6yzLU0rNbMm4lbKiMTzQquCmshymBRTbc0tf2IJAENFeFYyp6FYuWkWChGOJ0U7UTSGJMRHtCeliEOqHTSmc0JPNfEh/1I6BMqOKO/N1IcyNyJngywGsrlXg7/6/US1b92UhbGiaIhmT/UTzhUEcwzgz4TlCg+1gITwfRfIRligYnSyRZ1CNay5VXRrprWpVl9qJUbaBFHAZyCM3ABLHAFGuAeNEELEPAMXsEUvBsvxpvxYXzOR9eMxc4J+FPG9w8SBKG8 s = 20 , = 0 . , || || = 0 . AAACJXicbZDLSgMxFIYzXmu9VV26CRbBRSmZel0oFHThsoK9QGcYMpm0Dc1cSDJCmfZl3PgqblxYRHDlq5iZzkJbD4T8fOcckv93I86kQujLWFpeWV1bL2wUN7e2d3ZLe/stGcaC0CYJeSg6LpaUs4A2FVOcdiJBse9y2naHt2m//USFZGHwqEYRtX3cD1iPEaw0ckrXlhtyT458fSUS3sAaqkALWh7lCjtIA1Q1KxYcj627FI3HjpnBC3Q+cUplVEVZwUVh5qIM8mo4panlhST2aaAIx1J2TRQpO8FCMcLppGjFkkaYDHGfdrUMsE+lnWQuJ/BYEw/2QqFPoGBGf28k2JepET3pYzWQ870U/tfrxqp3ZScsiGJFAzJ7qBdzqEKYRgY9JihRfKQFJoLpv0IywAITpYMt6hDMecuLolWrmqfV2sNZuY7yOArgEByBE2CCS1AH96ABmoCAZ/AK3sHUeDHejA/jcza6ZOQ7B+BPGd8/lQahfw== T T TT T TT T T ˆ ↵ AAACA3icbVDLSsNAFJ34rPUVdaebwSK4KkkVdFlw47KCfUATws1k0gydPJiZCCUU3Pgrblwo4tafcOffOGmz0NYDwxzOuZd77/EzzqSyrG9jZXVtfWOztlXf3tnd2zcPDnsyzQWhXZLyVAx8kJSzhHYVU5wOMkEh9jnt++Ob0u8/UCFZmtyrSUbdGEYJCxkBpSXPPHb8lAdyEuuvcCJQ2AGeReDZU89sWE1rBrxM7Io0UIWOZ345QUrymCaKcJByaFuZcgsQihFOp3UnlzQDMoYRHWqaQEylW8xumOIzrQQ4TIV+icIz9XdHAbEs19SVMahILnql+J83zFV47RYsyXJFEzIfFOYcqxSXgeCACUoUn2gCRDC9KyYRCCBKx1bXIdiLJy+TXqtpXzRbd5eNtlXFUUMn6BSdIxtdoTa6RR3URQQ9omf0it6MJ+PFeDc+5qUrRtVzhP7A+PwBP/mX1w== l og | ˆ ↵ ↵ | AAACFHicbVDLSsNAFJ34rPUVdelmsAiCWJIq6LLgxmUF+4CmhJvJtB06yYSZiVDSfoQbf8WNC0XcunDn3zhpK2jrgWEO59zLvfcECWdKO86XtbS8srq2Xtgobm5t7+zae/sNJVJJaJ0ILmQrAEU5i2ldM81pK5EUooDTZjC4zv3mPZWKifhODxPaiaAXsy4joI3k26deIHiohpH5Mo+LHh55fdDYA570wXfPfshojH275JSdCfAicWekhGao+fanFwqSRjTWhINSbddJdCcDqRnhdFz0UkUTIAPo0bahMURUdbLJUWN8bJQQd4U0L9Z4ov7uyCBS+d6mMgLdV/NeLv7ntVPdvepkLE5STWMyHdRNOdYC5wnhkElKNB8aAkQysysmfZBAtMmxaEJw509eJI1K2T0vV24vSlVnFkcBHaIjdIJcdImq6AbVUB0R9ICe0At6tR6tZ+vNep+WLlmzngP0B9bHN593nnY= l og AAACAnicbVDLSsNAFJ34rPUVdSVuBovgqiRV0GXBjcsK9gFNKJPJpB06yYSZG6GE4sZfceNCEbd+hTv/xkmbhbYeGOZwzr3ce0+QCq7Bcb6tldW19Y3NylZ1e2d3b98+OOxomSnK2lQKqXoB0UzwhLWBg2C9VDESB4J1g/FN4XcfmNJcJvcwSZkfk2HCI04JGGlgH3uBFKGexObLPSGH2AuZADLFA7vm1J0Z8DJxS1JDJVoD+8sLJc1ilgAVROu+66Tg50QBp4JNq16mWUromAxZ39CExEz7+eyEKT4zSogjqcxLAM/U3x05iXWxpamMCYz0oleI/3n9DKJrP+dJmgFL6HxQlAkMEhd54JArRkFMDCFUcbMrpiOiCAWTWtWE4C6evEw6jbp7UW/cXdaaThlHBZ2gU3SOXHSFmugWtVAbUfSIntErerOerBfr3fqYl65YZc8R+gPr8wd1Rpdm Figure 10: Algorithmic behavior with respect to T . The red dashed line is the ground truth α = 0 . . From the second row, we can observe that the estimation error converges to alarger value with increasing s .We have two main observations from Figure 10: (i) the estimate (cid:98) α will converge to an ε -optimal solution, but cannot converge to the ground truth and (ii) for larger (cid:107) ∆ (cid:107) , which isequivalent to larger s and δ , the estimation error after convergence will grow larger. Apartfrom this, we can see the behavior of the estimation error are similar to that of the tuningparameter δ selected by Ljung-Box test — they converge at the same time. This validates ourmain theorem on the upper bound of the estimation error (8). We also try more experimentalsettings ( s ∈ { , , } and δ ∈ { . , . } ). We obtain similar results in Figure 11. Validation for a more general ar ( p ) case. Here, we take ar (2) as an example. Weﬁx α = 0 . , σ = 0 . and δ = 0 . . We choose s ∈ { , , , } . Similarly, thedynamic background generating mechanism, estimation and parameter tuning procedure isthe same as what we did in last section. We also apply Golden-section search (tolerance ε is set to be 0.04) here. For each s , we plot the same algorithmic with respect to time T inFigure 12.We can see the results are similar to that of Figures 10 and 11. Similarly to the analysisabove, we validate our theoretical ﬁndings for ar (2) case. Comparison with polynomial variant.

Apart from piecewise constant function class,polynomial is another highly expressive function class. Xu [2008] proposed to use n thorder polynomials ( n-poly ) to approximate the unstructured dynamics in non-stationaryautoregressive time series. Then the autoregressive coeﬃcients and polynomial coeﬃcientsare estimated via ordinary least square (OLS). However, he did not give instructions onhow to choose n in practice. Here, we choose n ∈ { , , } and compare n-poly with ourproposed methods under the setting: α = 0 . , σ = 0 . , s = 2000 , δ = 0 . , (cid:107) ∆ (cid:107) = 24 . .The results are plotted in Figure 13.From the ﬁgure above, we can see that all three polynomial methods considered heredo not yield accurate estimate for ar (1) series with highly unstructured dynamics. This is35

500 1000 1500 2000 2500 3000 3500 4000-0.2-0.100.10.20.3 = 0.1, s = 74.366, = 0.1, s = 30000 500 1000 1500 2000 2500 3000 3500 4000-14-12-10-8-6-4-20 500 1000 1500 2000 2500 3000 3500 4000-2-1012 = 0.1, s = 25.0822, = 0.05, s = 20000 500 1000 1500 2000 2500 3000 3500 4000-12-10-8-6-40 500 1000 1500 2000 2500 3000 3500 4000-2-1.5-1-0.500.51 = 0.1, s = 50.3469, = 0.1, s = 20000 500 1000 1500 2000 2500 3000 3500 4000-9-8-7-6-5-4-30 500 1000 1500 2000 2500 3000 3500 4000-2-1012 = 0.1, s = 11.9259, = 0.05, s = 10000 500 1000 1500 2000 2500 3000 3500 4000-25-20-15-10-500 500 1000 1500 2000 2500 3000 3500 4000-2-1.5-1-0.500.51 l og AAACAnicbVDLSsNAFL2pr1pfUVfiZrAIXZWkCrosuHFZwT6gCWUymbZDJ5kwMxFKKG78FTcuFHHrV7jzb5y0WWjrgWEO59zLvfcECWdKO863VVpb39jcKm9Xdnb39g/sw6OOEqkktE0EF7IXYEU5i2lbM81pL5EURwGn3WByk/vdByoVE/G9nibUj/AoZkNGsDbSwD7xAsFDNY3Ml3lcjJAXUq7xDA3sqlN35kCrxC1IFQq0BvaXFwqSRjTWhGOl+q6TaD/DUjPC6azipYommEzwiPYNjXFElZ/NT5ihc6OEaCikebFGc/V3R4YjlW9pKiOsx2rZy8X/vH6qh9d+xuIk1TQmi0HDlCMtUJ4HCpmkRPOpIZhIZnZFZIwlJtqkVjEhuMsnr5JOo+5e1Bt3l9VmrYijDKdwBjVw4QqacAstaAOBR3iGV3iznqwX6936WJSWrKLnGP7A+vwBct6XXg== l og | ˆ ↵ ↵ | AAACFHicbVDLSgMxFM3UV62vUZdugkUoiGWmCrosuHFZwT6gMwx30rQNzUyGJCOUaT/Cjb/ixoUibl24829MH4K2Hgg5nHMv994TJpwp7ThfVm5ldW19I79Z2Nre2d2z9w8aSqSS0DoRXMhWCIpyFtO6ZprTViIpRCGnzXBwPfGb91QqJuI7PUyoH0EvZl1GQBspsE+9UPCOGkbmyzwuenjk9UFjD3jSh8A9+yGjMQ7solN2psDLxJ2TIpqjFtifXkeQNKKxJhyUartOov0MpGaE03HBSxVNgAygR9uGxhBR5WfTo8b4xCgd3BXSvFjjqfq7I4NITfY2lRHovlr0JuJ/XjvV3Ss/Y3GSahqT2aBuyrEWeJIQ7jBJieZDQ4BIZnbFpA8SiDY5FkwI7uLJy6RRKbvn5crtRbFamseRR0foGJWQiy5RFd2gGqojgh7QE3pBr9aj9Wy9We+z0pw17zlEf2B9fAOdD55u ˆ ↵ AAACA3icbVDLSsNAFJ3UV62vqDvdDBahq5JUQZcFNy4r2Ac0IdxMJu3QyYOZiVBCwY2/4saFIm79CXf+jZM2C209MMzhnHu59x4/5Uwqy/o2KmvrG5tb1e3azu7e/oF5eNSTSSYI7ZKEJ2Lgg6ScxbSrmOJ0kAoKkc9p35/cFH7/gQrJkvheTVPqRjCKWcgIKC155onjJzyQ00h/uTMGhR3g6Rg8e+aZdatpzYFXiV2SOirR8cwvJ0hIFtFYEQ5SDm0rVW4OQjHC6azmZJKmQCYwokNNY4iodPP5DTN8rpUAh4nQL1Z4rv7uyCGSxZq6MgI1lsteIf7nDTMVXrs5i9NM0ZgsBoUZxyrBRSA4YIISxaeaABFM74rJGAQQpWOr6RDs5ZNXSa/VtC+arbvLertRxlFFp+gMNZCNrlAb3aIO6iKCHtEzekVvxpPxYrwbH4vSilH2HKM/MD5/AD2Rl88= T T T TT T T TT T T T s = 1000 , = 0 . , || || = 11 . AAACJ3icbVBLSwMxGMzWV62vqkcvwSL0UJZsVdSDUtCDxwr2Ad1lyWbTNjT7IMkKZdt/48W/4kVQET36T8y2PWjrByHDzHwkM17MmVQIfRm5peWV1bX8emFjc2t7p7i715RRIghtkIhHou1hSTkLaUMxxWk7FhQHHqctb3Cd6a0HKiSLwns1jKkT4F7IuoxgpSm3eGV7EfflMNBXKuEltBBCFWhD26dcYRdpCpnotGLD0ci+ybjRyLUyo2VejN1iSauTgYvAmoESmE3dLb7afkSSgIaKcCxlx0KxclIsFCOcjgt2ImmMyQD3aEfDEAdUOukk5xgeacaH3UjoEyo4YX9vpDiQWRTtDLDqy3ktI//TOonqnjspC+NE0ZBMH+omHKoIZqVBnwlKFB9qgIlg+q+Q9LHAROlqC7oEaz7yImhWTevYrN6dlGrlWR15cAAOQRlY4AzUwC2ogwYg4BE8gzfwbjwZL8aH8Tm15ozZzj74M8b3D439oe4= s = 2000 , = 0 . , || || = 25 . AAACJ3icbVBLSwMxGMzWV62vqkcvwSL0UEq2WvSiFPTgsYJ9QHdZstm0Dc0+SLJC2e2/8eJf8SKoiB79J2bbHrT1g5BhZj6SGTfiTCqEvozcyura+kZ+s7C1vbO7V9w/aMswFoS2SMhD0XWxpJwFtKWY4rQbCYp9l9OOO7rO9M4DFZKFwb0aR9T28SBgfUaw0pRTvLLckHty7OsrkfAS1hBCFWhBy6NcYQdpClVRvWLBNLVuMi5NHTMz1qvmxCmWtDoduAzMOSiB+TSd4qvlhST2aaAIx1L2TBQpO8FCMcLppGDFkkaYjPCA9jQMsE+lnUxzTuCJZjzYD4U+gYJT9vdGgn2ZRdFOH6uhXNQy8j+tF6v+hZ2wIIoVDcjsoX7MoQphVhr0mKBE8bEGmAim/wrJEAtMlK62oEswFyMvg3atap5Wa3dnpUZ5XkceHIFjUAYmOAcNcAuaoAUIeATP4A28G0/Gi/FhfM6sOWO+cwj+jPH9A4sqoew= s = 2000 , = 0 . , || || = 50 . AAACJnicbZDLSgMxFIYz9VbrrerSTbAIXZQh0yq6KRR04bKCvUCnDJlM2oZmLiQZoUz7NG58FTcuKiLufBQz7Sy09UDIz3fOIfl/N+JMKoS+jNzG5tb2Tn63sLd/cHhUPD5pyzAWhLZIyEPRdbGknAW0pZjitBsJin2X0447vk37nScqJAuDRzWJaN/Hw4ANGMFKI6dYt92Qe3Li6yuRsA6rCKEKtKHtUa6wgzRCplWx4XRq36VoOnUsDa+QWZs5xRIy0aLgurAyUQJZNZ3i3PZCEvs0UIRjKXsWilQ/wUIxwumsYMeSRpiM8ZD2tAywT2U/WdicwQtNPDgIhT6Bggv6eyPBvkyd6Ekfq5Fc7aXwv14vVoObfsKCKFY0IMuHBjGHKoRpZtBjghLFJ1pgIpj+KyQjLDBROtmCDsFatbwu2lXTqpnVh8tSo5zFkQdn4ByUgQWuQQPcgyZoAQKewSuYg3fjxXgzPozP5WjOyHZOwZ8yvn8ABqqhrg== s = 3000 , = 0 . , || || = 74 . AAACJnicbZDLSgMxFIYzXmu9VV26CRahizJk2kLdFAq6cFnBXqBThkwmbUMzF5KMUKZ9Gje+ihsXFRF3PoqZdhbaeiDk5zvnkPy/G3EmFUJfxtb2zu7efu4gf3h0fHJaODvvyDAWhLZJyEPRc7GknAW0rZjitBcJin2X0647uU373ScqJAuDRzWN6MDHo4ANGcFKI6fQsN2Qe3Lq6yuRsAGrCKEytKHtUa6wgzRCplW24Wxm36VoNnMsDes1szZ3CkVkomXBTWFlogiyajmFhe2FJPZpoAjHUvYtFKlBgoVihNN53o4ljTCZ4BHtaxlgn8pBsrQ5h9eaeHAYCn0CBZf090aCfZk60ZM+VmO53kvhf71+rIY3g4QFUaxoQFYPDWMOVQjTzKDHBCWKT7XARDD9V0jGWGCidLJ5HYK1bnlTdCqmVTUrD7Vis5TFkQOX4AqUgAXqoAnuQQu0AQHP4BUswLvxYrwZH8bnanTLyHYuwJ8yvn8AEwuhtg== Figure 11: In each column: the experimental settings, rate for sparse changes, one-stepchanges’ magnitude and their total variation are listed on the top and the rest of theexperimental settings are the same with Figure 10; from top to the bottom, we plot (cid:98) α ,logarithm of (cid:96) estimation error log | (cid:98) α − α | and logarithm of tuning parameter δ selected byLjung-Box test with respect to T . In the ﬁst column, the red dashed line is the ground truth α = 0 . . We can see that with increasing s and δ , the estimation accuracy becomes lower. = 0.1, s = 75.4956, = 0.1, s = 30000 500 1000 1500 2000 2500 3000 3500 4000-7-6-5-4-30 500 1000 1500 2000 2500 3000 3500 4000-2-1012 = 0.1, s = 24.7422, = 0.1, s = 10000 500 1000 1500 2000 2500 3000 3500 4000-12-10-8-6-4-20 500 1000 1500 2000 2500 3000 3500 4000-2-1012 = 0.1, s = 50.2594, = 0.1, s = 20000 500 1000 1500 2000 2500 3000 3500 4000-6-5-4-3-20 500 1000 1500 2000 2500 3000 3500 4000-2-1012 = 0.1, s = 4.9612, = 0.1, s = 2000 500 1000 1500 2000 2500 3000 3500 4000-12-10-8-6-4-20 500 1000 1500 2000 2500 3000 3500 4000-2-1012 T T T TT T T TT T T T s = 200 , = 0 . , || || = 4 . AAACJXicbZDLSgMxFIYzXmu9jbp0EyxCF2XI1OIFFAq6cFnBXqBThkwmbUMzF5KMUKZ9GTe+ihsXFhFc+Spm2i609UDIz3fOIfl/L+ZMKoS+jJXVtfWNzdxWfntnd2/fPDhsyCgRhNZJxCPR8rCknIW0rpjitBULigOP06Y3uM36zScqJIvCRzWMaSfAvZB1GcFKI9e8dryI+3IY6CuV8AaWESpBBzo+5Qq7SBNk2SUHjkbOXYZGI9fWsGJdnY9ds4AsNC24LOy5KIB51Vxz4vgRSQIaKsKxlG0bxaqTYqEY4XScdxJJY0wGuEfbWoY4oLKTTl2O4akmPuxGQp9QwSn9vZHiQGZG9GSAVV8u9jL4X6+dqO5lJ2VhnCgaktlD3YRDFcEsMugzQYniQy0wEUz/FZI+FpgoHWxeh2AvWl4WjbJln1nlh0qhWpzHkQPH4AQUgQ0uQBXcgxqoAwKewSt4BxPjxXgzPozP2eiKMd85An/K+P4Bmt2hfw== s = 1000 , = 0 . , || || = 24 . AAACJnicbZDLSgMxFIYz9VbrrerSTbAIXZQhUwt1UyjowmUFe4FOGTKZtA3NXEgyQpn2adz4Km5cVETc+Shm2llo64GQn++cQ/L/bsSZVAh9Gbmt7Z3dvfx+4eDw6PikeHrWkWEsCG2TkIei52JJOQtoWzHFaS8SFPsup113cpv2u09USBYGj2oa0YGPRwEbMoKVRk6xYbsh9+TU11ciYQNaCKEKtKHtUa6wgzRCplWx4Wxm36VoNnMsDas1sz53iiVkomXBTWFlogSyajnFhe2FJPZpoAjHUvYtFKlBgoVihNN5wY4ljTCZ4BHtaxlgn8pBsrQ5h1eaeHAYCn0CBZf090aCfZk60ZM+VmO53kvhf71+rIY3g4QFUaxoQFYPDWMOVQjTzKDHBCWKT7XARDD9V0jGWGCidLIFHYK1bnlTdKqmdW1WH2qlZjmLIw8uwCUoAwvUQRPcgxZoAwKewStYgHfjxXgzPozP1WjOyHbOwZ8yvn8ADJKhsg== s = 2000 , = 0 . , || || = 50 . AAACJnicbZDLSgMxFIYz9VbrrerSTbAIXZQh0yq6KRR04bKCvUCnDJlM2oZmLiQZoUz7NG58FTcuKiLufBQz7Sy09UDIz3fOIfl/N+JMKoS+jNzG5tb2Tn63sLd/cHhUPD5pyzAWhLZIyEPRdbGknAW0pZjitBsJin2X0447vk37nScqJAuDRzWJaN/Hw4ANGMFKI6dYt92Qe3Li6yuRsA6rCKEKtKHtUa6wgzRCplWx4XRq36VoOnUsDa+QWZs5xRIy0aLgurAyUQJZNZ3i3PZCEvs0UIRjKXsWilQ/wUIxwumsYMeSRpiM8ZD2tAywT2U/WdicwQtNPDgIhT6Bggv6eyPBvkyd6Ekfq5Fc7aXwv14vVoObfsKCKFY0IMuHBjGHKoRpZtBjghLFJ1pgIpj+KyQjLDBROtmCDsFatbwu2lXTqpnVh8tSo5zFkQdn4ByUgQWuQQPcgyZoAQKewSuYg3fjxXgzPozP5WjOyHZOwZ8yvn8ABqqhrg== s = 3000 , = 0 . , || || = 75 . AAACJnicbZDLagIxFIYz9mbtzbbLbkKl4EKGjFbsRhDaRZcW6gUcGTKZqMHMhSRTkNGn6aav0k0XllK666M0oy5a7YGQn++cQ/L/bsSZVAh9GZmt7Z3dvex+7uDw6Pgkf3rWlmEsCG2RkIei62JJOQtoSzHFaTcSFPsupx13fJv2O09USBYGj2oS0b6PhwEbMIKVRk6+brsh9+TE11ciYR1WEEIlaEPbo1xhB2mETKtkw+nUvkvRdOpYGtaqZnXm5AvIRIuCm8JaiQJYVdPJz20vJLFPA0U4lrJnoUj1EywUI5zOcnYsaYTJGA9pT8sA+1T2k4XNGbzSxIODUOgTKLigvzcS7MvUiZ70sRrJ9V4K/+v1YjW46ScsiGJFA7J8aBBzqEKYZgY9JihRfKIFJoLpv0IywgITpZPN6RCsdcubol02rYpZfrguNIqrOLLgAlyCIrBADTTAPWiCFiDgGbyCOXg3Xow348P4XI5mjNXOOfhTxvcPFhehuA== F i tt e d A R c o e ff i c i e n t L oga r i t h m o f e s t i m a t i o n e rr o r L oga r i t h m o f s e l ec t e d b y ` AAAB/nicbVBPS8MwHP11/pvzX1U8eQkOYafRTkGPAy8eJ7hNWEtJ02wLS9OSpMIoA7+KFw+KePVzePPbmG496OaDkMd7vx95eWHKmdKO821V1tY3Nreq27Wd3b39A/vwqKeSTBLaJQlP5EOIFeVM0K5mmtOHVFIch5z2w8lN4fcfqVQsEfd6mlI/xiPBhoxgbaTAPkFemPBITWNz5R7lPGjNUGDXnaYzB1olbknqUKIT2F9elJAspkITjpUauE6q/RxLzQins5qXKZpiMsEjOjBU4JgqP5/Hn6Fzo0RomEhzhEZz9fdGjmNVBDSTMdZjtewV4n/eINPDaz9nIs00FWTx0DDjSCeo6AJFTFKi+dQQTCQzWREZY4mJNo3VTAnu8pdXSa/VdC+arbvLertR1lGFUziDBrhwBW24hQ50gUAOz/AKb9aT9WK9Wx+L0YpV7hzDH1ifP+ZflVw= AAAB/nicbVBPS8MwHP11/pvzX1U8eQkOYafRTkGPAy8eJ7hNWMtI03QLS9OSpMIoA7+KFw+KePVzePPbmG496OaDkMd7vx95eUHKmdKO821V1tY3Nreq27Wd3b39A/vwqKeSTBLaJQlP5EOAFeVM0K5mmtOHVFIcB5z2g8lN4fcfqVQsEfd6mlI/xiPBIkawNtLQPkFekPBQTWNz5V5IucYzNLTrTtOZA60StyR1KNEZ2l9emJAspkITjpUauE6q/RxLzQins5qXKZpiMsEjOjBU4JgqP5/Hn6Fzo4QoSqQ5QqO5+nsjx7EqAprJGOuxWvYK8T9vkOno2s+ZSDNNBVk8FGUc6QQVXaCQSUo0nxqCiWQmKyJjLDHRprGaKcFd/vIq6bWa7kWzdXdZbzfKOqpwCmfQABeuoA230IEuEMjhGV7hzXqyXqx362MxWrHKnWP4A+vzB0HSlZg= LB AAAB8nicbVA9SwNBEN2LXzF+RS1tFoOQKtxFQcugjYVFBPMBlyPsbfaSJXu7x+6cGI78DBsLRWz9NXb+GzfJFZr4YODx3gwz88JEcAOu++0U1tY3NreK26Wd3b39g/LhUduoVFPWokoo3Q2JYYJL1gIOgnUTzUgcCtYJxzczv/PItOFKPsAkYUFMhpJHnBKwkt8D9gQA2d31tF+uuDV3DrxKvJxUUI5mv/zVGyiaxkwCFcQY33MTCDKigVPBpqVealhC6JgMmW+pJDEzQTY/eYrPrDLAkdK2JOC5+nsiI7Exkzi0nTGBkVn2ZuJ/np9CdBVkXCYpMEkXi6JUYFB49j8ecM0oiIklhGpub8V0RDShYFMq2RC85ZdXSbte885r9fuLSqOax1FEJ+gUVZGHLlED3aImaiGKFHpGr+jNAefFeXc+Fq0FJ585Rn/gfP4Am36RYQ== Figure 12: In each column: the experimental settings, rate for sparse changes, one-stepchanges’ magnitude and their total variation, are listed on the top; from top to the bottom,we plot (cid:98) α i ( i = 1 , , logarithm of (cid:96) estimation error log (cid:112) ( (cid:98) α − α ) + ( (cid:98) α − α ) andlogarithm of tuning parameter δ selected by Ljung-Box test with respect to T . In the ﬁstcolumn, the blue and red dashed line correspond to the ground truth α = 0 . and α = 0 . ,respectively. We can see that with increasing s and δ , the estimation accuracy becomeslower, which is the same with ar (1) case. 36

500 1000 1500 2000-10123 = 0.085667, = 24.8777, p = 0.18024, = 0.000996890 500 1000 1500 2000-0.500.511.5 0 500 1000 1500 2000-10123 = 0.059923, = 24.8777, p = 0.26722, = 8.1306e-060 500 1000 1500 2000-0.500.511.5 0 500 1000 1500 2000-10123 = 8.2401e-07, = 24.8777, p = 0.26722, = 8.1306e-060 500 1000 1500 2000-0.500.511.5 0 500 1000 1500 2000-10123 = 7.1162e-25, = 24.8777, p = 0.26722, = 8.1306e-060 500 1000 1500 2000-0.500.511.52 0 500 1000 1500 2000-10123 = 6.652e-57, = 24.8777, p = 0.26722, = 8.1306e-060 500 1000 1500 2000-0.500.511.52 ` variant AAACDnicbVC7SgNBFJ31GeMramkzGAJWYTcKWgZsLCOYB2RDmJ3cTYbMzi4zd4NhyRfY+Cs2ForYWtv5N04ehSYeGOZwzr3ce0+QSGHQdb+dtfWNza3t3E5+d2//4LBwdNwwcao51HksY90KmAEpFNRRoIRWooFFgYRmMLyZ+s0RaCNidY/jBDoR6ysRCs7QSt1CyQ9i2TPjyH6ZD1J2KxPqIzxgEGZ0xLRgCifdQtEtuzPQVeItSJEsUOsWvvxezNMIFHLJjGl7boKdjGkUXMIk76cGEsaHrA9tSxWLwHSy2TkTWrJKj4axtk8hnam/OzIWmenGtjJiODDL3lT8z2unGF53MqGSFEHx+aAwlRRjOs2G9oQGjnJsCeNa2F0pHzDNONoE8zYEb/nkVdKolL2LcuXuslitLOLIkVNyRs6JR65IldySGqkTTh7JM3klb86T8+K8Ox/z0jVn0XNC/sD5/AFTwJzc TV-LSE AAACAXicbVDLSsNAFJ34rPUVdSO4CRbBjSWpgi4LIrhwUbEvaEOZTCft0MkkzNyIJcSNv+LGhSJu/Qt3/o3TNAttPXC5h3PuZeYeL+JMgW1/GwuLS8srq4W14vrG5ta2ubPbVGEsCW2QkIey7WFFORO0AQw4bUeS4sDjtOWNLid+655KxUJRh3FE3QAPBPMZwaClnrnfBfoAnp9kHSCpN09u7q7StGeW7LKdwZonTk5KKEetZ351+yGJAyqAcKxUx7EjcBMsgRFO02I3VjTCZIQHtKOpwAFVbpJdkFpHWulbfih1CbAy9fdGggOlxoGnJwMMQzXrTcT/vE4M/oWbMBHFQAWZPuTH3ILQmsRh9ZmkBPhYE0wk03+1yBBLTECHVtQhOLMnz5Nmpeycliu3Z6VqJY+jgA7QITpGDjpHVXSNaqiBCHpEz+gVvRlPxovxbnxMRxeMfGcP/YHx+QP9RJcs AAACAXicbVDLSsNAFJ3UV62vqBvBTbAIbixJK+iy4MZlBfuANpTJdNIOnUzCzI0YQtz4K25cKOLWv3Dn3zhNs9DWA5d7OOdeZu7xIs4U2Pa3UVpZXVvfKG9WtrZ3dvfM/YOOCmNJaJuEPJQ9DyvKmaBtYMBpL5IUBx6nXW96PfO791QqFoo7SCLqBngsmM8IBi0NzaMB0Afw/DTvAGnjPAp5kmVDs2rX7BzWMnEKUkUFWkPzazAKSRxQAYRjpfqOHYGbYgmMcJpVBrGiESZTPKZ9TQUOqHLT/ILMOtXKyPJDqUuAlau/N1IcKJUEnp4MMEzUojcT//P6MfhXbspEFAMVZP6QH3MLQmsWhzVikhLgiSaYSKb/apEJlpiADq2iQ3AWT14mnXrNadTqtxfVZr2Io4yO0Qk6Qw66RE10g1qojQh6RM/oFb0ZT8aL8W58zEdLRrFziP7A+PwBnSiXlQ== AAACAXicbVDLSsNAFJ3UV62vqBvBTbAIbixJVXRZcOOygn1AG8pkOmmHTiZh5kYMIW78FTcuFHHrX7jzb5ymWWjrgcs9nHMvM/d4EWcKbPvbKC0tr6yuldcrG5tb2zvm7l5bhbEktEVCHsquhxXlTNAWMOC0G0mKA4/Tjje5nvqdeyoVC8UdJBF1AzwSzGcEg5YG5kEf6AN4fpp3gPTiNAp5kmUDs2rX7BzWInEKUkUFmgPzqz8MSRxQAYRjpXqOHYGbYgmMcJpV+rGiESYTPKI9TQUOqHLT/ILMOtbK0PJDqUuAlau/N1IcKJUEnp4MMIzVvDcV//N6MfhXbspEFAMVZPaQH3MLQmsahzVkkhLgiSaYSKb/apExlpiADq2iQ3DmT14k7XrNOavVb8+rjXoRRxkdoiN0ghx0iRroBjVRCxH0iJ7RK3oznowX4934mI2WjGJnH/2B8fkDoD6Xlw== AAACAnicbVDLSsNAFJ34rPUVdSVugkVwY0mqoMuCG5cV7APaUCbTSTt0MhNmbsQQiht/xY0LRdz6Fe78G6dpFtp64HIP59zLzD1BzJkG1/22lpZXVtfWSxvlza3tnV17b7+lZaIIbRLJpeoEWFPOBG0CA047saI4CjhtB+Prqd++p0ozKe4gjakf4aFgISMYjNS3D3tAHyAIs7wDZJ57FkueTiZ9u+JW3RzOIvEKUkEFGn37qzeQJImoAMKx1l3PjcHPsAJGOJ2Ue4mmMSZjPKRdQwWOqPaz/ISJc2KUgRNKZUqAk6u/NzIcaZ1GgZmMMIz0vDcV//O6CYRXfsZEnAAVZPZQmHAHpDPNwxkwRQnw1BBMFDN/dcgIK0zApFY2IXjzJy+SVq3qnVdrtxeVeq2Io4SO0DE6RR66RHV0gxqoiQh6RM/oFb1ZT9aL9W59zEaXrGLnAP2B9fkDEHuXzQ== = 1 . , ˆ ↵ = 0 . AAACG3icbVDLSgNBEJyN7/ha9ehlMAg5SNiNSuJBELx4VDBRyC5L7+zEDJl9MNMrhCX/4cVf8eJBEU+CB//GyeOgxoJhiqpuurvCTAqNjvNllebmFxaXllfKq2vrG5v21nZbp7livMVSmarbEDSXIuEtFCj5baY4xKHkN2H/fOTf3HOlRZpc4yDjfgx3iegKBmikwK57YSojPYjNV3gRlwj0lLq1k5MD6vUAqQcy60HgGtWpOc3jxjCwK4aNQWeJOyUVMsVlYH94UcrymCfIJGjdcZ0M/QIUCib5sOzlmmfA+nDHO4YmEHPtF+PbhnTfKBHtpsq8BOlY/dlRQKxH65vKGLCn/3oj8T+vk2O36RciyXLkCZsM6uaSYkpHQdFIKM5QDgwBpoTZlbIeKGBo4iybENy/J8+Sdr3mHtbqV0eVs+o0jmWyS/ZIlbikQc7IBbkkLcLIA3kiL+TVerSerTfrfVJasqY9O+QXrM9vg8iebg== = 0 . , ˆ ↵ = 0 . AAACHXicbVDLSgMxFM34rPU16tJNsAgupMy0PhdCwY3LCrYKnaHcyaQ2NPMguSOUoT/ixl9x40IRF27EvzHTduHrQMjhnHtJzglSKTQ6zqc1Mzs3v7BYWiovr6yurdsbm22dZIrxFktkom4C0FyKmLdQoOQ3qeIQBZJfB4Pzwr++40qLJL7CYcr9CG5j0RMM0Ehd+8ALEhnqYWSu3Au5RKBn1Kk67lF9n3p9QOqBTPvQdSf64enpqGtXDBuD/iXulFTIFM2u/e6FCcsiHiOToHXHdVL0c1AomOSjspdpngIbwC3vGBpDxLWfj9ON6K5RQtpLlDkx0rH6fSOHSBcBzGQE2Ne/vUL8z+tk2DvxcxGnGfKYTR7qZZJiQouqaCgUZyiHhgBTwvyVsj4oYGgKLZsS3N+R/5J2rerWq7XLg0pjb1pHiWyTHbJHXHJMGuSCNEmLMHJPHskzebEerCfr1XqbjM5Y050t8gPWxxdxFZ7c n = 3 , ˆ ↵ = 8 . ⇥ AAACHnicbVBNSwMxEM36WetX1aOXYBE8aNltFXspFLx4rGBV6NYym6Y2mM0uyaxQlv4SL/4VLx4UETzpvzHb9qCtAyGP92aYeS+IpTDout/O3PzC4tJybiW/ura+sVnY2r4yUaIZb7JIRvomAMOlULyJAiW/iTWHMJD8Org/y/TrB66NiNQlDmLeDuFOiZ5ggJbqFE78IJJdMwjtl6pa5ZD6fUDqg4z70PFojVZL5WPqowi5oZ57mx6dDoedQtEtuaOis8CbgCKZVKNT+PS7EUtCrpBJMKbluTG2U9AomOTDvJ8YHgO7hzveslCB3dZOR/aGdN8yXdqLtH0K6Yj9PZFCaDIHtjME7JtpLSP/01oJ9qrtVKg4Qa7YeFEvkRQjmmVFu0JzhnJgATAt7K2U9UEDQ5to3obgTVueBVflklcplS+Oi/WDSRw5skv2yAHxyCmpk3PSIE3CyCN5Jq/kzXlyXpx352PcOudMZnbIn3K+fgCYYaAp n = 5 , ˆ ↵ = 7 . ⇥ AAACH3icbVBNSwMxEM3Wr1q/qh69BIvgQctutdZLQfDiUcGq0K1lNk3b0Gx2SWaFsvSfePGvePGgiHjrvzH9OGjrQMjjvRlm3gtiKQy67tDJLCwuLa9kV3Nr6xubW/ntnTsTJZrxGotkpB8CMFwKxWsoUPKHWHMIA8nvg97lSL9/4tqISN1iP+aNEDpKtAUDtFQzf+YHkWyZfmi/VFXLR9TvAlIfZNyFpkertFL0StRHEXJDPfcxPS6VB4NmvuAW3XHReeBNQYFM67qZ//ZbEUtCrpBJMKbuuTE2UtAomOSDnJ8YHgPrQYfXLVRg1zXSsb8BPbBMi7YjbZ9COmZ/T6QQmpEF2xkCds2sNiL/0+oJts8bqVBxglyxyaJ2IilGdBQWbQnNGcq+BcC0sLdS1gUNDG2kORuCN2t5HtyVit5JsXRzWrg4nMaRJXtknxwSj1TIBbki16RGGHkmr+SdfDgvzpvz6XxNWjPOdGaX/Cln+AMUfKBh n = 10 , ˆ ↵ = 6 . ⇥ AAACIHicbVBNSwMxEM36WevXqkcvwSL0oGVXbeulUPDisYJVoVvLbJraYDa7JLNCWfpTvPhXvHhQRG/6a0xrD2odCHm8N8PMe2EihUHP+3BmZufmFxZzS/nlldW1dXdj88LEqWa8yWIZ66sQDJdC8SYKlPwq0RyiUPLL8PZkpF/ecW1ErM5xkPB2BDdK9AQDtFTHrQZhLLtmENkvUzXf26NBH5AGIJM+dHxao5VSpUwDFBE31Peus/1ydTjsuAWv5I2LTgN/AgpkUo2O+x50Y5ZGXCGTYEzL9xJsZ6BRMMmH+SA1PAF2Cze8ZaECu66djQ0O6a5lurQXa/sU0jH7cyKDyIw82M4IsG/+aiPyP62VYu+4nQmVpMgV+17USyXFmI7Sol2hOUM5sACYFvZWyvqggaHNNG9D8P9angYXByX/sHRwdlSoFydx5Mg22SFF4pMqqZNT0iBNwsg9eSTP5MV5cJ6cV+ftu3XGmcxskV/lfH4Bm+2gow== Figure 13: Comparison to n-poly due to Xu [2008] with n ∈ { , , } . The correspondinghyperparameters and ar (1) estimate are on the top of each column. We can see n-poly yields a very biased (cid:98) α (even though faithfully captures the dynamics).not surprising since polynomials are less expressive compared to piecewise constant function.Obviously, n-poly will perform better when the dynamics is smoother and more structured. D.2 Detailed estimation procedure in real data experiment

Here, we take subject 23 as an example to show why we choose to use logarithm transform indetail. First, we directly apply our proposed estimator on the RT sequence with hyperparam-eter selected by Ljung-Box test, as is detailed in proposed tuning procedure. Since we donot have the ground truth, we can only access the goodness-of-ﬁt by assessing how close ourresidual sequence resembles white noise. We plot the histogram as well as the QQ-plot of theﬁtted residual sequence. These two plots are shown in the ﬁrst row in Figure 14.The histogram shows that the residuals are right-skewed — in fact this is true for nearlyall subjects. Ljung–Box test is commonly used in autoregressive integrated moving average(ARIMA) modeling, which requires Gaussian random noise assumption, and clearly thisassumption breaks in this study. Therefore, the p -value of Ljung-Box test directly applied toresidual sequence may not be a reasonable metric for the goodness-of-ﬁt, which undermines thevalidity of δ selected by Ljung-Box test. Nevertheless, testing for remaining serial correlationin the residual sequence is the ultimate goal of applying Ljung-Box test. Thus, we cantransform the residuals to more closely approximate a Gaussian distribution and then applythe Ljung-Box test on the transformed residuals to check for serial correlation.For right-skewed data, the most commonly used transforms are cube root and logarithm.We apply both transforms here. The transforms are performed by ﬁrst subtracting . × min residuals from the residual sequence (to make sure we obtain meaningful values afterlogarithm), and then applying cube root or logarithm transform to this sequence.We perform the aforementioned hyperparameter tuning procedure inn proposed tuningprocedure for original and transformed residuals. More precisely, the p -value in step 2.(ii) isobtained by applying Ljung-Box test on original, cube root and logarithm of residuals. Foreach method, we denote the selected hyperparameter δ and the maximum of p -value to be37

200 400 600 800 100000.511.522.5 -0.5 0 0.5 1 1.50100200300 -4 -2 0 2 4-1-0.500.511.5 p = 0.412890 200 400 600 800 100000.511.522.5 0.4 0.6 0.8 1 1.2050100150200250 -4 -2 0 2 40.40.60.811.2 p = 0.678720 200 400 600 800 100000.511.522.5 -2.5 -2 -1.5 -1 -0.5 0 0.5050100150200250 -4 -2 0 2 4-3-2-101 p = 0.95079

Figure 14: Experimental results of applying our proposed estimator to subject 23 withhyperparameter δ ori (top), δ cr (middle) and δ log (bottom). The ﬁrst column plots rawobservation (blue), ﬁtted ar (1) model (red) and ﬁtted dynamic background (yellow) withhyperparameter δ and estimated ar (1) coeﬃcient (cid:98) α on the top; the second and third columnplot the histogram and quantile-quantile (QQ) plot of (original, cube root of and logarithmof) residuals (with p ori , p cr , p log on the top). 38 δ ori , p ori ) , ( δ cr , p cr ) , ( δ log , p log ) , respectively. We illustrate all these three methods on subject23 by plotting the ﬁtted ar (1) model, ﬁtted dynamic background, histogram and QQ-plot ofthe residual sequence in Figure 14.Figure 14 shows that for subject 23 (i) from the ﬁrst column, the ﬁrst method clearlyunderﬁts the dynamic background; (ii) from the second column, the last histogram is muchmore symmetric and closely resembles p.d.f. of normal distribution; (iii) from the thirdcolumn, the last method has larger p -value, indicating less serial correlation remained inresidual sequence. This again shows that why we use p -value to select the hyperparameter —it is a easy-to-use metric which correctly indicates whether the dynamic background is ﬁttedproperly. Moreover, we see that the third method, i.e. using logarithm transform, is the bestfor subject 23. In fact, logarithm transform the best for almost all subjects in the sense that p log is the largest among p ori , p cr , p log . We also observe that for those subjects that p log is notthe largest, the tuning parameter δδ