Asymptotic Properties of Linear Filter for Noise Free Dynamical System
ANUGU SUMITH REDDY, AMIT APTE, AND SREEKAR VADLAMANI
Abstract.
It is known that the Kalman-Bucy filter is stable with respect to initial conditions under the conditions of uniform complete controllability and uniform complete observability [4, 17]. In this paper, we prove the stability of the Kalman-Bucy filter for the case of a noise free dynamical system. The earlier stability results cannot be applied in this case, as the system is not controllable at all. We further show that the optimal linear filter for a certain class of non-Gaussian initial conditions is asymptotically proximal to the Kalman-Bucy filter. It is also shown that the filter corresponding to non-zero system noise, in the limit of small system noise, approaches the filter corresponding to zero system noise in the case of Gaussian initial conditions.

1. Introduction
Since the seminal paper of Kalman and Bucy [13], the Kalman-Bucy filter has been extensively studied [2, 12]. It gives the best mean square estimate of the state at a fixed time $t$, given the observations up to time $t$, when the dynamical system and the observation model are linear and the initial condition is Gaussian. Studying the asymptotic properties of filters with respect to the initial conditions of the filter is an important aspect of filtering theory, primarily to unravel certain universalities among different filters, and to gain some understanding of the large time behaviour of the filters. In practice, the exact initial condition of the system is rarely known. Therefore, it is desirable that the filter be asymptotically independent of its initial condition. This property, known as filter stability, has been studied extensively [17, 4].

The classical results on the stability of the Kalman-Bucy filter are based on the assumption of controllability. If the system being observed is modelled by a deterministic process (in other words, zero system noise in the case of additive noise systems), the assumption of controllability breaks down, the classical results are not applicable, and a new approach is needed. This problem was first studied in [16], while for nonlinear systems, the asymptotic convergence of the filter estimate to the true state in the case of zero system noise is studied in [8].

In practice, filtering for deterministic systems is quite commonly used in the context of atmospheric and oceanic sciences, where the problem is known as data assimilation [19, 3, 10]. In these applications, the asymptotic degeneracy and stability of the filter covariance (but not of the filter mean) for the discrete time Kalman filter has been studied recently [11, 5], and generalising those results to filter stability for continuous time Kalman-Bucy filters is one of the main aims of this work.

The unifying theme of this work is to study the stability of linear filters in the following three cases: (a) the Kalman-Bucy filter in the case of zero system noise; (b) the linear filter with non-Gaussian initial conditions; and (c) the linear filter with small system noise, and its relation to the zero system noise case.

The methods used in this work are motivated by the results of Ocone and Pardoux [17] on the stability of the Kalman-Bucy filter. However, analogous results for the case of zero system noise do not follow trivially, as the crucial assumption of stabilizability becomes invalid. As mentioned earlier, Ni and Zhang [16] studied this problem and proved the stability of the dynamic Riccati equation (see (4) below), whereas we show (in Theorems 3.4-3.6) a stronger result: the filter initialised with an incorrect initial condition converges asymptotically to the optimal filter almost surely. We also show (in Theorem 4.1) that even with non-Gaussian initial conditions, the optimal filter approaches the Kalman-Bucy filter. It is also shown (in Theorem 5.2) that, under appropriate assumptions, the case with small system noise is asymptotically similar to the case with zero system noise.

The paper is organised as follows: the main setup and statement of the problem are introduced in Section 2. Thereafter, in Section 3, we study the asymptotic properties of the filter in the case of Gaussian initial conditions. The case of the linear filter with non-Gaussian initial conditions is discussed in Section 4; in particular, we establish that for a particular class of non-Gaussian initial conditions, the optimal filter asymptotically approaches the Kalman-Bucy filter. Finally, in Section 5, it is shown that the small noise limit of the filter corresponding to non-zero system noise is indistinguishable from the filter corresponding to zero system noise in the case of Gaussian initial conditions.

International Centre for Theoretical Sciences - Tata Institute of Fundamental Research, Bangalore, India
TIFR Centre for Applied Mathematics, Bangalore, India
E-mail addresses: [email protected], [email protected], [email protected]
Key words and phrases. Kalman-Bucy; Noise free; Stability; Small noise limit.

2. Problem Setup
Let $(\Omega, \mathcal F, \{\mathcal F_t\}_{t\ge 0}, P)$ be a complete filtered probability space satisfying the usual conditions, i.e., $\mathcal F_0$ contains all $P$-null sets and $\mathcal F_t$ is right continuous. We consider the following filtering model for a linear signal process $x_t\in\mathbb R^m$,
$$x_t = x_0 + \int_0^t A_s x_s\,ds, \qquad (1)$$
with linear observation process $y_t\in\mathbb R^n$,
$$y_t = \int_0^t C_s x_s\,ds + \int_0^t R_s^{1/2}\,dW_s, \qquad (2)$$
where, for $t\ge 0$, $A_t\in\mathbb R^{m\times m}$, $C_t\in\mathbb R^{n\times m}$ and $R_t\in\mathbb R^{n\times n}$. Let $\mathcal Y_t := \sigma(y_s : 0\le s\le t)$ be the $\sigma$-field generated by the observation process, and let $W_t$ be an $n$-dimensional $\mathcal F_t$-standard Brownian motion independent of $x_0$. The central theme of interest in filtering theory is estimating $x_t$ given the observations up to time $t$, that is, calculating $E[x_t\,|\,\mathcal Y_t]$. Since we are usually interested in estimating functions of $x_t$, we are interested in calculating the conditional distribution $\pi_t(B) := P(x_t\in B\,|\,\mathcal Y_t)$, where $B\in\mathcal B(\mathbb R^m)$.

We now introduce the Kalman-Bucy filtering equations, which play a crucial role in the rest of the paper. These are given by
$$dX_t^{m,P} = A_t X_t^{m,P}\,dt + P_t^P C_t^T R_t^{-1}\big(dy_t - C_t X_t^{m,P}\,dt\big), \quad X_0^{m,P} = m\in\mathbb R^m, \qquad (3)$$
$$\dot P_t^P = A_t P_t^P + P_t^P A_t^T - P_t^P C_t^T R_t^{-1} C_t P_t^P, \quad P_0^P = P\in\mathbb R^{m\times m}, \qquad (4)$$
where $P > 0$. Note that the superscripts in $X_t^{m,P}$ and $P_t^P$ refer to the initial conditions of (3) and (4). It is well known [22, Theorem 9.4] that, for the filtering model in (1) and (2), if the initial condition is Gaussian, $x_0\sim N(m_0, P_0)$, then the conditional distribution $\pi_t$ is Gaussian, $\pi_t\sim N(\hat X_t, P_t)$, with mean $\hat X_t := X_t^{m_0,P_0}$ and covariance $P_t := P_t^{P_0}$. It is clear that the Gaussian distribution $\bar\pi_t := N\big(X_t^{\bar m,\bar P}, P_t^{\bar P}\big)$, where $X_t^{\bar m,\bar P}$ and $P_t^{\bar P}$ are solutions of (3) and (4) with initial conditions $(\bar m, \bar P)$ different from $(m_0, P_0)$, is not the same as $\pi_t$. The linear filter is said to be stable with respect to initial conditions if $\pi_t - \bar\pi_t \to 0$ as $t\to\infty$ (in an appropriate sense).

Assumption 2.1. $A_t$, $C_t$, $R_t$, $R_t^{-1}$ are all continuous and uniformly bounded in $t$, and $P_0$ is invertible.

In order to state the next assumption, we first need the following definition [1].
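For intuition, the pair (3)-(4) can be integrated numerically by a simple Euler discretisation. The sketch below is only illustrative: the double-integrator model matrices, step size and initial conditions are hypothetical choices, not taken from the paper.

```python
import numpy as np

# Euler discretisation of the Kalman-Bucy equations (3)-(4). The model (a
# double integrator observed in its first component) is a hypothetical example.
def kalman_bucy_step(X, P, dy, A, C, Rinv, dt):
    """One Euler step of the mean equation (3) and the Riccati equation (4)."""
    gain = P @ C.T @ Rinv                      # Kalman gain P_t C_t^T R_t^{-1}
    X_new = X + A @ X * dt + gain @ (dy - C @ X * dt)
    P_new = P + (A @ P + P @ A.T - gain @ C @ P) * dt
    return X_new, P_new

rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0], [0.0, 0.0]])         # noise free signal: dx = Ax dt
C = np.array([[1.0, 0.0]])                     # observe the first component only
Rinv = np.eye(1)
dt, n = 1e-3, 20000                            # integrate up to t = 20

x = np.array([1.0, -0.5])                      # true (deterministic) state
X, P = np.zeros(2), np.eye(2)                  # filter initialised at (m_0, P_0)
trace0 = np.trace(P)
for _ in range(n):
    dy = C @ x * dt + np.sqrt(dt) * rng.normal(size=1)   # observation increment (2)
    X, P = kalman_bucy_step(X, P, dy, A, C, Rinv, dt)
    x = x + A @ x * dt                                   # noise free signal (1)

print(np.trace(P), np.linalg.norm(X - x))
assert np.trace(P) < trace0        # covariance contracts despite zero system noise
assert np.linalg.norm(X - x) < 2.0
```

Note that the covariance trace contracts even though the Riccati equation (4) has no positive forcing term; this degeneracy of the noise free case is precisely what Section 3 analyses.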
Definition 2.2.
A pair $[A_t, C_t]$, $A_t\in\mathbb R^{m\times m}$, $C_t\in\mathbb R^{n\times m}$, is said to be uniformly completely observable if there exist positive constants $\tau$, $\rho_1$, $\rho_2$ such that, for all $t\ge\tau$, we have
$$\rho_1 I_m \le \int_{t-\tau}^t \Phi_t^{-T}\Phi_s^T C_s^T R_s^{-1} C_s \Phi_s \Phi_t^{-1}\,ds \le \rho_2 I_m. \qquad (5)$$
Here, $\Phi_t$ is the fundamental matrix solution of (1), i.e., $\dot\Phi_t = A_t\Phi_t$ and $\Phi_0 := I$.

(For real symmetric positive semi-definite matrices $X$ and $Y$ of the same dimension, we write $X\ge Y$ whenever $x^T(X-Y)x\ge 0$ for all $x\ne 0\in\mathbb R^m$; the notations $X\le Y$, $X<Y$ and $X>Y$ are adopted accordingly throughout the paper.)

Additionally, we also assume that:

Assumption 2.3. The pair $[A_t, C_t]$ is uniformly completely observable.

3. Asymptotic properties of the filter in the case of Gaussian initial conditions
In this section, we study the asymptotic properties and stability of the filter when the initial condition is assumed to be Gaussian. As mentioned earlier, we are interested in calculating the distance between the measures $\pi_t$ and $\bar\pi_t$ under an appropriate metric. Since both $\pi_t$ and $\bar\pi_t$ are Gaussian, if we choose the total variation metric, then it suffices to show that $\|\hat X_t - X_t^{\bar m,\bar P}\| \xrightarrow{t\to\infty} 0$ and $\|P_t - P_t^{\bar P}\| \xrightarrow{t\to\infty} 0$. Here and below, we define the norm $\|\cdot\|$ of an $m\times n$ matrix $Q$ as $\|Q\| := \sup_{\|x\|=1}\|Qx\|$.

3.1. Stability of the dynamic Riccati equation.
We begin by observing that the solution of (4) with a non-negative definite initial condition $P\ge 0$ is given by
$$P_t^P = \Phi_t\sqrt P\,\big(I + \sqrt P\,\bar C_t\sqrt P\big)^{-1}\sqrt P\,\Phi_t^T, \qquad (6)$$
where $\bar C_t := \int_0^t \Phi_s^T C_s^T R_s^{-1} C_s \Phi_s\,ds$. To investigate the stability of (4), we need the following result, proved in [16], concerning the uniform boundedness of $P_t^P$.

Lemma 3.1. If $[A_t, C_t]$ is uniformly completely observable, then $P_t^P$ is uniformly bounded in $t$.

Remark 3.2. Consider the subspace of $\mathbb R^m$ defined by $S := \{u : \|\Phi_t^T u\|\to 0 \text{ as } t\to\infty\}$. For $v\in S$, it is clear from (6) that $v^T P_t^P v\to 0$ as $t\to\infty$ (since $\bar C_t$ is bounded below uniformly in time [16, Proposition 3]), implying that the uncertainty along $S$ reduces to zero asymptotically in time. This feature is used in discrete time data assimilation algorithms that go by the name of Assimilation in the Unstable Subspace (AUS) [7, 18, 21]. This and other properties of the filter covariance (in discrete time), and their relation to the Lyapunov vectors and exponents of the dynamics, have been discussed extensively in [11, 5], and they extend to the filter covariance for the Kalman-Bucy filter (in continuous time).

To prove stability of (4), we consider solutions $P_t^P$ and $P_t^{\bar P}$ of (4) corresponding to two different initial conditions $P$ and $\bar P$, respectively. A straightforward calculation shows that $E_t := P_t^P - P_t^{\bar P}$ satisfies
$$\dot E_t = B_t^P E_t + E_t\big(B_t^{\bar P}\big)^T,$$
where $B_t^P := A_t - P_t^P C_t^T R_t^{-1} C_t$ and $B_t^{\bar P} := A_t - P_t^{\bar P} C_t^T R_t^{-1} C_t$. Further, it can easily be verified that
$$E_t = \Psi_t^P\big(P - \bar P\big)\big(\Psi_t^{\bar P}\big)^T, \qquad (7)$$
with $\dot\Psi_t^P = B_t^P\Psi_t^P$, $\Psi_0^P = I$, and $\dot\Psi_t^{\bar P} = B_t^{\bar P}\Psi_t^{\bar P}$, $\Psi_0^{\bar P} = I$. Therefore, stability of the Riccati equation is related to the asymptotic properties of $\Psi_t^P$ and $\Psi_t^{\bar P}$. Without loss of generality, it is sufficient to study the asymptotic properties of $\Psi_t^P$. To this end, consider the linear system
$$\dot z_t = \big(A_t - P_t^P C_t^T R_t^{-1} C_t\big)z_t, \qquad (8)$$
whose solution is given by $z_t = \Psi_t^P z_0$, where $z_0$ is the initial condition. The system (8) is said to be asymptotically stable if $\|\Psi_t^P\|\to 0$ as $t\to\infty$, or equivalently $\|z_t\|\to 0$ for every $z_0\in\mathbb R^m$. Therefore, to establish that $\|\Psi_t^P\|\to 0$, we use the Lyapunov function approach of [6] and show that $\|z_t\|\to 0$ for all $z_0\in\mathbb R^m$.

The first step towards proving asymptotic stability of (8) is the following lemma [20, Lemma 2.5.2].

Lemma 3.3. If $[A_t, C_t]$ is uniformly completely observable and $K_t$ is continuous and bounded in $t$, then $[A_t - K_t C_t, C_t]$ is also uniformly completely observable.

Consequently, since $P_t^P C_t^T R_t^{-1}$ is continuous and bounded in $t$, $[B_t^P, C_t]$ is uniformly completely observable, i.e., there exist $\tilde\tau, \rho_3, \rho_4 > 0$ such that, for all $t\ge\tilde\tau$,
$$\rho_3 I_m \le \int_{t-\tilde\tau}^t \big(\Psi_t^P\big)^{-T}\big(\Psi_s^P\big)^T C_s^T R_s^{-1} C_s \Psi_s^P\big(\Psi_t^P\big)^{-1}\,ds \le \rho_4 I_m. \qquad (9)$$
We shall now state one of the main results of this paper, that of asymptotic stability of the filter covariance.

Theorem 3.4.
Let $P$ be non-negative definite, and let $[A_t, C_t]$ be uniformly completely observable. Then (8) is asymptotically stable and
$$\int_0^\infty\big(\Psi_s^P\big)^T\Psi_s^P\,ds \le \frac{\tilde\tau}{\rho_3}\,P^{-1}. \qquad (10)$$

Remark 3.5.
We note here that the asymptotic stability of (8) has already been proven in [16] using a Lyapunov function, where it is shown that $\|z_t\| \xrightarrow{t\to\infty} 0$, which in turn implies the stability of the Riccati equation. However, our result above is stronger, since (10) gives certain control over the rate of decay of $\Psi_s^P$, which is needed later to prove almost sure convergence of the filter mean.

Proof. As in [16], we begin with the Lyapunov function
$$V(z_t, t) := z_t^T\big(P_t^P\big)^{-1}z_t. \qquad (11)$$
Using (4) and (8), we see that
$$\frac{dV}{dt}(z_t,t) = z_t^T\big(A_t - P_t^PC_t^TR_t^{-1}C_t\big)^T\big(P_t^P\big)^{-1}z_t + z_t^T\big({-\big(P_t^P\big)^{-1}A_t - A_t^T\big(P_t^P\big)^{-1} + C_t^TR_t^{-1}C_t}\big)z_t + z_t^T\big(P_t^P\big)^{-1}\big(A_t - P_t^PC_t^TR_t^{-1}C_t\big)z_t = -z_t^TC_t^TR_t^{-1}C_tz_t \le 0, \quad \forall\,t>0. \qquad (12)$$
Using the relationship $z_s = \Psi_s^P\big(\Psi_t^P\big)^{-1}z_t$, we can write
$$V(z_{t+\tilde\tau}, t+\tilde\tau) - V(z_t, t) = -z_t^T\int_t^{t+\tilde\tau}\big(\Psi_t^P\big)^{-T}\big(\Psi_s^P\big)^TC_s^TR_s^{-1}C_s\Psi_s^P\big(\Psi_t^P\big)^{-1}\,ds\;z_t.$$
Observe that, from (9),
$$\rho_3\|z_t\|^2 \le V(z_t,t) - V(z_{t+\tilde\tau}, t+\tilde\tau) \le \rho_4\|z_t\|^2, \qquad (13)$$
which, together with the assumption of uniform complete observability of $[A_t, C_t]$, implies that $V(z_t,t)\to 0$ and $\|z_t\|\to 0$ as $t\to\infty$, and that (8) is asymptotically stable.

Next, in order to prove (10), observe that, writing $t = t' + k\tilde\tau$ for some $t'\in[0,\tilde\tau]$, we have
$$V\big(z_{t'+(k+1)\tilde\tau},\, t'+(k+1)\tilde\tau\big) - V\big(z_{t'+k\tilde\tau},\, t'+k\tilde\tau\big) \le -\rho_3\|z_{t'+k\tilde\tau}\|^2.$$
Adding such inequalities for $k = 0, 1, 2, \dots, N$, we have
$$V\big(z_{t'+(N+1)\tilde\tau},\, t'+(N+1)\tilde\tau\big) - V(z_{t'}, t') \le -\rho_3\sum_{k=0}^N\|z_{t'+k\tilde\tau}\|^2.$$
Using (12), and letting $N\to\infty$,
$$\sum_{k=0}^\infty\|z_{t'+k\tilde\tau}\|^2 \le \frac{V(z_{t'}, t')}{\rho_3} \le \frac{V(z_0, 0)}{\rho_3}. \qquad (14)$$
Integrating (14) with respect to $t'$ over $[0,\tilde\tau]$, we have
$$\int_0^{\tilde\tau}\sum_{k=0}^\infty\|z_{t'+k\tilde\tau}\|^2\,dt' = \int_0^\infty\|z_{t'}\|^2\,dt' \le \frac{\tilde\tau\,V(z_0,0)}{\rho_3},$$
that is,
$$z_0^T\Big(\int_0^\infty\big(\Psi_{t'}^P\big)^T\Psi_{t'}^P\,dt'\Big)z_0 \le \frac{\tilde\tau\,V(z_0,0)}{\rho_3} = \frac{\tilde\tau\,z_0^TP^{-1}z_0}{\rho_3}. \qquad (15)$$
Since (15) holds for all initial conditions $z_0$,
$$\int_0^\infty\big(\Psi_{t'}^P\big)^T\Psi_{t'}^P\,dt' \le \frac{\tilde\tau}{\rho_3}\,P^{-1}, \qquad (16)$$
which completes the proof. □

3.2. Almost sure convergence of the conditional expectation.
To discuss the convergence of the conditional expectation, we follow the method set forth in [17]. Consider two solutions $(\hat X_t, P_t)$ and $(X_t^{\bar m,\bar P}, P_t^{\bar P})$ of (3) and (4) with different initial conditions: the correct one, $(m_0, P_0)$ (the mean and covariance of the Gaussian $x_0$), and an incorrect one, $(\bar m, \bar P)$, respectively. Our result concerning the asymptotic stability of the filter mean is as follows:

Theorem 3.6.
Let $P_0$, $\bar P$ be bounded non-negative definite matrices, and let $[A_t, C_t]$ be uniformly completely observable. Then
$$\|\hat X_t - X_t^{\bar m,\bar P}\| \xrightarrow{t\to\infty} 0, \quad P\text{-a.s.}$$

Proof.
Let us begin by defining the innovations process
$$d\nu_t := R_t^{-1/2}\big(dy_t - C_t\hat X_t\,dt\big),$$
which is a $\mathcal Y_t$-Brownian motion [22]. Then, using (3), we see that the dynamical equation for $\hat X_t - X_t^{\bar m,\bar P}$ is
$$d\big(\hat X_t - X_t^{\bar m,\bar P}\big) = \big(A_t - P_t^{\bar P}C_t^TR_t^{-1}C_t\big)\big(\hat X_t - X_t^{\bar m,\bar P}\big)dt + \big(P_t - P_t^{\bar P}\big)C_t^TR_t^{-1}\big(dy_t - C_t\hat X_t\,dt\big). \qquad (17)$$
Using a simple application of Itô's formula, we observe that the solution of the above dynamical equation is given by
$$\big(\hat X - X^{\bar m,\bar P}\big)_t = \Psi_t^{\bar P}(m_0 - \bar m) + \int_0^t\Psi_t^{\bar P}\big(\Psi_s^{\bar P}\big)^{-1}\big(P_s - P_s^{\bar P}\big)C_s^TR_s^{-1/2}\,d\nu_s.$$
Next, writing $\hat Z_t := \int_0^t\big(\Psi_s^{\bar P}\big)^{-1}\big(P_s - P_s^{\bar P}\big)C_s^TR_s^{-1/2}\,d\nu_s$, we can express the above solution in the compact form
$$\big(\hat X - X^{\bar m,\bar P}\big)_t = \Psi_t^{\bar P}(m_0 - \bar m) + \Psi_t^{\bar P}\hat Z_t. \qquad (18)$$
Observe now that, using (7) to write $\big(\Psi_s^{\bar P}\big)^{-1}\big(P_s - P_s^{\bar P}\big) = \big(P_0 - \bar P\big)\big(\Psi_s^{P_0}\big)^T$, it is clear that
$$E\big[|\hat Z_t|^2\big] = E\bigg[\mathrm{tr}\Big(\int_0^t\big(P_0 - \bar P\big)\big(\Psi_s^{P_0}\big)^TC_s^TR_s^{-1}C_s\Psi_s^{P_0}\big(P_0 - \bar P\big)\,ds\Big)\bigg], \qquad (19)$$
where $\mathrm{tr}(A)$ denotes the trace of the square matrix $A$. Using simple algebra, we can easily conclude that $\big(P_0 - \bar P\big)^2 < M'I$ for some $M' > 0$; in particular, we could choose $M'$ to be the squared sum of the largest eigenvalues of $P_0$ and $\bar P$. Moreover, we also have $\|C_t^TR_t^{-1}C_t\| < M$ for some $M > 0$, thus implying
$$\mathrm{tr}\Big(\int_0^t\big(P_0-\bar P\big)\big(\Psi_s^{P_0}\big)^TC_s^TR_s^{-1}C_s\Psi_s^{P_0}\big(P_0-\bar P\big)\,ds\Big) \le MM'\,\mathrm{tr}\Big(\int_0^t\big(\Psi_s^{P_0}\big)^T\Psi_s^{P_0}\,ds\Big) \le MM'\,\mathrm{tr}\Big(\int_0^\infty\big(\Psi_s^{P_0}\big)^T\Psi_s^{P_0}\,ds\Big) < \infty,$$
where the last inequality follows from Theorem 3.4, indicating that $\hat Z_t$ is a square integrable martingale. Therefore, by the martingale convergence theorem, $\{\hat Z_t\}_{t\ge 0}$ converges almost surely, as $t\to\infty$, to an integrable random variable. Thus, we conclude that $\Psi_t^{\bar P}\hat Z_t\to 0$, $P$-a.s., since we already know by Theorem 3.4 that $\Psi_t^{\bar P}$ converges to zero as $t\to\infty$. Similarly, we can deduce that $\Psi_t^{\bar P}(m_0 - \bar m)\to 0$ as $t\to\infty$, which in view of (18) completes the proof. □

4. Linear filter with non-Gaussian initial condition
In this section, we consider filter stability in the case of non-Gaussian initial conditions. Note that if $x_0$ is not Gaussian, then $\pi_t$ is not Gaussian either. But the following theorem shows that, under certain conditions, the linear filter, even with a non-Gaussian initial condition, is almost surely asymptotically close to an appropriate Kalman-Bucy filter. To state the theorem below, we recall that $X_t^{\bar m,\bar P}$, $P_t^{\bar P}$ denote the solutions of (3) and (4) with initial conditions $\bar m$ and $\bar P$.
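Before stating the result, the merging phenomenon can be illustrated numerically. All model choices in the sketch below are hypothetical: a scalar model $A_t = 0$, $C_t = R_t = 1$ (noise free signal, $x_t = x_0$) with the Gaussian-mixture prior $\tfrac12 N(-2,1) + \tfrac12 N(2,1)$, i.e. $x_0 = v + \bar x_0$ with $v$ uniform on $\{-2, 2\}$ and $\bar x_0 \sim N(0,1)$ independent. For such a prior the optimal filter is an exact two-component mixture of Kalman-Bucy filters with data-dependent weights, and we compare its mean with a single Kalman-Bucy filter started from a deliberately wrong Gaussian prior $N(3,1)$.

```python
import numpy as np

# Hypothetical scalar example of filter merging with a non-Gaussian
# (Gaussian-mixture) initial condition; model: A = 0, C = R = 1.
rng = np.random.default_rng(2)
dt, T = 1e-3, 200.0
v = rng.choice([-2.0, 2.0])
x0 = v + rng.normal()                          # true (constant) state

m = np.array([-2.0, 2.0])                      # mixture component means
logw = np.log(np.array([0.5, 0.5]))            # mixture component log-weights
mbar, p = 3.0, 1.0                             # single KB filter, wrong mean
for _ in range(int(T / dt)):
    dy = x0 * dt + np.sqrt(dt) * rng.normal()  # observation increment
    logw += m * dy - 0.5 * m**2 * dt           # likelihood update (Girsanov form)
    m += p * (dy - m * dt)                     # Kalman-Bucy mean, per component
    mbar += p * (dy - mbar * dt)               # wrong-prior Kalman-Bucy mean
    p += -p * p * dt                           # common Riccati equation p' = -p^2

w = np.exp(logw - logw.max()); w /= w.sum()
opt_mean = float(w @ m)                        # optimal filter mean E[x_t | Y_t]

print(opt_mean, mbar, abs(opt_mean - mbar))
assert abs(opt_mean - mbar) < 0.1   # optimal filter merges with the wrong-prior filter
```

Because every component mean and the wrong-prior mean obey the same contraction, $d(m_i - \bar m_t) = -p_t(m_i - \bar m_t)\,dt$, the gap closes at the Riccati rate $1/(1+t)$, in line with (20) below.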
Suppose the pair $[A_t, C_t]$ is uniformly completely observable. Let $x_0$ be square integrable and of the form $x_0 := v + \bar x_0$, where $\bar x_0$ is a non-degenerate Gaussian random variable independent of $v$. Then, for the system given by (1) and (2), the filter mean $E[x_t\,|\,\mathcal Y_t]$ is almost surely asymptotically proximal to the filter mean $X_t^{\bar m,\bar P}$:
$$E[x_t\,|\,\mathcal Y_t] - X_t^{\bar m,\bar P} \xrightarrow{t\to\infty} 0, \quad P\text{-a.s.} \qquad (20)$$
We also have the almost sure weak asymptotic proximality (or merging, following the terminology of [9]) of the filtering distributions $\pi_t$ with the Gaussian distributions defined by solutions of the Kalman-Bucy equations:
$$\pi_t(g) - N\big(X_t^{\bar m,\bar P}, P_t^{\bar P}\big)(g) \xrightarrow{t\to\infty} 0, \quad P\text{-a.s.}, \qquad (21)$$
for any bounded, uniformly continuous $g$, for any $\bar m\in\mathbb R^m$ and $\bar P\in\mathbb R^{m\times m}$, $\bar P > 0$.

Remark 4.2.
The requirement that the initial condition $x_0$ be a sum of a Gaussian and a non-Gaussian random variable is not very restrictive. One quite large class of random variables that satisfy this assumption is as follows: for every $m$-dimensional random variable $U$ with finite second moment and a density $f_U$, there is a corresponding $x_0$ satisfying the assumptions of the theorem, where $x_0$ is defined to be a random variable whose density is a solution of the $m$-dimensional heat equation initialised with $f_U$.

Proof. The ideas of our proof are motivated by [15] and by those used in the proof of [17, Theorem 2.6], with certain modifications to accommodate our model with zero noise. (For notational simplicity, we take $R_t\equiv I$ throughout this section.)

First, observe that the system given by (1) and (2) can also be represented as
$$x_t = \bar x_t + \Phi_t v, \qquad \bar x_t = \Phi_t\bar x_0,$$
$$y_t = \int_0^t C_s\bar x_s\,ds + \bar W_t, \qquad \bar W_t := W_t + \int_0^t C_s\Phi_s v\,ds.$$
Here, $\bar W_t$ is not a Brownian motion with respect to $P$. Hence we invoke a change of measure to find a new probability measure $\bar P$ with respect to which $\bar W_t$ is a Brownian motion. By introducing such a transformation, we can use much of the analysis related to Gaussian initial conditions with appropriate modifications; the authors in [15] and [17] use precisely this idea to analyse the case of non-Gaussian initial conditions in their works.

Let us begin by defining, for $t > 0$,
$$Z_t := \exp\Big({-\int_0^t(C_s\Phi_s v)^T\,dW_s - \frac12\int_0^t\|C_s\Phi_s v\|^2\,ds}\Big),$$
and defining a new measure $\bar P_T$, for some fixed $T > 0$, by
$$\frac{d\bar P_T}{dP} := Z_T^{-1},$$
where $\bar P_T$ defined as above is a probability measure (by [14, Corollary 3.5.16]) and is equivalent to $P$ for all $T < \infty$. Notice that the variables $v$, $\bar W_t$, $\bar x_t$ are all mutually independent under $\bar P_T$, and that the distribution of $v$ remains unchanged.
Denoting the expectation with respect to $\bar P_T$ by $\bar E$, we see that the expectations with respect to the two probability measures are related by ([22])
$$E[f(v,\bar x_t)\,|\,\mathcal Y_t] = \frac{\bar E[f(v,\bar x_t)Z_t\,|\,\mathcal Y_t]}{\bar E[Z_t\,|\,\mathcal Y_t]} \qquad (22)$$
for any bounded measurable function $f:\mathbb R^m\times\mathbb R^m\to\mathbb R$. Writing $\pi_0$ for the distribution of $v$, it is easy to see that
$$\bar E[f(v,\bar x_t)Z_t\,|\,\mathcal Y_t] = \int_{\mathbb R^m}\pi_0(dx)\int f(x,r_1)\,e^{-\frac12 x^TM_tx + x^Tr_2}\,\eta_t(dr_1,dr_2), \qquad (23)$$
where $M_t := \int_0^t\Phi_s^TC_s^TC_s\Phi_s\,ds$, $b_t := \int_0^t(C_s\Phi_s)^T\,d\bar W_s$, and $\eta_t$ is the conditional distribution of $(\bar x_t, b_t)^T$ given $\mathcal Y_t$ under $\bar P_T$.

The conditional distribution $\eta_t$ is obtained by studying the Kalman-Bucy filter, in the framework of correlated observation and system noises, for the extended system $(\bar x_t, b_t)^T$. It is known that $\eta_t$ is again Gaussian [22], with mean $(\tilde m_t, \tilde b_t)^T$ and covariance $\begin{pmatrix}\tilde P_t & S_t\\ S_t^T & Q_t\end{pmatrix}$ given by the following set of equations:
$$\tilde m_t = X_t^{m_0',P_0'}\ \text{(solution of (3))}, \qquad \tilde m_0 = m_0' = E[\bar x_0],$$
$$\tilde P_t = P_t^{P_0'}\ \text{(solution of (4))}, \qquad \tilde P_0 = P_0' = E[(\bar x_0 - E[\bar x_0])(\bar x_0 - E[\bar x_0])^T],$$
$$d\tilde b_t = (\Phi_t + S_t)^TC_t^T(dy_t - C_t\tilde m_t\,dt), \qquad \tilde b_0 = 0,$$
$$\dot Q_t = -\Phi_t^TC_t^TC_tS_t - S_t^TC_t^TC_t\Phi_t - S_t^TC_t^TC_tS_t, \qquad Q_0 = 0,$$
$$\dot S_t = A_tS_t - \tilde P_tC_t^TC_tS_t - \tilde P_tC_t^TC_t\Phi_t, \qquad S_0 = 0. \qquad (24)$$

Back to computing expectations, we use (23) in (22) to express
$$E[f(v,\bar x_t)\,|\,\mathcal Y_t] = \frac{\int_{\mathbb R^m}\pi_0(dx)\int f(x,r_1)\,e^{-\frac12 x^TM_tx+x^Tr_2}\,\eta_t(dr_1,dr_2)}{\int_{\mathbb R^m}\pi_0(dx)\int e^{-\frac12 x^TM_tx+x^Tr_2}\,\eta_t(dr_1,dr_2)} = \frac{\int_{\mathbb R^m}e^{\frac12 x^T(Q_t-M_t)x+x^T\tilde b_t}\,\pi_0(dx)\int f(x,r_1)\,\tilde\eta_t(dr_1,dr_2)}{\int_{\mathbb R^m}e^{\frac12 x^T(Q_t-M_t)x+x^T\tilde b_t}\,\pi_0(dx)}, \qquad (25)$$
where $\tilde\eta_t$ is a Gaussian measure with mean $(\tilde m_t + S_tx,\ \tilde b_t + Q_tx)^T$ and covariance $\begin{pmatrix}\tilde P_t & S_t\\ S_t^T & Q_t\end{pmatrix}$. Setting $f(v,\bar x_t) = \tilde f(\Phi_tv + \bar x_t)$, and taking $\gamma_t$ to be a Gaussian measure with mean $0$ and covariance $\tilde P_t$, we have
$$E[\tilde f(\Phi_tv + \bar x_t)\,|\,\mathcal Y_t] = \frac{\int_{\mathbb R^m}e^{\frac12 x^T(Q_t-M_t)x+x^T\tilde b_t}\,\pi_0(dx)\int_{\mathbb R^m}\tilde f(\Phi_tx + \tilde m_t + S_tx + r)\,\gamma_t(dr)}{\int_{\mathbb R^m}e^{\frac12 x^T(Q_t-M_t)x+x^T\tilde b_t}\,\pi_0(dx)}. \qquad (26)$$
Now, setting $\tilde f(x) = x$ (this can be done even though $\tilde f$ is not bounded, because $\tilde f$ is integrable with respect to a Gaussian measure), we obtain the conditional mean as
$$E[x_t\,|\,\mathcal Y_t] = \tilde m_t + E[(\Phi_t + S_t)v\,|\,\mathcal Y_t] = \tilde m_t + (\Phi_t + S_t)E[v\,|\,\mathcal Y_t].$$
Now observe from (24) that
$$\frac{d}{dt}(\Phi_t + S_t) = \big(A_t - \tilde P_tC_t^TC_t\big)(\Phi_t + S_t),$$
which has the same form as (8), and thus from Theorem 3.4 it follows that $\|\Phi_t + S_t\|\to 0$ as $t\to\infty$. Hence
$$\|E[x_t\,|\,\mathcal Y_t] - \tilde m_t\| = \|(\Phi_t + S_t)E[v\,|\,\mathcal Y_t]\| \le K\|\Phi_t + S_t\| \xrightarrow{t\to\infty} 0, \quad P\text{-a.s.},$$
because $E[v\,|\,\mathcal Y_t]$ is uniformly integrable (square integrable, in particular). Therefore, $E[x_t\,|\,\mathcal Y_t] - \tilde m_t\to 0$, $P$-a.s. and in $L^2$.

Now, if we can prove that $X_t^{\bar m,\bar P} - \tilde m_t\to 0$, $P$-a.s., then we shall have shown that $E[x_t\,|\,\mathcal Y_t] - X_t^{\bar m,\bar P}\to 0$, $P$-a.s. To that end, consider
$$d\big(\tilde m_t - X_t^{\bar m,\bar P}\big) = \big(A_t - P_t^{\bar P}C_t^TC_t\big)\big(\tilde m_t - X_t^{\bar m,\bar P}\big)dt + \big(\tilde P_t - P_t^{\bar P}\big)C_t^T\big(dy_t - C_tE[x_t|\mathcal Y_t]\,dt\big) + \big(\tilde P_t - P_t^{\bar P}\big)C_t^TC_t\big(E[x_t|\mathcal Y_t] - \tilde m_t\big)dt,$$
whose solution can be expressed as
$$\tilde m_t - X_t^{\bar m,\bar P} = \Psi_t^{\bar P}(\tilde m_0 - \bar m) + \int_0^t\Psi_t^{\bar P}\big(\Psi_s^{\bar P}\big)^{-1}\big(\tilde P_s - P_s^{\bar P}\big)C_s^T\big(dy_s - C_sE[x_s|\mathcal Y_s]\,ds\big) + \int_0^t\Psi_t^{\bar P}\big(\Psi_s^{\bar P}\big)^{-1}\big(\tilde P_s - P_s^{\bar P}\big)C_s^TC_s\big(E[x_s|\mathcal Y_s] - \tilde m_s\big)\,ds =: J_1 + J_2 + J_3.$$
In view of Theorem 3.4, it is easy to check that $J_1\to 0$ and $J_2\to 0$, $P$-a.s. Thus, consider
$$J_3 = \int_0^t\Psi_t^{\bar P}\big(\Psi_s^{\bar P}\big)^{-1}\big(\tilde P_s - P_s^{\bar P}\big)C_s^TC_s\big(E[x_s|\mathcal Y_s] - \tilde m_s\big)\,ds = \Psi_t^{\bar P}\big(P_0' - \bar P\big)\int_0^t\big(\Psi_s^{P_0'}\big)^TC_s^TC_s\Psi_s^{P_0'}E[v|\mathcal Y_s]\,ds,$$
where we used $\big(\Psi_s^{\bar P}\big)^{-1}\big(\tilde P_s - P_s^{\bar P}\big) = \big(P_0' - \bar P\big)\big(\Psi_s^{P_0'}\big)^T$ (from (7)) and $\Phi_s + S_s = \Psi_s^{P_0'}$. Writing $E[v|\mathcal Y_s] = \big(E[v|\mathcal Y_s] - E[v|\mathcal Y_\infty]\big) + E[v|\mathcal Y_\infty]$ gives $J_3 = L_1 + L_2$, where
$$L_1 := \Psi_t^{\bar P}\big(P_0' - \bar P\big)\int_0^t\big(\Psi_s^{P_0'}\big)^TC_s^TC_s\Psi_s^{P_0'}\big(E[v|\mathcal Y_s] - E[v|\mathcal Y_\infty]\big)\,ds,$$
$$L_2 := \Psi_t^{\bar P}\big(P_0' - \bar P\big)\int_0^t\big(\Psi_s^{P_0'}\big)^TC_s^TC_s\Psi_s^{P_0'}\,ds\;E[v|\mathcal Y_\infty].$$
Again, using the uniform bound on $C_s$ and Theorem 3.4, it is clear that $L_2\to 0$, $P$-a.s. In order to show that $L_1\to 0$, $P$-a.s., it suffices to show that $\big\|\int_0^t\big(\Psi_s^{P_0'}\big)^TC_s^TC_s\Psi_s^{P_0'}\big(E[v|\mathcal Y_s] - E[v|\mathcal Y_\infty]\big)\,ds\big\|$ is bounded uniformly in $t$. To that end, we know that for a given $\epsilon > 0$ there is a $t_\epsilon > 0$ such that for all $t > t_\epsilon$, $\|E[v|\mathcal Y_t] - E[v|\mathcal Y_\infty]\| < \epsilon$. Hence
$$\Big\|\int_0^t\big(\Psi_s^{P_0'}\big)^TC_s^TC_s\Psi_s^{P_0'}\big(E[v|\mathcal Y_s] - E[v|\mathcal Y_\infty]\big)\,ds\Big\| \le \Big\|\int_0^{t_\epsilon}\big(\Psi_s^{P_0'}\big)^TC_s^TC_s\Psi_s^{P_0'}\big(E[v|\mathcal Y_s] - E[v|\mathcal Y_\infty]\big)\,ds\Big\| + \Big\|\int_{t_\epsilon}^t\big(\Psi_s^{P_0'}\big)^TC_s^TC_s\Psi_s^{P_0'}\big(E[v|\mathcal Y_s] - E[v|\mathcal Y_\infty]\big)\,ds\Big\| < \infty,$$
uniformly in $t$, by (16). Therefore, the above calculation implies that $X_t^{\bar m,\bar P} - \tilde m_t\to 0$, $P$-a.s. This concludes the proof of (20).

We again follow the method of [17] to prove (21). To this end, consider the optimal filtering distribution $\pi_t$ (recall $\pi_t(B) = P(x_t\in B\,|\,\mathcal Y_t)$) and the Gaussian $\tilde\mu_t := N(\tilde m_t, \tilde P_t)$. For a bounded, uniformly continuous function $g:\mathbb R^m\to\mathbb R$, using the expression from (26),
$$\int g\,d\pi_t - \int g\,d\tilde\mu_t = \frac{\int_{\mathbb R^m}e^{\frac12 x^T(Q_t-M_t)x+x^T\tilde b_t}\,\pi_0(dx)\int_{\mathbb R^m}\big[g(\Phi_tx + \tilde m_t + S_tx + r) - g(\tilde m_t + r)\big]\,\gamma_t(dr)}{\int_{\mathbb R^m}e^{\frac12 x^T(Q_t-M_t)x+x^T\tilde b_t}\,\pi_0(dx)}, \qquad (27)$$
where the last expression is obtained by using the definition of $\gamma_t$ and by multiplying and dividing the second term by $\int_{\mathbb R^m}e^{\frac12 x^T(Q_t-M_t)x+x^T\tilde b_t}\,\pi_0(dx)$. Now, if we partition the $\pi_0(dx)$ integral into the regions $|(\Phi_t + S_t)x| < \delta$ and $|(\Phi_t + S_t)x|\ge\delta$ for a fixed $\delta > 0$, then
$$\Big|\int g\,d\pi_t - \int g\,d\tilde\mu_t\Big| \le \sup_{|z_1-z_2|<\delta}|g(z_1) - g(z_2)| + 2\sup_z|g(z)|\;P\big(|(\Phi_t+S_t)v|\ge\delta\ \big|\ \mathcal Y_t\big) \le \sup_{|z_1-z_2|<\delta}|g(z_1) - g(z_2)| + \frac{2\sup_z|g(z)|}{\delta^2}\,\|\Phi_t + S_t\|^2\,E\big[|v|^2\,\big|\,\mathcal Y_t\big] \longrightarrow \sup_{|z_1-z_2|<\delta}|g(z_1) - g(z_2)| \quad\text{as } t\to\infty,$$
where the second inequality follows from Chebyshev's inequality, and where $E[|v|^2\,|\,\mathcal Y_t]$ is bounded in $t$, $P$-a.s., being an almost surely convergent martingale. Observe now that for any $\delta_1 > 0$ we can choose a sufficiently small $\delta$ such that $\sup_{|z_1-z_2|<\delta}|g(z_1) - g(z_2)| < \delta_1$, which implies that $\pi_t$ converges weakly to $\tilde\mu_t$, $P$-a.s., as $t\to\infty$. Using now the facts that $X_t^{\bar m,\bar P} - \tilde m_t\to 0$, $P$-a.s., and $P_t^{\bar P} - \tilde P_t\to 0$, we conclude that $\big[\pi_t(g) - N\big(X_t^{\bar m,\bar P}, P_t^{\bar P}\big)(g)\big] \xrightarrow{t\to\infty} 0$, $P$-a.s. □

Remark 4.3.
We have shown that the optimal filter is asymptotically proximal to the Gaussian distribution with mean and covariance given by solutions of (3) and (4) for arbitrary initial conditions. In contrast to the results in [17], our methods are not sufficient to prove exponential convergence in our case of zero system noise.

5. Small noise analysis
In this section, we study the small system noise behaviour of the linear filter. We initialise the noisy system and the zero noise system with the same initial conditions and study the behaviour of both solutions for the same set of observations. Consider the processes given by
$$x_t^\epsilon = x_0^\epsilon + \int_0^t A_sx_s^\epsilon\,ds + \epsilon\int_0^t F_s\,dV_s^\epsilon, \qquad (28)$$
$$y_t^\epsilon = \int_0^t C_sx_s^\epsilon\,ds + \int_0^t R_s^{1/2}\,dW_s^\epsilon, \qquad (29)$$
$$x_0^\epsilon \sim N(m_0, P_0), \quad t\ge 0,$$
where $V_t^\epsilon$ and $W_t^\epsilon$ are mutually independent $\mathcal F_t$-standard Brownian motions, and $x_0^\epsilon$ is independent of both $V_t^\epsilon$ and $W_t^\epsilon$. The corresponding Kalman-Bucy filter is
$$d\hat x_t^\epsilon = A_t\hat x_t^\epsilon\,dt + Q_t^\epsilon C_t^TR_t^{-1}\big(dy_t^\epsilon - C_t\hat x_t^\epsilon\,dt\big), \qquad (30)$$
$$\dot Q_t^\epsilon = A_tQ_t^\epsilon + Q_t^\epsilon A_t^T - Q_t^\epsilon C_t^TR_t^{-1}C_tQ_t^\epsilon + \epsilon^2F_tF_t^T, \qquad (31)$$
$$\hat x_0^\epsilon = m_0, \quad Q_0^\epsilon = Q_0, \qquad (32)$$
where $\hat x_t^\epsilon = E[x_t^\epsilon\,|\,\mathcal Y_t^\epsilon]$ with $\mathcal Y_t^\epsilon := \sigma\{y_s^\epsilon : 0\le s\le t\}$, and $Q_t^\epsilon = E[(x_t^\epsilon - \hat x_t^\epsilon)(x_t^\epsilon - \hat x_t^\epsilon)^T]$. We also define the new process $\hat x_t$ by
$$d\hat x_t = A_t\hat x_t\,dt + P_t^{Q_0}C_t^TR_t^{-1}\big(dy_t^\epsilon - C_t\hat x_t\,dt\big), \quad \hat x_0 = m_0.$$
Note that the above definition involves $y_t^\epsilon$, instead of $y_t$. To proceed further, an additional assumption is made in this analysis.

Assumption 5.1. $F_t$ is uniformly bounded in $t$, and $\dot z_t = \big(A_t - P_t^{Q_0}C_t^TR_t^{-1}C_t\big)z_t$ is exponentially stable. Sufficient conditions for the required exponential stability are given in [16].
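The convergence asserted in Theorem 5.2 below can be observed numerically. The scalar model in the sketch is a hypothetical choice: $A = -1$, $C = R = F = 1$, for which Assumption 5.1 holds ($\dot z = (-1 - p_t)z$ is exponentially stable), and both the $\epsilon$-filter (30)-(31) and the zero noise filter are driven by the same observation path $y^\epsilon$.

```python
import numpy as np

# Hypothetical scalar sketch of the small noise limit: A = -1, C = R = F = 1.
def filter_gap(eps, T=20.0, dt=1e-3, seed=1):
    """|xhat^eps_T - xhat_T| when both filters see the same path y^eps."""
    rng = np.random.default_rng(seed)
    x = rng.normal()                    # x_0^eps ~ N(0, 1)
    xh_eps, q = 0.0, 1.0                # filter (30) with Riccati (31)
    xh, p = 0.0, 1.0                    # zero noise filter, same y^eps
    for _ in range(int(T / dt)):
        dy = x * dt + np.sqrt(dt) * rng.normal()          # observation increment
        x += -x * dt + eps * np.sqrt(dt) * rng.normal()   # signal with eps noise
        xh_eps += -xh_eps * dt + q * (dy - xh_eps * dt)
        q += (-2.0 * q - q * q + eps * eps) * dt          # (31): eps^2 forcing
        xh += -xh * dt + p * (dy - xh * dt)
        p += (-2.0 * p - p * p) * dt                      # zero noise Riccati (4)
    return abs(xh_eps - xh)

gap_large, gap_small = filter_gap(0.5), filter_gap(0.05)
print(gap_large, gap_small)
assert gap_small < gap_large           # the two filters merge as eps -> 0
assert gap_small < 0.05
```

The gap shrinks roughly like $\epsilon^2$, consistent with the $O(\epsilon^2)$ bound on $Q_t^\epsilon - P_t^{Q_0}$ obtained in the proof of Theorem 5.2.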
Theorem 5.2. If $[A_t, C_t]$ is uniformly completely observable and Assumption 5.1 holds, then
$$P\Big(\lim_{\epsilon\to 0}\|\hat x_t^\epsilon - \hat x_t\| = 0,\ \forall\,t\ge 0\Big) = 1.$$

Proof.
Let us begin by observing that
$$\frac{d}{dt}\big(Q_t^\epsilon - P_t^{Q_0}\big) = B_t^{Q_0}\big(Q_t^\epsilon - P_t^{Q_0}\big) + \big(Q_t^\epsilon - P_t^{Q_0}\big)\big(B_t^{Q_0}\big)^T - \big(Q_t^\epsilon - P_t^{Q_0}\big)C_t^TR_t^{-1}C_t\big(Q_t^\epsilon - P_t^{Q_0}\big) + \epsilon^2F_tF_t^T, \quad Q_0^\epsilon - P_0^{Q_0} = 0, \qquad (33)$$
where $B_t^{Q_0} := A_t - P_t^{Q_0}C_t^TR_t^{-1}C_t$. If we define $\Delta P_t := Q_t^\epsilon - P_t^{Q_0}$ and let $\Psi_t^{Q_0}$ be such that $\dot\Psi_t^{Q_0} = B_t^{Q_0}\Psi_t^{Q_0}$, $\Psi_0^{Q_0} = I$, then
$$\frac{d}{dt}\Big[\big(\Psi_t^{Q_0}\big)^{-1}\Delta P_t\big(\Psi_t^{Q_0}\big)^{-T}\Big] = \big(\Psi_t^{Q_0}\big)^{-1}\big({-\Delta P_tC_t^TR_t^{-1}C_t\Delta P_t + \epsilon^2F_tF_t^T}\big)\big(\Psi_t^{Q_0}\big)^{-T},$$
so that
$$\Delta P_t = \Psi_t^{Q_0}\int_0^t\big(\Psi_s^{Q_0}\big)^{-1}\big({-\Delta P_sC_s^TR_s^{-1}C_s\Delta P_s + \epsilon^2F_sF_s^T}\big)\big(\Psi_s^{Q_0}\big)^{-T}ds\,\big(\Psi_t^{Q_0}\big)^T \le \epsilon^2\,\Psi_t^{Q_0}\int_0^t\big(\Psi_s^{Q_0}\big)^{-1}F_sF_s^T\big(\Psi_s^{Q_0}\big)^{-T}ds\,\big(\Psi_t^{Q_0}\big)^T. \qquad (34)$$
From the assumption of exponential stability, we have $\big\|\Psi_t^{Q_0}\big(\Psi_s^{Q_0}\big)^{-1}\big\|\le Ke^{-\alpha(t-s)}$ for some $K, \alpha > 0$ and all $t\ge s\ge 0$. Therefore, with $F := \sup_t\|F_t\|$,
$$0 \le \|\Delta P_t\| \le \frac{\epsilon^2K^2F^2}{2\alpha}.$$
Now, we consider the evolution equation for $\hat x_t^\epsilon - \hat x_t$:
$$d\big(\hat x_t^\epsilon - \hat x_t\big) = B_t^{Q_0}\big(\hat x_t^\epsilon - \hat x_t\big)dt + \Delta P_t\,C_t^TR_t^{-1}\big(dy_t^\epsilon - C_t\hat x_t^\epsilon\,dt\big), \quad \hat x_0^\epsilon - \hat x_0 = 0,$$
$$\hat x_t^\epsilon - \hat x_t = \int_0^t\Psi_t^{Q_0}\big(\Psi_s^{Q_0}\big)^{-1}\Delta P_s\,C_s^TR_s^{-1}\big(dy_s^\epsilon - C_s\hat x_s^\epsilon\,ds\big).$$
Define $u_t := \int_0^t\big(\Psi_s^{Q_0}\big)^{-1}\Delta P_s\,C_s^TR_s^{-1}\big(dy_s^\epsilon - C_s\hat x_s^\epsilon\,ds\big)$ and $\mathcal B_t := \sigma\big\{y_r^\epsilon - \int_0^rC_s\hat x_s^\epsilon\,ds : 0\le r\le t\big\}$. Clearly, $u_t$ is a $\mathcal B_t$-martingale, since the innovations process $y_t^\epsilon - \int_0^tC_s\hat x_s^\epsilon\,ds$ is (up to the factor $R_t^{1/2}$) a Brownian motion. For any given $\bar T\ge 0$ and $\lambda > 0$, applying Doob's inequality to the submartingale $|u_t|^2$, and using $\|\Delta P_s\|\le\epsilon^2K^2F^2/(2\alpha)$ together with the uniform bounds on $C_t$ and $R_t^{-1}$, we have
$$P\Big(\sup_{0\le t\le\bar T}|u_t|\ge\lambda\Big) \le \frac{E\big[|u_{\bar T}|^2\big]}{\lambda^2} \le \frac{\epsilon^4\,C_{\bar T}}{\lambda^2},$$
for a finite constant $C_{\bar T}$ depending only on $\bar T$, $K$, $\alpha$ and the uniform bounds in Assumptions 2.1 and 5.1. Since $\hat x_t^\epsilon - \hat x_t = \Psi_t^{Q_0}u_t$ and $\|\Psi_t^{Q_0}\|\le K$, it follows that
$$P\Big(\sup_{0\le t\le\bar T}\big|\hat x_t^\epsilon - \hat x_t\big|\ge K\lambda\Big) \le \frac{\epsilon^4\,C_{\bar T}}{\lambda^2}.$$
Now choose $\epsilon = \frac1n$, $n\in\mathbb N$, and $\lambda = \lambda_1/K$ (the arbitrariness of $\lambda$ is now in $\lambda_1$):
$$P\Big(\sup_{0\le t\le\bar T}\big|\hat x_t^{1/n} - \hat x_t\big|\ge\lambda_1\Big) \le \frac{C_{\bar T}K^2}{n^4\lambda_1^2}.$$
Since the right hand side is summable in $n$, the Borel-Cantelli lemma yields
$$P\Big(\lim_{n\to\infty}\big|\hat x_t^{1/n} - \hat x_t\big| = 0,\ \forall\,t\in[0,\bar T]\Big) = 1, \qquad \forall\,\bar T\ge 0. \qquad \square$$

Acknowledgements
ASR and AA would like to thank Amarjit Budhiraja and Eric S. Van Vleck for valuable discussions. ASR and AA would also like to thank the Statistical and Applied Mathematical Sciences Institute (SAMSI), Durham, NC, USA, where a part of the work was completed. ASR's visit to SAMSI was supported by the Infosys Foundation Excellence Program of ICTS. SV and AA acknowledge the generous support of the AIRBUS Group Corporate Foundation Chair in Mathematics of Complex Systems.

References

[1] B. D. Anderson and J. Moore, New results in linear system stability, SIAM Journal on Control, 7 (1969), pp. 398-414.
[2] B. D. Anderson and J. B. Moore, Optimal filtering, Prentice-Hall, Englewood Cliffs, 1979.
[3] M. Asch, M. Bocquet, and M. Nodet, Data assimilation: methods, algorithms, and applications, vol. 11, SIAM, 2016.
[4] A. N. Bishop and P. Del Moral, On the stability of Kalman-Bucy diffusion processes, SIAM Journal on Control and Optimization, 55 (2017), pp. 4015-4047.
[5] M. Bocquet, K. S. Gurumoorthy, A. Apte, A. Carrassi, C. Grudzien, and C. K. Jones, Degenerate Kalman filter error covariances and their convergence onto the unstable subspace, SIAM/ASA Journal on Uncertainty Quantification, 5 (2017), pp. 304-333.
[6] R. S. Bucy, Global theory of the Riccati equation, Journal of Computer and System Sciences, 1 (1967), pp. 349-361.
[7] A. Carrassi, A. Trevisan, L. Descamps, O. Talagrand, and F. Uboldi, Controlling instabilities along a 3DVar analysis cycle by assimilating in the unstable subspace: a comparison with the EnKF, arXiv:0804.2136, (2008).
[8] F. Cérou, Long time behavior for some dynamical noise free nonlinear filtering problems, SIAM Journal on Control and Optimization, 38 (2000), pp. 1086-1101.
[9] A. D'Aristotile, P. Diaconis, and D. Freedman, On merging of probabilities, Sankhyā, 50 (1988), pp. 363-380.
[10] S. J. Fletcher, Data Assimilation for the Geosciences: From Theory to Application, Elsevier, 2017.
[11] K. S. Gurumoorthy, C. Grudzien, A. Apte, A. Carrassi, and C. K. Jones, Rank deficiency of Kalman error covariance matrices in linear time-varying system with deterministic evolution, SIAM Journal on Control and Optimization, 55 (2017), pp. 741-759.
[12] A. H. Jazwinski, Stochastic processes and filtering theory, Courier Corporation, 2007.
[13] R. E. Kalman and R. S. Bucy, New results in linear filtering and prediction theory, Journal of Basic Engineering, 83 (1961), pp. 95-108.
[14] I. Karatzas and S. Shreve, Brownian motion and stochastic calculus, vol. 113, Springer Science & Business Media, 2012.
[15] A. M. Makowski, Filtering formulae for partially observed linear systems with non-Gaussian initial conditions, Stochastics: An International Journal of Probability and Stochastic Processes, 16 (1986), pp. 1-24.
[16] B. Ni and Q. Zhang, Stability of the Kalman filter for continuous time output error systems, Systems & Control Letters, 94 (2016), pp. 172-180.
[17] D. Ocone and E. Pardoux, Asymptotic stability of the optimal filter with respect to its initial condition, SIAM Journal on Control and Optimization, 34 (1996), pp. 226-243.
[18] L. Palatella, A. Carrassi, and A. Trevisan, Lyapunov vectors and assimilation in the unstable subspace: theory and applications, Journal of Physics A: Mathematical and Theoretical, 46 (2013), p. 254020.
[19] S. Särkkä, Bayesian filtering and smoothing, vol. 3, Cambridge University Press, 2013.
[20] S. Sastry and M. Bodson, Adaptive control: stability, convergence and robustness, Courier Corporation, 2011.
[21] A. Trevisan and L. Palatella, On the Kalman filter error covariance collapse into the unstable subspace, Nonlinear Processes in Geophysics, 18 (2011), pp. 243-250.
[22] J. Xiong, An introduction to stochastic filtering theory, vol. 18, Oxford University Press, 2008.