Kalman Filter from the Mutual Information Perspective
Yarong Luo ([email protected]), Hu ([email protected]), Guo ([email protected])
GNSS Research Center, Wuhan University
Abstract
The Kalman filter is the best linear unbiased state estimator. It is also comprehensible from the viewpoint of Bayesian estimation. However, this note gives a detailed derivation of the Kalman filter from the mutual information perspective for the first time. We then extend this result to the Rényi mutual information. Finally, we draw the conclusion that the measurement update of the Kalman filter is the key step in minimizing the uncertainty of the state of the dynamical system.
Key Words
Kalman filter, mutual information, Rényi mutual information, uncertainty, measurement update
1. Introduction
The Kalman filter has been widely used in various fields as an effective state estimator, e.g. for integrated navigation [1] and robotics [2]. The classical Kalman filter can be derived as a best linear unbiased estimate [1], and it is easy to understand from the probabilistic perspective [2]. Recently, the Kalman filter has also been presented using the methods of maximum relative entropy [3] and the temporal derivative of the Rényi entropy [4], which go beyond the general Bayesian filter. More and more evidence shows that the Kalman filter can be regarded as a direct extension of information theory. This note gives a new perspective on the Kalman filter from mutual information, which further bridges the gap between optimal state estimation and information theory. The main contribution of this note is to derive the Kalman filter from the perspective of mutual information and to extend this result to the Rényi mutual information case.
2. Kalman Filter from the Mutual Information
Consider the following discrete-time state-space model:

$$X_k = \Phi_{k|k-1} X_{k-1} + \Gamma_{k|k-1} W_{k-1} \qquad (1)$$
$$Z_k = H_k X_k + V_k \qquad (2)$$

where $X_k$ is the $n$-dimensional state vector; $Z_k$ is the $m$-dimensional measurement vector; $\Phi_{k|k-1}$, $\Gamma_{k|k-1}$ and $H_k$ are the known system structure parameters, called the $n \times n$ one-step state transition matrix, the $n \times l$ system noise distribution matrix, and the $m \times n$ measurement matrix, respectively; $W_{k-1}$ is the $l$-dimensional system noise vector, and $V_k$ is the $m$-dimensional measurement noise vector. Both are zero-mean Gaussian noise vector sequences, independent of each other:

$$E[W_k] = 0, \quad E[W_k W_j^T] = Q_k \delta_{kj} \qquad (3)$$
$$E[V_k] = 0, \quad E[V_k V_j^T] = R_k \delta_{kj} \qquad (4)$$
$$E[W_k V_j^T] = 0 \qquad (5)$$

The one-step prediction covariance matrix is denoted $\Sigma_{k|k-1}$. The state estimate at $t_k$ is denoted $\mathcal{N}(\hat{X}_k, \Sigma_k)$, where $\hat{X}_k$ is the mean of the estimated state and $\Sigma_k$ is the covariance matrix of the estimation error. Assume the optimal estimate of the state can be calculated as

$$\hat{X}_k = \hat{X}_{k|k-1} + K_k \tilde{Z}_{k|k-1} \qquad (6)$$

where $K_k$ is the undetermined correction factor matrix, $\hat{X}_{k|k-1}$ is the one-step state prediction, and $\tilde{Z}_{k|k-1} = Z_k - H_k \hat{X}_{k|k-1}$ is the measurement one-step prediction error.

Then the mean square error matrix of the state estimate $\hat{X}_k$ is given by [1]

$$\Sigma_k = (I - K_k H_k)\Sigma_{k|k-1}(I - K_k H_k)^T + K_k R_k K_k^T \qquad (7)$$

The mean square error matrix $\Sigma_k$ is positive definite, since $(I - K_k H_k)\Sigma_{k|k-1}(I - K_k H_k)^T$ is positive definite and $K_k R_k K_k^T$ is positive semidefinite.

A joint Gaussian distribution can be expressed as

$$p(X, Y) \sim \mathcal{N}\left( \begin{bmatrix} \hat{X} \\ \hat{Y} \end{bmatrix}, \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} \right) \qquad (8)$$

where $X \sim \mathcal{N}(\hat{X}, \Sigma_{xx})$ and $Y \sim \mathcal{N}(\hat{Y}, \Sigma_{yy})$.

The mutual information for a joint Gaussian PDF can be represented by

$$I(X, Y) = H(X) + H(Y) - H(X, Y) = H(X) - H(X|Y) = \frac{1}{2}\ln\left((2\pi e)^N \det\Sigma_{xx}\right) + \frac{1}{2}\ln\left((2\pi e)^M \det\Sigma_{yy}\right) - \frac{1}{2}\ln\left((2\pi e)^{M+N} \det\Sigma\right) = -\frac{1}{2}\ln\left(\frac{\det\Sigma}{\det\Sigma_{xx}\det\Sigma_{yy}}\right) \qquad (9)$$

where $H(X) = \frac{1}{2}\ln\left((2\pi e)^N \det\Sigma_{xx}\right)$ is the entropy of a Gaussian random variable, and

$$\det\Sigma = \det\Sigma_{xx}\det\left(\Sigma_{yy} - \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\right) = \det\Sigma_{yy}\det\left(\Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\right) \qquad (10)$$

Therefore, the mutual information describes the reduction of the uncertainty in variable $X$ due to gaining knowledge of variable $Y$.

Similarly, the mutual information at time step $t_k$ can easily be computed from the a priori PDF, the a posteriori PDF and the Kalman gain $K_k$ as

$$I(\hat{X}_{k|k-1}, Z_k) = \frac{1}{2}\ln\left(\frac{\det\Sigma_{k|k-1}}{\det\Sigma_k}\right) = \frac{1}{2}\ln\left(\frac{\det\Sigma_{k|k-1}}{\det\left((I - K_k H_k)\Sigma_{k|k-1}(I - K_k H_k)^T + K_k R_k K_k^T\right)}\right) \qquad (11)$$

It describes the reduction of the uncertainty of the state due to gaining knowledge from the measurement $Z_k$. Consequently, we want to maximize the mutual information $I(\hat{X}_{k|k-1}, Z_k)$. Obviously, maximizing the mutual information is equivalent to minimizing $\ln\det\Sigma_k$. Since equation (11) is a function of the unknown factor matrix $K_k$, the maximization can be carried out by taking its derivative with respect to $K_k$ and setting it equal to zero:

$$\frac{dI(\hat{X}_{k|k-1}, Z_k)}{dK_k} = -\frac{1}{2}\Sigma_k^{-T}\frac{d\Sigma_k}{dK_k} = -\frac{1}{2}\Sigma_k^{-T}\left(-2(I - K_k H_k)\Sigma_{k|k-1}H_k^T + 2K_k R_k\right) = 0 \qquad (12)$$

where $\partial \ln\det X / \partial X = X^{-T}$ [5] has been used. Solving for $K_k$ gives

$$K_k = \Sigma_{k|k-1}H_k^T\left(H_k\Sigma_{k|k-1}H_k^T + R_k\right)^{-1} \qquad (13)$$

A subsequent derivative of equation (12) must be taken to check for a maximum, that is,

$$\frac{d}{dK_k}\left(-\frac{1}{2}\Sigma_k^{-T}\left(-2(I - K_k H_k)\Sigma_{k|k-1}H_k^T + 2K_k R_k\right)\right) \qquad (14)$$

Substituting equation (12) into the above equation results in

$$\frac{d}{dK_k}\left(-\frac{1}{2}\Sigma_k^{-T}\left(-2(I - K_k H_k)\Sigma_{k|k-1}H_k^T + 2K_k R_k\right)\right) = -\Sigma_k^{-T}\left(H_k\Sigma_{k|k-1}H_k^T + R_k\right) \qquad (15)$$

which is always negative definite by the definition of the covariance matrices $R_k$ and $H_k\Sigma_{k|k-1}H_k^T$, ensuring the solution for $K_k$ is a maximum.
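The derivation above can be checked numerically. The following sketch (not from the paper; the dimensions and random covariances are illustrative choices) builds the Joseph-form posterior covariance of equation (7), evaluates the mutual information of equation (11), and verifies that the gain of equation (13) attains the maximum over randomly perturbed gains:

```python
# Numerical check: the Kalman gain of Eq. (13) maximizes the mutual
# information I = 0.5 * ln(det Sigma_prior / det Sigma_post) of Eq. (11).
import numpy as np

def posterior_cov(K, Sigma_pred, H, R):
    """Joseph-form covariance (I-KH) Sigma (I-KH)^T + K R K^T, Eq. (7)."""
    A = np.eye(Sigma_pred.shape[0]) - K @ H
    return A @ Sigma_pred @ A.T + K @ R @ K.T

def mutual_info(K, Sigma_pred, H, R):
    """I(X_pred, Z) = 0.5 * ln(det Sigma_pred / det Sigma_post), Eq. (11)."""
    Sigma_post = posterior_cov(K, Sigma_pred, H, R)
    return 0.5 * np.log(np.linalg.det(Sigma_pred) / np.linalg.det(Sigma_post))

rng = np.random.default_rng(0)
n, m = 3, 2                                   # illustrative dimensions
A_ = rng.standard_normal((n, n))
Sigma_pred = A_ @ A_.T + n * np.eye(n)        # random SPD prior covariance
H = rng.standard_normal((m, n))               # measurement matrix
B_ = rng.standard_normal((m, m))
R = B_ @ B_.T + m * np.eye(m)                 # random SPD measurement noise

# Optimal gain from Eq. (13)
K_opt = Sigma_pred @ H.T @ np.linalg.inv(H @ Sigma_pred @ H.T + R)
I_opt = mutual_info(K_opt, Sigma_pred, H, R)

# No perturbed gain achieves a larger mutual information
for _ in range(100):
    K_pert = K_opt + 0.1 * rng.standard_normal((n, m))
    assert mutual_info(K_pert, Sigma_pred, H, R) <= I_opt + 1e-12
```

Note that with $K_k = 0$ the posterior equals the prior and the mutual information is zero: the measurement update is exactly the step that reduces the state uncertainty.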
3. Kalman Filter from the Rényi Mutual Information
Moreover, the Rényi mutual information of a joint Gaussian PDF can be calculated similarly to equation (9):

$$I_R^\alpha(X, Y) = H_R^\alpha(X) + H_R^\alpha(Y) - H_R^\alpha(X, Y) = \frac{1}{2}\ln\left((2\pi)^N \alpha^{\frac{N}{\alpha-1}} \det\Sigma_{xx}\right) + \frac{1}{2}\ln\left((2\pi)^M \alpha^{\frac{M}{\alpha-1}} \det\Sigma_{yy}\right) - \frac{1}{2}\ln\left((2\pi)^{N+M} \alpha^{\frac{N+M}{\alpha-1}} \det\Sigma\right) = \frac{1}{2}\ln\left(\frac{\det\Sigma_{xx}\det\Sigma_{yy}}{\det\Sigma}\right) = I(X, Y) \qquad (16)$$

where $H_R^\alpha(X) = \frac{1}{2}\ln\left((2\pi)^N \alpha^{\frac{N}{\alpha-1}} \det\Sigma_{xx}\right)$ is the Rényi entropy of order $\alpha$ of a continuous random variable with a multivariate Gaussian PDF. Consequently, the mutual information is the same as the Rényi mutual information for a joint Gaussian PDF. Similarly, we can obtain the same result as equation (13) from the Rényi mutual information.
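The cancellation of the $\alpha$-dependent terms in equation (16) can also be verified numerically. The sketch below (illustrative, not from the paper; the joint covariance is randomly generated) evaluates the Gaussian Rényi entropies for several orders $\alpha$ and confirms that the resulting Rényi mutual information always equals the Shannon mutual information:

```python
# Check: for a joint Gaussian, the order-alpha Renyi mutual information
# is independent of alpha and equals the Shannon MI of Eq. (9)/(16).
import numpy as np

def renyi_entropy_gauss(Sigma, alpha):
    """H_R^alpha of a Gaussian: 0.5*ln((2*pi)^N * alpha^(N/(alpha-1)) * det Sigma)."""
    N = Sigma.shape[0]
    return 0.5 * np.log((2 * np.pi) ** N
                        * alpha ** (N / (alpha - 1))
                        * np.linalg.det(Sigma))

rng = np.random.default_rng(1)
N, M = 2, 2                                   # illustrative block sizes
A = rng.standard_normal((N + M, N + M))
S = A @ A.T + (N + M) * np.eye(N + M)         # random SPD joint covariance
Sxx, Syy = S[:N, :N], S[N:, N:]               # marginal covariances

# Shannon MI from Eq. (9): 0.5 * ln(det Sxx * det Syy / det S)
shannon_mi = 0.5 * np.log(np.linalg.det(Sxx) * np.linalg.det(Syy)
                          / np.linalg.det(S))

for alpha in (0.5, 2.0, 5.0):
    renyi_mi = (renyi_entropy_gauss(Sxx, alpha)
                + renyi_entropy_gauss(Syy, alpha)
                - renyi_entropy_gauss(S, alpha))
    assert np.isclose(renyi_mi, shannon_mi)   # the alpha terms cancel, Eq. (16)
```

The exponents $N/(\alpha-1) + M/(\alpha-1) - (N+M)/(\alpha-1)$ sum to zero, which is why the order $\alpha$ drops out of the difference of entropies.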
4. Conclusions
In this paper, the Kalman filter is derived from the perspective of mutual information and extended to the Rényi mutual information case. We show that the measurement update of the Kalman filter can minimize the uncertainty of the state by formulating it as the mutual information between the evolving state and the measurement and maximizing that mutual information. Furthermore, we can think of the Kalman filter, a little more radically, as an extension of information theory.
Acknowledgement
This research was supported by a grant from the National Key Research and Development Program of China (2018YFB1305001). We express our thanks to the GNSS Center, Wuhan University.
References

[1] Y. Gongmin and W. Jun, Lectures on Strapdown Inertial Navigation Algorithm and Integrated Navigation Principles. Northwestern Polytechnical University Press: Xi'an, China, 2019.
[2] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. MIT Press, 2005.
[3] A. Giffin and R. Urniezius, "The Kalman filter revisited using maximum relative entropy," Entropy, vol. 16, no. 2, pp. 1047–1069, 2014.
[4] Y. Luo, C. Guo, S. You, and J. Liu, "A novel perspective of the Kalman filter from the Rényi entropy," Entropy, vol. 22, no. 9, p. 982, 2020.
[5] X.-D. Zhang,