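Ministry of Science and Technology Funded Research Project Report
Final Report
Bayesian Analysis of Dynamic Systems (Year 2)
Project type: Individual project
Project number: MOST 102-2118-M-004-003-MY2
Project period: August 1, 2014 to August 31, 2015 (ROC years 103 to 104)
Institution: Department of Statistics, National Chengchi University
Principal investigator: 翁久幸
Project participants: Master's student, part-time research assistant: 林冠謀
Master's student, part-time research assistant: 楊智博
Master's student, part-time research assistant: 陳弘叡
Report attachments: Report on attending an international conference, and published paper
Handling:
1. Public information: this project is open to public inquiry
2. Has this research produced findings that seriously harm the public interest: No
3. Is this report recommended as a policy reference for government agencies: No
September 8, 2015 (ROC year 104)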
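Chinese abstract (translated): Over the past several years we have explored the Woodroofe-Stein identity and its applications to sequential analysis and Bayesian statistics in considerable depth. This project focuses on its application to dynamic models and its connection with the Kalman filter. We obtain a new method for deriving the Kalman gain. A by-product of this project is a further extension of the current Woodroofe-Stein identity, which yields a multivariate Gram-Charlier series.
Chinese keywords (translated): Bayesian analysis, dynamic systems, Kalman filtering, Woodroofe-Stein identity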
English abstract: In the past years we have explored Woodroofe-Stein's identity and its applications to sequential analysis and Bayesian statistics. This project focused on its application to dynamic models and its relation to the Kalman filter. We provide a new approach to obtain the Kalman gain. A by-product of this project is to take the present Woodroofe-Stein's identity one step further to obtain the multivariate Gram-Charlier series.
English keywords: Bayesian inference, dynamic systems, Kalman filter, Woodroofe-Stein's identity
Final report for project: Bayesian inference for some
dynamic systems
1 Introduction
The present project explored Woodroofe-Stein's identity and its applications to dynamic systems. First, it is shown that the current version of Woodroofe-Stein's identity can be taken one step further to a series, from which a Gram-Charlier type expansion for multivariate densities can be obtained. This result has appeared in [3]. Secondly, for the applications to dynamic systems, it is shown that this identity gives a novel derivation of the Kalman gain. The details are given in the next section.
2 Kalman filter revisited
The Kalman filter [2] is a recursive method that estimates the latent state of a linear dynamic system. Consider the following linear dynamic model:
$$X_t = A X_{t-1} + W_{t-1}, \tag{1}$$
$$Y_t = H X_t + V_t, \tag{2}$$
where $X_t$ is the unobserved state variable and $Y_t$ is the measurement variable, both at time $t$. Suppose that $X_t$ and $W_t$ are $n$-dimensional vectors, $Y_t$ and $V_t$ are $m$-dimensional vectors, $A$ is an $n \times n$ matrix, $H$ is an $m \times n$ matrix, $W_t \sim N(0, Q)$, $V_t \sim N(0, R)$, $Q$ is $n \times n$, $R$ is $m \times m$, both $W_t$ and $V_t$ are independent of $X_t$, and $A$, $H$, $Q$, $R$ are known. Let $D_t = \{Y_1, \ldots, Y_t\}$ be the collection of data up to time $t$. The state of the system can be represented by the conditional mean and conditional covariance. Before observing the measurement $Y_t$, the a priori estimate of $X_t$ and the error covariance matrix are defined as
$$\hat{X}_t^- = E(X_t \mid D_{t-1}), \qquad P_t^- = E[(X_t - \hat{X}_t^-)(X_t - \hat{X}_t^-)^T \mid D_{t-1}]. \tag{3}$$
Given knowledge of $Y_t$, the a posteriori state estimate and the a posteriori estimate error covariance are
$$\hat{X}_t = E(X_t \mid D_t), \tag{4}$$
$$P_t = E[(X_t - \hat{X}_t)(X_t - \hat{X}_t)^T \mid D_t]. \tag{5}$$
Prediction step:
  $\hat{X}_t^- = A \hat{X}_{t-1}$
  $P_t^- = A P_{t-1} A^T + Q$
Update step:
  $K_t = P_t^- H^T (H P_t^- H^T + R)^{-1}$
  $\hat{X}_t = \hat{X}_t^- + K_t (Y_t - H \hat{X}_t^-)$
  $P_t = (I_n - K_t H) P_t^-$
Table 1: Kalman filter
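The recursion in Table 1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not code from the project: the measurement is taken to be scalar ($m = 1$), so that the inverse in the Kalman gain becomes a division, and the constant-velocity model, the noise levels, and the measurement values are all made up. The helper names (`matmul`, `predict`, `update`) are hypothetical.

```python
# Sketch of the Kalman filter cycle of Table 1 in plain Python.
# Assumption: scalar measurement (m = 1), so (H P^- H^T + R)^{-1} is a division.

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def add(a, b, sign=1.0):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def predict(x_post, p_post, A, Q):
    """Prediction step: x^- = A x and P^- = A P A^T + Q."""
    x_prior = matmul(A, x_post)
    p_prior = add(matmul(matmul(A, p_post), transpose(A)), Q)
    return x_prior, p_prior

def update(x_prior, p_prior, y, H, R):
    """Update step for a scalar measurement y with noise variance R."""
    pht = matmul(p_prior, transpose(H))          # P^- H^T, an n x 1 column
    s = matmul(H, pht)[0][0] + R                 # H P^- H^T + R, a scalar here
    K = [[v[0] / s] for v in pht]                # Kalman gain, n x 1
    innovation = y - matmul(H, x_prior)[0][0]    # Y_t - H x^-
    x_post = [[xp[0] + k[0] * innovation] for xp, k in zip(x_prior, K)]
    n = len(p_prior)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    p_post = matmul(add(identity, matmul(K, H), sign=-1.0), p_prior)
    return x_post, p_post, K

# Hypothetical constant-velocity example: state = (position, velocity).
A = [[1.0, 1.0], [0.0, 1.0]]
Q = [[0.01, 0.0], [0.0, 0.01]]
H = [[1.0, 0.0]]                  # only the position is observed
R = 0.5
x, P = [[0.0], [1.0]], [[1.0, 0.0], [0.0, 1.0]]
for y in [1.1, 1.9, 3.2, 4.0]:    # made-up measurements
    x_prior, P_prior = predict(x, P, A, Q)
    x, P, K = update(x_prior, P_prior, y, H, R)
```

After each pass through the loop, `x` and `P` play the role of $(\hat{X}_t, P_t)$ and feed the next prediction step.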
The Kalman filter propagates the state variable from time $t-1$ to time $t$. The filter consists of two steps: the prediction step and the update step. The prediction step goes from $(\hat{X}_{t-1}, P_{t-1})$ to $(\hat{X}_t^-, P_t^-)$, and the update step from $(\hat{X}_t^-, P_t^-)$ to $(\hat{X}_t, P_t)$. The prediction and update equations are given in Table 1. The $n \times m$ matrix $K_t$ in the update step is called the Kalman gain. From the state equation (1) and the prior knowledge $(\hat{X}_{t-1}, P_{t-1})$ on $X_{t-1}$, the prediction equations can be easily obtained. The derivation of the update equations is more complicated. They can be derived by first applying Bayes' rule to the posterior density of $X_t$ given $D_t$, expanding the numerator and denominator in the expression, rearranging all the terms, and employing the Matrix Inversion Lemma. Another approach starts by writing the a posteriori state estimate $\hat{X}_t$ as a linear combination of the a priori estimate $\hat{X}_t^-$ and the difference between the actual measurement $Y_t$ and its prediction $H\hat{X}_t^-$,
$$\hat{X}_t = \hat{X}_t^- + K_t (Y_t - H\hat{X}_t^-), \tag{6}$$
and then determining the $n \times m$ matrix $K_t$ by minimizing the mean-squared error
$$E((X_t - \hat{X}_t)^T (X_t - \hat{X}_t) \mid D_t).$$
The minimization involves substituting (6) into (5), differentiating the trace of $P_t$ with respect to $K_t$, and setting the derivative equal to zero to solve for $K_t$. The resulting $K_t$ is
$$K_t = P_t^- H^T (H P_t^- H^T + R)^{-1}. \tag{7}$$
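The differentiation step can be spelled out as follows. This is a sketch that assumes the expansion of $P_t$ stated as (8) in this section and the symmetry of $P_t^-$ and $R$; the shorthand $S_t$ is introduced here only for illustration and does not appear in the original derivation.

```latex
\begin{align*}
S_t &:= H P_t^- H^T + R, \\
\operatorname{tr}(P_t) &= \operatorname{tr}(P_t^-) - 2\operatorname{tr}(K_t H P_t^-)
    + \operatorname{tr}(K_t S_t K_t^T), \\
\frac{\partial \operatorname{tr}(P_t)}{\partial K_t} &= -2 P_t^- H^T + 2 K_t S_t = 0
    \quad\Longrightarrow\quad K_t = P_t^- H^T S_t^{-1}.
\end{align*}
```

The second line uses $\operatorname{tr}(P_t^- H^T K_t^T) = \operatorname{tr}(K_t H P_t^-)$, and the mean-squared error equals $\operatorname{tr}(P_t)$, so the stationary point is the gain in (7).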
With the Kalman gain $K_t$, the posterior covariance matrix can be derived rather straightforwardly. To begin, write
$$\begin{aligned}
P_t &= E[(X_t - \hat{X}_t)(X_t - \hat{X}_t)^T \mid D_t] \\
&= E\{[(X_t - \hat{X}_t^-) - K_t(HX_t + V_t - H\hat{X}_t^-)][(X_t - \hat{X}_t^-) - K_t(HX_t + V_t - H\hat{X}_t^-)]^T \mid D_t\} \\
&= P_t^- - P_t^- H^T K_t^T - K_t H P_t^- + K_t (H P_t^- H^T + R) K_t^T,
\end{aligned} \tag{8}$$
where the last line follows from (3) and the independence of $X_t$ and $V_t$. Then, substituting (7) into (8) gives the update of the error covariance estimate $P_t = (I - K_t H) P_t^-$. For details of this approach, see Brown and Hwang [1].
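The two claims in this derivation (that the gain (7) minimizes the trace of (8), and that (8) then collapses to $(I - K_t H)P_t^-$) can be checked numerically. The sketch below is a plain-Python illustration for the assumed case $n = 2$, $m = 1$; the matrices and the perturbations are made-up values, and the function names are hypothetical.

```python
# Numerical check of (7) and (8) for n = 2, m = 1 (assumed sizes, made-up numbers).
# H is stored as a list of n entries (a row), K as a list of n entries (a column).

def cov_eq8(P, H, R, K):
    """Right-hand side of (8): P^- - P^-H^T K^T - K H P^- + K (H P^- H^T + R) K^T."""
    n = len(P)
    PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]
    HP = [sum(H[i] * P[i][j] for i in range(n)) for j in range(n)]
    s = sum(H[i] * PHt[i] for i in range(n)) + R
    return [[P[i][j] - PHt[i] * K[j] - K[i] * HP[j] + K[i] * s * K[j]
             for j in range(n)] for i in range(n)]

def gain_eq7(P, H, R):
    """Kalman gain (7); the inverse is a division because m = 1."""
    n = len(P)
    PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]
    s = sum(H[i] * PHt[i] for i in range(n)) + R
    return [v / s for v in PHt]

def cov_short(P, H, K):
    """Short form (I - K H) P^-."""
    n = len(P)
    HP = [sum(H[i] * P[i][j] for i in range(n)) for j in range(n)]
    return [[P[i][j] - K[i] * HP[j] for j in range(n)] for i in range(n)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

P_prior = [[2.0, 0.3], [0.3, 1.0]]
H = [1.0, 0.5]
R = 0.4

K_opt = gain_eq7(P_prior, H, R)
P_opt = cov_eq8(P_prior, H, R, K_opt)
P_short = cov_short(P_prior, H, K_opt)

# Largest entrywise gap between (8) evaluated at K_opt and (I - K H) P^-.
max_gap = max(abs(P_opt[i][j] - P_short[i][j]) for i in range(2) for j in range(2))

# Trace of (8) at a few perturbed gains; each should exceed the trace at K_opt.
perturbed_traces = [
    trace(cov_eq8(P_prior, H, R, [K_opt[0] + d0, K_opt[1] + d1]))
    for d0, d1 in [(0.1, 0.0), (0.0, -0.1), (-0.05, 0.05)]
]
```

Expression (8) holds for any gain, which is why it can be evaluated at the perturbed gains; the short form is valid only at the optimal gain.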
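Now we show how to derive the update step by Stein's equation. First, write the posterior density of $X_t$ given $D_t$ as
$$\begin{aligned}
p(x_t \mid D_t) = p(x_t \mid y_t, D_{t-1}) &\propto p(x_t \mid D_{t-1})\, p(y_t \mid x_t, D_{t-1}) \\
&= \text{prior} \times \text{likelihood} \propto \phi(x_t; \hat{x}_t^-, P_t^-)\, p(y_t \mid x_t).
\end{aligned}$$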
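Next, let $\Sigma_t$ satisfy $(P_t^-)^{-1} = \Sigma_t^T \Sigma_t$ and define
$$Z_t = \Sigma_t (X_t - \hat{X}_t^-). \tag{9}$$
So, $Z_t \sim N(0, I_n)$, where $I_n$ is the $n \times n$ identity matrix. The posterior density of $Z_t$ given $D_t$ is
$$p(z_t \mid D_t) \propto f(z_t)\, \phi_n(z_t), \tag{10}$$
where
$$f(z_t) = \exp\left\{-\tfrac{1}{2}\,(y_t - H\Sigma_t^{-1} z_t - H\hat{x}_t^-)^T R^{-1} (y_t - H\Sigma_t^{-1} z_t - H\hat{x}_t^-)\right\}. \tag{11}$$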
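The density (10) is of the form for Stein's equation. Therefore,
$$E(Z_t \mid D_t) = E\left[\frac{\nabla_{z_t} f(Z_t)}{f(Z_t)} \,\Big|\, D_t\right] = E\left[(\Sigma_t^{-1})^T H^T R^{-1} (Y_t - H\Sigma_t^{-1} Z_t - H\hat{X}_t^-) \,\big|\, D_t\right];$$
and collecting terms involving $E(Z_t \mid D_t)$ yields
$$\left[I_n + (\Sigma_t^{-1})^T H^T R^{-1} H \Sigma_t^{-1}\right] E(Z_t \mid D_t) = (\Sigma_t^{-1})^T H^T R^{-1} (Y_t - H\hat{X}_t^-). \tag{12}$$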
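The identity used in this step, $E(Z_t \mid D_t) = E[\nabla f(Z_t)/f(Z_t) \mid D_t]$ for a posterior density proportional to $f(z)\phi(z)$, can be checked numerically in the scalar case. The sketch below assumes $n = m = 1$ and made-up parameter values, and approximates the posterior expectations with a simple Riemann sum on a grid; it is an illustration, not part of the original derivation.

```python
import math

# Hypothetical scalar setting (n = m = 1); all numbers are made up.
p_prior, x_prior = 2.0, 0.4        # P_t^- and the a priori estimate
h, r, y = 1.5, 0.5, 1.2            # H, R and the observed Y_t
sigma = 1.0 / math.sqrt(p_prior)   # Sigma_t, so that sigma**2 == 1 / p_prior
a = h / sigma                      # H * Sigma_t^{-1}

def f(z):
    """Likelihood factor (11) as a function of z."""
    resid = y - a * z - h * x_prior
    return math.exp(-0.5 * resid * resid / r)

def score(z):
    """Gradient of log f, that is f'(z) / f(z)."""
    return a * (y - a * z - h * x_prior) / r

def posterior_mean(g, lo=-10.0, hi=10.0, steps=20000):
    """E[g(Z) | D_t] under the density proportional to f(z) * phi(z), by a Riemann sum."""
    dz = (hi - lo) / steps
    num = den = 0.0
    for k in range(steps + 1):
        z = lo + k * dz
        w = f(z) * math.exp(-0.5 * z * z)   # normalizing constants cancel in the ratio
        num += g(z) * w
        den += w
    return num / den

lhs = posterior_mean(lambda z: z)   # E(Z_t | D_t)
rhs = posterior_mean(score)         # E(f'(Z_t) / f(Z_t) | D_t)

# Scalar solution of (12), then the back-transform through (9).
closed_form = (a * (y - h * x_prior) / r) / (1.0 + a * a / r)
x_post = x_prior + lhs / sigma

# Usual Kalman update from Table 1 for comparison.
kalman = x_prior + (p_prior * h / (h * h * p_prior + r)) * (y - h * x_prior)
```

In this example `lhs`, `rhs`, and `closed_form` agree up to numerical error, and `x_post` matches the standard Kalman update `kalman`.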
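Now, from (4), (9), (12), and the property $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$, we obtain
$$\begin{aligned}
\hat{X}_t = E(X_t \mid D_t) &= \hat{X}_t^- + \Sigma_t^{-1} E(Z_t \mid D_t) \\
&= \hat{X}_t^- + \Sigma_t^{-1} \left[I_n + (\Sigma_t^{-1})^T H^T R^{-1} H \Sigma_t^{-1}\right]^{-1} (\Sigma_t^{-1})^T H^T R^{-1} (Y_t - H\hat{X}_t^-) \\
&= \hat{X}_t^- + \left[(P_t^-)^{-1} + H^T R^{-1} H\right]^{-1} H^T R^{-1} (Y_t - H\hat{X}_t^-),
\end{aligned}$$
which has the same form as (6). Then, the desired expression of the a posteriori state estimate can be derived by an application of the Matrix Inversion Lemma,
$$\left[(P_t^-)^{-1} + H^T R^{-1} H\right]^{-1} H^T R^{-1} = P_t^- H^T (H P_t^- H^T + R)^{-1} \equiv K_t.$$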
Once $K_t$ is available, the posterior covariance matrix can be derived as shown earlier in this section.
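The Matrix Inversion Lemma step can be sanity-checked numerically. The sketch below assumes $n = 2$ and $m = 1$ with made-up values for $P_t^-$, $H$, and $R$, and compares the gain computed from $[(P_t^-)^{-1} + H^T R^{-1} H]^{-1} H^T R^{-1}$ with the gain computed from $P_t^- H^T (H P_t^- H^T + R)^{-1}$. The helper `inv2` and the variable names are hypothetical.

```python
# Check of the Matrix Inversion Lemma step for n = 2, m = 1 (assumed sizes).

def inv2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

P_prior = [[2.0, 0.3], [0.3, 1.0]]   # made-up P_t^-
H = [1.0, 0.5]                       # 1 x 2 measurement matrix stored as a row
R = 0.4                              # scalar measurement variance

# Left-hand side: [(P^-)^{-1} + H^T R^{-1} H]^{-1} H^T R^{-1}.
P_inv = inv2(P_prior)
information = [[P_inv[i][j] + H[i] * H[j] / R for j in range(2)] for i in range(2)]
information_inv = inv2(information)
K_information = [sum(information_inv[i][j] * H[j] for j in range(2)) / R
                 for i in range(2)]

# Right-hand side: P^- H^T (H P^- H^T + R)^{-1}; the inverse is a division here.
PHt = [sum(P_prior[i][j] * H[j] for j in range(2)) for i in range(2)]
s = sum(H[i] * PHt[i] for i in range(2)) + R
K_covariance = [v / s for v in PHt]

max_diff = max(abs(u - v) for u, v in zip(K_information, K_covariance))
```

The left-hand form inverts an $n \times n$ matrix and the right-hand form an $m \times m$ matrix, so the second is cheaper whenever the measurement dimension is smaller than the state dimension.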
3 Discussions
The result in the previous section is joint work with Dr. Coad. It gives a new derivation of the Kalman gain. It will be combined with further findings and submitted later.
References
[1] R. G. Brown and P. Y. C. Hwang. Introduction to Random Signals and Applied Kalman Filtering. John Wiley & Sons, Inc., New York, 3rd edition, 1997.
[2] R. E. Kalman. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1):35–45, 1960.
[3] R. C. Weng. Expansions for multivariate densities. Journal of Statistical Planning and Inference, 167:174–181, 2015.
Report on attending the Joint Statistical Meetings, August 8-13, 2015, Seattle, Washington.
The Joint Statistical Meetings (JSM) is the largest statistical meeting held in North America. JSM 2015 was held August 8-13 at the Washington State Convention Center. It was jointly held by organizations including the American Statistical Association, the Institute of Mathematical Statistics, the International Chinese Statistical Association, the International Society for Bayesian Analysis, and the Royal Statistical Society. The conference brings together short courses, keynote lectures, scientific sessions, poster sessions, expositions, and social events, and gives participants opportunities to engage, network, and find inspiration for new ideas. This year it attracted over 6,000 participants.
This year I was invited by Professor X. Wang of the Department of Statistics at the University of Connecticut to speak in a topic-contributed session sponsored by Bayesian Statistical Science. The title of my talk was "Real-time Bayesian inference for latent ability models." It concerns Bayesian online inference for models such as paired-comparison models and item response theory models. I had a good opportunity to present my work and to discuss my research with many people. I also attended two professional short courses. One was "Applied text mining", a hands-on workshop with R code and packages for the practical application of text mining to real-world data, including survey comments and websites. The other was "Software Engineering for Statisticians", which covered the basics of computer architecture, revision control tools, code readability, and related topics. These materials are taught in a computer science curriculum but are seldom part of a statistics degree; however, they have become increasingly important tools for statisticians.
During these days, I met many old and new friends from industry and academia. Talking with them inspired me and encouraged me to keep moving forward. I brought back course materials from the two workshops. It was a truly fruitful trip.