Executive Yuan National Science Council Research Project Final Report
Approximate Computations for Posterior Distributions (2/2)
Project Type: Individual Project.  Project Number: NSC93-2118-M-004-001-.  Period: August 1, 2004 to July 31, 2005.  Institution: Department of Statistics, National Chengchi University.  Principal Investigator: Ruby C. Weng (翁久幸).  Report Type: Complete Report.  Attachments: report on attendance at an international conference and published papers.  Availability: this project report is publicly accessible.  October 18, 2005.
Approximate Computations for Posterior Distributions
NSC93-2118-M-004-001
August 1, 2003 to July 31, 2005
Ruby C. Weng
Department of Statistics, National Chengchi University
October 18, 2005
Abstract
This project describes a method for approximating posterior expectations of functions of the parameter. First the posterior density of a data-dependent transformation $Z_t$ of the parameter is expressed in a form close to a normal density. Next, a version of Stein's Identity is applied to the posterior distribution to obtain posterior moments of $Z_t$. Then the results are converted to derive second-order approximations to posterior expectations of functions (not necessarily positive) of the parameter.
Key words: maximum likelihood estimator; posterior distributions; Stein's identity.
1 Introduction
Let $g(\theta)$ be a smooth function on the parameter space $\Theta$. The estimation of the posterior mean of $g(\theta)$, given a sample of observations $x^{(t)}$, requires integration over $\Theta$ of the form
$$E_t[g(\theta)] = E[g(\theta) \mid x^{(t)}] = \frac{\int_\Theta g(\theta)\, e^{\ell_t(\theta)}\pi(\theta)\, d\theta}{\int_\Theta e^{\ell_t(\theta)}\pi(\theta)\, d\theta}, \qquad (1)$$
where $\ell_t$ is the log-likelihood function. If the likelihood function has a dominant mode, the Laplace method is suitable for approximating these integrals. Many authors have applied the Laplace method to find approximations to the ratios of integrals in (1). For example, Lindley [2] derived a second-order approximation for the integral.
Tierney and Kadane [5] applied the Laplace method in a special form in which $g$ is assumed to be positive, the integrand of the numerator in (1) is expressed as $\exp[\ell_t(\theta) + \log g(\theta) + \log \pi(\theta)]$ (called the fully exponential Laplace approximation), and the expansion is carried out at the mode of this integrand itself, rather than at the posterior mode. For a general function $g$ (possibly non-positive), Tierney, Kass, and Kadane [6] obtained a second-order expansion of the posterior expectation by applying the fully exponential method to approximate the moment generating function $E_t[\exp(s\,g(\theta))]$ and then differentiating (called the MGF method).
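As a concrete illustration of the fully exponential form described above, the sketch below applies it to a hypothetical one-dimensional positive case, namely $g(\theta) = \theta$ for the linkage likelihood of Section 4.1 with a uniform prior; the optimizer and the finite-difference second derivative are our own choices, and the result is only indicative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

ell = lambda t: 125*np.log(2+t) + 38*np.log(1-t) + 34*np.log(t)   # log-likelihood (up to a constant)

def laplace_integral(q):
    """Laplace approximation to int_0^1 exp(q(t)) dt: sqrt(2*pi/(-q''(t*))) * exp(q(t*))."""
    t_star = minimize_scalar(lambda t: -q(t), bounds=(1e-6, 1 - 1e-6), method="bounded").x
    eps = 1e-5
    q2 = (q(t_star + eps) - 2*q(t_star) + q(t_star - eps)) / eps**2   # numerical q''(t*)
    return np.sqrt(2*np.pi / (-q2)) * np.exp(q(t_star))

# Fully exponential form: numerator integrand exp[ell(t) + log g(t) + log pi(t)] with g(t) = t,
# denominator integrand exp[ell(t) + log pi(t)]; pi is uniform here, so log pi drops out.
num = laplace_integral(lambda t: ell(t) + np.log(t))
den = laplace_integral(lambda t: ell(t))
print(num / den)   # approximately 0.62, cf. the exact posterior mean 0.6228 in Table 1
```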
In this project we present a method, based on a version of Stein's Identity, for the problem of estimating the posterior mean of a smooth function of the parameter. First the posterior density of the data-dependent transformation $Z_t$ in (2) of the parameter is converted into a form close to a normal density. Next, a version of Stein's Identity is applied to the posterior distribution to obtain posterior moments of $Z_t$. Then the results are converted to derive second-order approximations to posterior expectations of functions (not necessarily positive) of the parameter.
2 The Model and Stein's Identity
Let $X_t$ be a random vector distributed according to a family of probability densities $p_t(x_t \mid \theta)$, where $t$ is a discrete or continuous parameter and $\theta \in \Theta$, an open subset of $\Re^p$. Assume that the log-likelihood function, denoted by $\ell_t(\theta)$, is twice continuously differentiable with respect to $\theta$. Throughout let $\hat{\theta}_t$ be a root of the likelihood equation satisfying $\nabla\ell_t(\hat{\theta}_t) = 0$, where $\nabla$ indicates differentiation with respect to $\theta$. Whenever such a root exists and $-\nabla^2\ell_t(\hat{\theta}_t)$ is positive definite, we define $\Sigma_t$ and the data-dependent transformation $Z_t$ as
$$\Sigma_t'\,\Sigma_t = -\nabla^2\ell_t(\hat{\theta}_t), \qquad Z_t = \Sigma_t(\theta - \hat{\theta}_t); \qquad (2)$$
otherwise, define $\Sigma_t$ and $Z_t$ arbitrarily (in a measurable way).
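As an illustration of (2), the sketch below computes $\hat{\theta}_t$, one square root $\Sigma_t$ of the observed information, and the map $\theta \mapsto Z_t$ for a hypothetical two-parameter model (normal data with unknown mean and log standard deviation); the model, the numerical optimizer, and the finite-difference Hessian are our own choices and not part of the report.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical model: x_1,...,x_n ~ N(mu, exp(2*s)), with theta = (mu, s).
rng = np.random.default_rng(0)
x = rng.normal(1.0, 2.0, size=200)

def loglik(theta):
    mu, s = theta
    return np.sum(-s - 0.5 * ((x - mu) / np.exp(s)) ** 2)

# theta_hat: a root of the likelihood equation, found numerically here.
theta_hat = minimize(lambda th: -loglik(th), x0=np.zeros(2)).x

def hessian(f, th, eps=1e-5):
    """Numerical Hessian of f at th by central differences."""
    p = len(th)
    H = np.zeros((p, p))
    I = np.eye(p) * eps
    for i in range(p):
        for j in range(p):
            H[i, j] = (f(th + I[i] + I[j]) - f(th + I[i] - I[j])
                       - f(th - I[i] + I[j]) + f(th - I[i] - I[j])) / (4 * eps**2)
    return H

info = -hessian(loglik, theta_hat)        # -grad^2 ell_t(theta_hat), positive definite here
Sigma_t = np.linalg.cholesky(info).T      # one choice satisfying Sigma_t' Sigma_t = info

def Z_t(theta):
    """The data-dependent transformation in (2)."""
    return Sigma_t @ (np.asarray(theta) - theta_hat)

print(theta_hat, Z_t(theta_hat))          # Z_t vanishes at theta_hat
```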
Consider a Bayesian model in which $\theta$ has a prior density $\pi$. Then the posterior density of $\theta$ given data $x_t$ is $\pi_t(\theta) \propto e^{\ell_t(\theta)}\pi(\theta)$, and the posterior density of $Z_t$ is
$$\bar{\pi}_t(z) \propto e^{\ell_t(\theta)}\pi(\theta), \qquad (3)$$
where the relation of $\theta$ and $z$ is given in (2). Now a Taylor expansion gives
$$\ell_t(\theta) = \ell_t(\hat{\theta}_t) + \tfrac{1}{2}(\theta - \hat{\theta}_t)'\,\nabla^2\ell_t(\theta_t^*)\,(\theta - \hat{\theta}_t),$$
where $\theta_t^*$ lies between $\theta$ and $\hat{\theta}_t$. Let
$$u_t(\theta) = \tfrac{1}{2}(\theta - \hat{\theta}_t)'\,[\nabla^2\ell_t(\theta_t^*) - \nabla^2\ell_t(\hat{\theta}_t)]\,(\theta - \hat{\theta}_t);$$
it follows that
$$\ell_t(\theta) = \ell_t(\hat{\theta}_t) - \tfrac{1}{2}\|z_t\|^2 + u_t(\theta) \qquad (4)$$
and (3) can be rewritten as
$$\bar{\pi}_t(z) \propto \phi_p(z)\, f_t(z), \qquad (5)$$
where $f_t(z) = \pi(\theta(z))\exp[u_t(\theta(z))]$ and $\phi_p(z)$ denotes the standard $p$-variate normal density.
Throughout, $\nabla\pi$ and $\nabla^2\pi$ denote the gradient and Hessian of $\pi$ with respect to $\theta$; $\nabla f$ and $\nabla^2 f$ the gradient and Hessian of $f$ with respect to $z$; and $E_t$ and $V_t$ the conditional expectation and variance given data $x_t$.
Stein's Identity. Let $\Phi_p$ denote the standard $p$-variate normal distribution and write
$$\Phi_p h = \int h\, d\Phi_p$$
for functions $h$ for which the integral is finite. Next let $\mu$ denote a finite signed measure of the form $d\mu = f\, d\Phi_p$, where $f$ is a real-valued function defined on $\Re^p$ satisfying $\Phi_p|f| = \int |f|\, d\Phi_p < \infty$. For $k > 0$, let $H_k$ denote the collection of all measurable functions $h: \Re^p \to \Re$ for which $|h(z)|/b \le 1 + \|z\|^k$ for some $b > 0$, and let $H = \cup_{k \ge 0} H_k$. Given $h \in H_k$, let $h_0 = \Phi_p h$, $h_p = h$,
$$h_j(y_1, \ldots, y_j) = \int_{\Re^{p-j}} h(y_1, \ldots, y_j, w)\, \Phi_{p-j}(dw), \qquad (6)$$
and
$$g_j(y_1, \ldots, y_p) = e^{\frac{1}{2}y_j^2} \int_{y_j}^{\infty} \big[h_j(y_1, \ldots, y_{j-1}, w) - h_{j-1}(y_1, \ldots, y_{j-1})\big]\, e^{-\frac{1}{2}w^2}\, dw, \qquad (7)$$
for $-\infty < y_1, \ldots, y_p < \infty$ and $j = 1, \ldots, p$. Then let $Uh = (g_1, \ldots, g_p)^T$. Note that $U$ may be iterated. Let $Vh = (U^2h + (U^2h)')/2$, where $U^2h$ is the $p \times p$ matrix whose $j$-th column is $Ug_j$ and $g_j$ is as in (7). Then $Vh$ is a symmetric matrix. For example, for $z \in \Re^p$, if $h(z) = z_i$, then $Uh(z) = e_i$, the $i$-th standard unit vector.
Lemma 2.1 (Stein's Identity) Let $r$ be a nonnegative integer. Suppose that $d\mu = f\, d\Phi_p$ as above, where $f$ is a differentiable function on $\Re^p$ for which
$$\int_{\Re^p} |f|\, d\Phi_p + \int_{\Re^p} (1 + \|z\|^r)\, \|\nabla f(z)\|\, \Phi_p(dz) < \infty;$$
then
$$\mu h = \mu 1\, \Phi_p h + \int_{\Re^p} (Uh(z))^T\, \nabla f(z)\, \Phi_p(dz)$$
for all $h \in H_r$. If $\partial f/\partial z_j$, $j = 1, \ldots, p$, are differentiable, and
$$\int_{\Re^p} (1 + \|z\|^r)\, \|\nabla^2 f(z)\|\, \Phi_p(dz) < \infty,$$
then
$$\mu h = \mu 1\, \Phi_p h + (\Phi_p Uh)^T \int_{\Re^p} \nabla f(z)\, \Phi_p(dz) + \int_{\Re^p} \mathrm{tr}\big[(Vh(z))\, \nabla^2 f(z)\big]\, \Phi_p(dz)$$
for all $h \in H_r$.
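To make the operator $U$ in (6)-(7) and the first identity of Lemma 2.1 concrete, the following sketch implements $Uh = g_1$ numerically for $p = 1$ and checks the identity for the illustrative (assumed) choices $h(z) = z^2$ and $f(z) = e^{0.3z}$; both sides should agree up to quadrature error.

```python
import numpy as np
from scipy import integrate

phi = lambda z: np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)          # standard normal density
# Expectation under Phi_1; the N(0,1) weight makes the region outside [-8, 8]
# negligible and keeps the computation of Uh below numerically stable.
E_phi = lambda g: integrate.quad(lambda z: g(z) * phi(z), -8, 8)[0]

def U(h):
    """For p = 1, (6)-(7) reduce to
       Uh(z) = exp(z^2/2) * int_z^inf [h(w) - Phi_1 h] exp(-w^2/2) dw."""
    Phi_h = E_phi(h)
    def g1(z):
        tail = integrate.quad(lambda w: (h(w) - Phi_h) * np.exp(-w**2 / 2), z, np.inf)[0]
        return np.exp(z**2 / 2) * tail
    return g1

# Illustrative (assumed) choices of h and f.
h  = lambda z: z**2                 # for this h, Uh(z) = z (cf. Lemma 3.2 with i = j)
f  = lambda z: np.exp(0.3 * z)
df = lambda z: 0.3 * np.exp(0.3 * z)

Uh  = U(h)
lhs = E_phi(lambda z: h(z) * f(z))                               # mu h
rhs = E_phi(f) * E_phi(h) + E_phi(lambda z: Uh(z) * df(z))       # mu 1 * Phi_1 h + correction
print(lhs, rhs)                     # the two sides of the first identity (about 1.14)
```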
Observe from (5) that the posterior distribution of $Z_t$ is of a form suitable for Stein's Identity. Let $B_t$ denote the event $\{\nabla\ell_t(\hat{\theta}_t) = 0,\ -\nabla^2\ell_t(\hat{\theta}_t) \text{ is positive definite}\}$. Suppose that $\pi$ has a compact support $\Theta_1 \subset \Theta$ and $\nabla\pi$ is continuous. Then $\|\nabla\pi\|$ is bounded on $\Theta_1$ and we can verify that
$$\int_{\Re^p} |f_t|\, d\Phi_p + \int_{\Re^p} (1 + \|z\|^r)\, \|\nabla f_t(z)\|\, \Phi_p(dz) < \infty.$$
Hence, by Lemma 2.1,
$$E_t\{h(Z_t)\} = \Phi_p h + E_t\Big\{[Uh(Z_t)]^T\, \frac{\nabla f_t(Z_t)}{f_t(Z_t)}\Big\}, \qquad (8)$$
a.e. on $B_t$, for all $h \in H$. If also $\nabla^2\pi$ is continuous, then similar arguments lead to
$$E_t\{h(Z_t)\} = \Phi_p h + (\Phi_p Uh)^T\, E_t\Big[\frac{\nabla f_t(Z_t)}{f_t(Z_t)}\Big] + E_t\Big\{\mathrm{tr}\Big[Vh(Z_t)\, \frac{\nabla^2 f_t(Z_t)}{f_t(Z_t)}\Big]\Big\}, \qquad (9)$$
a.e. on $B_t$, for all $h \in H$.
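To see (8) in action, the sketch below specializes it to a one-dimensional posterior, reusing the linkage model of Section 4.1 with a uniform prior (so the prior factor in $f_t$ is constant); the choice $h(z) = z^2$, for which $Uh(z) = z$ by Lemma 3.2, is ours, and the check is purely numerical rather than part of the report.

```python
import numpy as np
from scipy import integrate
from scipy.optimize import brentq

ell   = lambda t: 125*np.log(2+t) + 38*np.log(1-t) + 34*np.log(t)     # log-likelihood, up to a constant
dell  = lambda t: 125/(2+t) - 38/(1-t) + 34/t
d2ell = lambda t: -125/(2+t)**2 - 38/(1-t)**2 - 34/t**2

theta_hat = brentq(dell, 1e-6, 1 - 1e-6)          # root of the likelihood equation
sigma = np.sqrt(-d2ell(theta_hat))                # here Sigma_t is the scalar sigma
theta_of = lambda z: theta_hat + z / sigma        # inverse of the transformation (2)

phi = lambda z: np.exp(-z**2 / 2) / np.sqrt(2*np.pi)
f_t = lambda z: np.exp(ell(theta_of(z)) - ell(theta_hat) + z**2 / 2)   # f_t(z) = exp(u_t(theta(z)))

lo = sigma * (1e-9 - theta_hat)                   # z-range on which theta(z) lies in (0, 1)
hi = sigma * (1 - 1e-9 - theta_hat)
post = lambda g: integrate.quad(lambda z: g(z) * phi(z) * f_t(z), lo, hi)[0]

norm = post(lambda z: 1.0)
lhs  = post(lambda z: z**2) / norm                                     # E_t[h(Z_t)]
# grad f_t(z) / f_t(z) = ell'(theta(z)) / sigma + z
rhs  = 1.0 + post(lambda z: z * (dell(theta_of(z)) / sigma + z)) / norm
print(lhs, rhs)                                   # the two sides of (8) should agree
```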
3 Main Results
In this section we present approximations to the posterior moments of $Z_t$ and use them to derive second-order approximations to posterior expectations of smooth functions of the parameter.
Lemma 3.2 If $h(z) = z_i z_j$, $1 \le i \le j \le p$, then
(i) $g_j(z) = z_i$ and $g_k(z) = 0$ for $k \ne j$;
(ii) $\Phi_p Uh = (0, \ldots, 0)^T$;
(iii) $\mathrm{tr}\big[Vh(z)\, \frac{\nabla^2 f_t(z)}{f_t(z)}\big] = \big[\frac{\nabla^2 f_t(z)}{f_t(z)}\big]_{ij}$.

Moreover, writing $\hat{\pi}$ for $\pi(\hat{\theta}_t)$,
$$E_t\Big(\frac{\nabla\pi}{\pi}\Big) = \frac{\nabla\hat{\pi}}{\hat{\pi}} + O(t^{-1}) \quad \text{and} \quad E_t\Big(\frac{\nabla^2\pi}{\pi}\Big) = \frac{\nabla^2\hat{\pi}}{\hat{\pi}} + O(t^{-1}). \qquad (10)$$

Theorem 3.1
(i) $E_t Z_t = (\Sigma_t')^{-1}\big[\frac{\nabla\hat{\pi}}{\hat{\pi}} + \frac{1}{2}U\big] + O(t^{-3/2})$;
(ii) $V_t Z_t = I_p + (\Sigma_t')^{-1}\big[\frac{\nabla^2\hat{\pi}}{\hat{\pi}} + \frac{\nabla\hat{\pi}\,(\nabla\hat{\pi})'}{\hat{\pi}^2} + W\big]\Sigma_t^{-1} + O(t^{-2})$,
where $U$ is a vector and $W$ a matrix involving higher-order derivatives of $\ell_t$.
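Since $Z_t = \Sigma_t(\theta - \hat{\theta}_t)$ by (2), the moments in Theorem 3.1 translate directly into approximations for the posterior mean and variance of $\theta$ itself; a short derivation:
$$\theta = \hat{\theta}_t + \Sigma_t^{-1} Z_t \quad\Longrightarrow\quad E_t\,\theta = \hat{\theta}_t + \Sigma_t^{-1}\, E_t Z_t, \qquad V_t\,\theta = \Sigma_t^{-1}\,(V_t Z_t)\,(\Sigma_t^{-1})'.$$
In the same spirit, a smooth $g(\theta)$ can be expanded about $\hat{\theta}_t$ so that $E_t[g(\theta)]$ is expressed through these moments of $Z_t$.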
4 Applications
4.1 Linkage example
Here we consider an example presented in Rao [3] and reexamined by Tanner and Wong [4] and references therein. From a genetic linkage model, it is believed that 197 animals are distributed multinomially into four categories, $y = (y_1, y_2, y_3, y_4) = (125, 18, 20, 34)$, with cell probabilities specified by $\big(\tfrac{1}{2} + \tfrac{\theta}{4},\ \tfrac{1-\theta}{4},\ \tfrac{1-\theta}{4},\ \tfrac{\theta}{4}\big)$.
Tanner and Wong [4] also consider a second version of the data in which the sample size is reduced by a factor of 10, $y = (12.5, 1.8, 2.0, 3.4)$. As suggested in their paper, here we choose the uniform prior for $\theta \in (0, 1)$ and assess the performance of our method using both the large-sample and small-sample data. Table 1 reports the exact posterior means and variances of $\theta$ (computed in MATLAB), and the approximations obtained with our approach.
Table 1: Linkage example

                   Large sample                          Small sample
  Method     posterior mean   posterior variance   posterior mean   posterior variance
  Exact          0.6228            0.0026              0.5704            0.0225
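The "Exact" entries in Table 1 can be reproduced by one-dimensional numerical integration of (1); the report states they were computed in MATLAB, and the sketch below is an assumed Python equivalent (the small-sample counts are simply the original counts divided by 10, as described above).

```python
import numpy as np
from scipy import integrate

def posterior_moments(y):
    """Exact posterior mean and variance of theta under a uniform prior on (0, 1),
    for multinomial counts y with cell probabilities
    (1/2 + theta/4, (1-theta)/4, (1-theta)/4, theta/4)."""
    y1, y2, y3, y4 = y
    post = lambda t: (2 + t)**y1 * (1 - t)**(y2 + y3) * t**y4   # unnormalized posterior
    norm = integrate.quad(post, 0, 1)[0]
    mean = integrate.quad(lambda t: t * post(t), 0, 1)[0] / norm
    m2   = integrate.quad(lambda t: t**2 * post(t), 0, 1)[0] / norm
    return mean, m2 - mean**2

print(posterior_moments((125, 18, 20, 34)))      # close to the Exact row: 0.6228, 0.0026
print(posterior_moments((12.5, 1.8, 2.0, 3.4)))  # small-sample version (counts / 10)
```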
5 Conclusions
In conclusion, we use Stein's Identity to approximate posterior moments of a suitably normalized quantity. These moments are useful in evaluating posterior means and variances of $g(\theta)$. Unlike the Laplace method (for positive functions), our approach requires third derivatives of $\ell_t$; on the other hand, it needs only the posterior mode for all $g$. Some of the formulas presented here are new, while others agree with results obtained by earlier approaches such as Johnson [1] and Tierney, Kass, and Kadane [6].
References
[1] R. Johnson. Asymptotic expansions associated with posterior distributions. Annals of Mathematical Statistics, 41:851-864, 1970.
[2] D. V. Lindley. The use of prior probability distributions in statistical inference and decisions. Proceedings of the 4th Berkeley Symposium, 1:453-468, 1961.
[3] C. R. Rao. Linear Statistical Inference and its Applications. John Wiley, New York, second edition, 2001.
[4] M. A. Tanner and W. H. Wong. The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82:528-540, 1987.
[5] L. Tierney and J. B. Kadane. Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81:82-86, 1986.
[6] L. Tierney, R. E. Kass, and J. B. Kadane. Fully exponential Laplace approximations to expectations and variances of nonpositive functions. Journal of the American Statistical Association, 84:710-716, 1989.