National Chengchi University (國立政治大學)
where 1 is a unit vector of length n, and R is a predefined upper bound for the prior radius, which depends on the distance scale of the study region.
The priors of the hyper-parameters are
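\[
\rho \sim \mathrm{Unif}(\gamma_l, \gamma_u), \qquad \sigma_\eta^2 \sim \Gamma(10^{-3}, 10^{-3}), \qquad \sigma_\epsilon^2 \sim \Gamma(10^{-3}, 10^{-3}),
\]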
where γ_l and γ_u are the inverses of the minimum and maximum eigenvalues of M^{-1/2} W M^{-1/2} (Banerjee et al., 2004, Chapter 5), and M is the diagonal variance matrix (here the identity matrix). Restricting ρ to this interval keeps the covariance matrix of the CAR prior nonsingular.
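The bounds of the uniform prior on ρ can be computed directly from the neighbourhood matrix. The following is a minimal sketch, assuming a symmetric 0/1 adjacency matrix W and M equal to the identity by default; the function name `car_rho_bounds` and the three-cell toy matrix are illustrative assumptions, not part of the WinBUGS code in Appendix E.

```python
import numpy as np

def car_rho_bounds(W, m_diag=None):
    """Return (gamma_l, gamma_u): the inverses of the minimum and maximum
    eigenvalues of M^{-1/2} W M^{-1/2}. M defaults to the identity."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    m = np.ones(n) if m_diag is None else np.asarray(m_diag, dtype=float)
    scale = 1.0 / np.sqrt(m)
    S = scale[:, None] * W * scale[None, :]   # M^{-1/2} W M^{-1/2}
    eig = np.linalg.eigvalsh(S)               # ascending order; S is symmetric
    return 1.0 / eig[0], 1.0 / eig[-1]

# Toy example (assumption): three cells in a row, the middle cell
# neighbours both ends.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
gamma_l, gamma_u = car_rho_bounds(W)

# For any rho strictly inside (gamma_l, gamma_u), I - rho * W is positive
# definite, which is what makes the CAR covariance matrix nonsingular.
min_eig_inside = np.linalg.eigvalsh(np.eye(3) - 0.5 * W).min()
```

For a real study region, W would be the adjacency matrix of the grid cells, and the resulting pair would be passed as the limits of the uniform prior.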
The Bayesian estimation is carried out in WinBUGS (Spiegelhalter et al., 2003) together with its spatial add-on GeoBUGS (Thomas et al., 2004). The details of the modeling code are shown in Appendix E. The posterior samples are generated by MCMC methods. Based on these samples, we can draw the disease surface of relative risks.
4.3 Inference of clustered effects
The main obstacle to identifying clusters with Bayesian models is that the locations of the clusters cannot be clearly determined from the posterior samples.
In our model, the predefined number of clusters is taken from the SaTScan result, because SaTScan is a sensitive method for detecting clusters. It should be noted that the posterior samples are not dramatically affected by the prior on the number of clusters (Gangnon & Clayton, 2000). If a wrong cluster is chosen, the posterior samples of its clustered effect will either be small or cancel each other out. In this way, the clustered effect of cell i in equation (4.5) is defined as
Ψi =
in which (ucj, vcj) is the coordinate of the jth cluster center. In addition, the posterior samples of the clusters may well overlap. Thus, convergence of the posterior samples of the individual clusters is not always attained. However, Ψi does converge in this setting. For this reason, Ψi is used to infer the clustered effect of each cell.
The exceedance probability (Richardson et al., 2004) is the most commonly used quantity for assessing clustered effects (or relative risks) in disease mapping studies. To determine the possible clusters in our case, we map the posterior mean of Ψi and the estimated exceedance probability that Ψi is greater than 0. Both are computed from the posterior samples of each cell, which account for the correlated heterogeneity in the study region. The exceedance probability is compared with a threshold to check whether any cell has an unusually high risk. Mapping these quantities shows the spatial distribution of the unusual cells. Here, we classify a cell as a clustered cell when its exceedance probability is larger than 90%.
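The classification rule above can be sketched as follows. This is an illustration only: it assumes the posterior draws of Ψi have been exported as an array with one row per MCMC iteration and one column per cell, and the function name `exceedance_summary` and the toy draws are hypothetical.

```python
import numpy as np

def exceedance_summary(psi_samples, threshold=0.0, cutoff=0.9):
    """Posterior mean of Psi_i, estimated P(Psi_i > threshold), and the
    cells flagged as clustered (exceedance probability above cutoff)."""
    psi_samples = np.asarray(psi_samples, dtype=float)
    post_mean = psi_samples.mean(axis=0)
    # Share of posterior draws above the threshold, cell by cell.
    exceed_prob = (psi_samples > threshold).mean(axis=0)
    clustered = exceed_prob > cutoff
    return post_mean, exceed_prob, clustered

# Hypothetical draws: 4 iterations (rows) for 3 cells (columns).
psi = np.array([[0.5, -0.1, 0.0],
                [0.7,  0.2, 0.0],
                [0.6, -0.3, 0.0],
                [0.4,  0.1, 0.0]])
post_mean, exceed_prob, clustered = exceedance_summary(psi)
```

In this toy input only the first cell exceeds 0 in more than 90% of the draws, so it is the only cell flagged as clustered.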
Chapter 5
Simulations and Method Comparisons
Although the Bayesian clustered model was introduced in the previous chapter, we do not discuss its performance in this chapter. Because the Bayesian method requires a lot of computing time, it is not feasible for simulation. However, all the methods are compared in the applications of the next chapter.
5.1 Results of EM estimates
Before the cluster detection results are discussed, it is necessary to check the EM estimates of the Gaussian CAR model and the auto-Poisson model. Recall that the simulated data sets are generated by equation (2.2), but for the Gaussian model the data must be transformed by the Freeman–Tukey approach before the parameters are estimated. It is therefore possible that the transformation introduces bias into the model. We take the transformed data with the clustered effects as the baseline to see whether the EM estimates work.
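As a reminder of what the transformation does, here is a minimal sketch of the standard Freeman–Tukey square-root transform for a count y, namely sqrt(y) + sqrt(y + 1). This section does not restate the exact variant used for the simulations (for example, whether the counts are first scaled by the population at risk), so the plain count form below is an assumption.

```python
import math

def freeman_tukey(count):
    """Standard Freeman-Tukey square-root transform of a single count.

    Assumption: the unscaled form sqrt(y) + sqrt(y + 1); a rate-based
    variant would divide by the population at risk inside the roots.
    """
    if count < 0:
        raise ValueError("counts must be non-negative")
    return math.sqrt(count) + math.sqrt(count + 1)

# Hypothetical cell counts.
counts = [0, 1, 3, 8]
transformed = [freeman_tukey(y) for y in counts]
```

The transform stabilises the variance of Poisson counts, which is why the Gaussian CAR model is fitted to the transformed values rather than to the raw counts.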
For the auto-Poisson model, on the other hand, no transformation is needed. One cannot expect the estimates to be close to the preset parameter values, because the generated data depart widely from the auto-Poisson model. However, if the cluster detection results and the fitted values are the primary concern, the auto-Poisson model still provides a workable solution.
Table 5.1 shows the MLEs and EM estimates of the spatial autocorrelation ρ and σ2 for the Gaussian CAR model. The MLEs are obtained by assuming that the locations of the local clusters are known and can be explained by a clustered effect. The EM estimates, in contrast, are computed by treating the clusters detected by SaTScan as missing values.
Table 5.1. Comparisons of MLE and EM estimates of ρ for the Gaussian CAR model
Notes: For simplicity, the variances of the estimates are not shown in the table. The variances are around 0.11 for all the estimates.
In Table 5.1, the EM estimates are close to the MLEs obtained with the real clustered effect, except when RR is lower and ρ = 0.8. Because the true simulation model (equation (2.2)) is a Poisson–Gaussian model, it does not exactly match the model assumed in our approach, and we found that the estimates are lower than the original settings.
As in the Gaussian case, the estimates for the auto-Poisson model are computed and compared with the null and the real models in Tables 5.2 and 5.3. ‘Null’ denotes the model that ignores the local clusters, and ‘Real’ denotes the model with known locations of the local clusters. All the estimates are based on the auto-Poisson pseudo-likelihood (equation (3.21)). The EM method estimates the model by treating the local clusters as missing values.
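To make the pseudo-likelihood idea concrete, the sketch below evaluates a generic auto-Poisson log pseudo-likelihood, that is, the sum over cells of the conditional Poisson log-densities given the neighbouring counts. It is not a transcription of equation (3.21): the conditional mean n_i θ0 exp(φ Σ_{j∼i} y_j), the function name, and the toy data are assumptions made for illustration.

```python
import math

def auto_poisson_log_pl(y, pop, neighbours, theta0, phi):
    """Log pseudo-likelihood: sum of conditional Poisson log-densities.

    Assumed conditional mean for cell i (hypothetical parametrisation):
        lambda_i = pop[i] * theta0 * exp(phi * sum of neighbouring counts)
    """
    total = 0.0
    for i, y_i in enumerate(y):
        s_i = sum(y[j] for j in neighbours[i])       # neighbouring counts
        lam = pop[i] * theta0 * math.exp(phi * s_i)  # conditional mean
        total += y_i * math.log(lam) - lam - math.lgamma(y_i + 1)
    return total

# Toy data (assumption): three cells in a row, 1000 people per cell.
y = [1, 0, 2]
pop = [1000, 1000, 1000]
neighbours = {0: [1], 1: [0, 2], 2: [1]}

# With phi = 0 the cells are independent Poisson with mean 1.
log_pl_indep = auto_poisson_log_pl(y, pop, neighbours, theta0=0.001, phi=0.0)
log_pl_dep = auto_poisson_log_pl(y, pop, neighbours, theta0=0.001, phi=0.04)
```

Maximising such a function over θ0 and φ gives pseudo-likelihood estimates of the kind reported in Tables 5.2 and 5.3; in the EM setting the counts in the detected clusters would be treated as missing values rather than observed.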
Table 5.2. The estimations of overall intensity rate θ0 in the auto-Poisson model

Setting                 ρ     Null       Real       EM
RR = 1 (θc = 0)         0     0.001071   0.001071   0.001072
                        0.2   0.000918   0.000918   0.000932
                        0.5   0.000701   0.000701   0.000722
                        0.8   0.000560   0.000560   0.000566
RR = 1.49 (θc = 0.4)    0     0.001026   0.001122   0.001084
                        0.2   0.000863   0.000962   0.000905
                        0.5   0.000685   0.000767   0.000722
                        0.8   0.000554   0.000570   0.000568
RR = 2.01 (θc = 0.7)    0     0.000822   0.001106   0.001085
                        0.2   0.000736   0.000984   0.000911
                        0.5   0.000635   0.000786   0.000730
                        0.8   0.000528   0.000586   0.000544
* All the standard deviations are similar, around 0.0001.
Unlike the results of the Gaussian model, these estimates tell a different story. At first glance, the estimated values of the dependence parameter (Table 5.3) in the auto-Poisson model differ strongly from the simulated ρ. Because the spatial dependence parameter has a completely different representation in this model, the values cannot be expected to match. However, the impact of the local clusters is substantial. As the RR gets higher, both the overall mean and the global dependence become larger. In addition, the overall intensity rate becomes smaller as the global dependence becomes higher. On the other hand, unlike the Gaussian CAR model (Table 5.1), the auto-Poisson estimates do not remain consistent as the RR changes. For example, the estimated dependence parameter for the ‘Real’ model is 0.0396 when RR = 1 and ρ = 0.5, but it drops to 0.0290 when RR = 2.01 and ρ = 0.5 (Table 5.3).
Table 5.3. The estimations of spatial dependence φ in the auto-Poisson model

Setting      ρ     Null        Real        EM
RR = 1       0     -0.000184   -0.000184   -0.000357
             0.2    0.014299    0.014299    0.012330
             0.5    0.039622    0.039622    0.036892
             0.8    0.060444    0.060444    0.058834
RR = 1.49    0      0.005321   -0.004537   -0.000636
             0.2    0.020734    0.009674    0.015762
             0.5    0.042471    0.031885    0.037335
             0.8    0.061714    0.058868    0.059028
RR = 2.01    0      0.026178   -0.003487   -0.001532
             0.2    0.036075    0.007416    0.014898
             0.5    0.049572    0.028998    0.036122
             0.8    0.065239    0.055513    0.062479
* All the standard deviations are around 0.01 to 0.015.