2.4. SIMULATION RESULTS OF THE SATSCAN 21
Table 2.1. Rejection probability of the circular SaTScan
                                  Frequencies of Clusters
ρ      Rejection Probability      0       1       ≥2
0      0.02                       98      2       0
0.2    0.13                       87      13      0
0.5    0.27                       73      25      2
0.8    0.61                       39      55      6
Notes: The significance level is 0.05. The frequencies of clusters give the number of runs, out of 100 simulation runs, in which the SaTScan detected 0, 1, or at least 2 clusters.
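As a quick consistency check on Table 2.1, the rejection probability in each row appears to equal the share of the 100 runs in which at least one cluster was detected. The sketch below recomputes it from the frequency columns under that reading; the variable and function names are illustrative and do not come from the original study.

```python
# Rows of Table 2.1: rho -> (runs with 0, 1, and >=2 detected clusters).
cluster_frequencies = {
    0.0: (98, 2, 0),
    0.2: (87, 13, 0),
    0.5: (73, 25, 2),
    0.8: (39, 55, 6),
}


def rejection_probability(freqs):
    """Share of runs in which at least one cluster was detected.

    Assumption: a run counts as a rejection whenever one or more
    clusters are reported, which is how the table reads.
    """
    none, one, two_or_more = freqs
    total = none + one + two_or_more
    return (one + two_or_more) / total


for rho, freqs in cluster_frequencies.items():
    prob = rejection_probability(freqs)
    print(f"rho = {rho}: rejection probability = {prob:.2f}")
```

Under this reading, the recomputed values agree with the "Rejection Probability" column.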
Figure 2.1 shows how often each cell was identified as a clustered cell, with darker areas indicating higher frequencies, and it helps explain why the SaTScan can detect false clusters. The falsely detected cells spread more widely over the study region when the autocorrelation is higher. Also, taking the case of ρ = 0.8 as an example, the darker cells gather around the central area. The false detections therefore do not occur at random locations, and they are likely to be caused by the global autocorrelation.
We also check the performance of SaTScan’s cluster detection when both autocorrelation and a cluster are present. Assume that there is a cluster of size 3 × 3 located at the center of the region. We let θc be the mean of the clustered cells in equation (2.2), taking the values 0.4, 0.7, and 1. That is, the clustered cells have a higher relative risk (RR), with mean equal to 1.49, 2.01, and 2.72, respectively. For a clearer presentation, some well-defined measures other than the testing power are also used to evaluate the accuracy of detection
Figure 2.1. Images of cluster detection results of the SaTScan in the case of autocorrelation without clusters. Each panel, in which the labels denote the number of times a cell was identified as a clustered cell in 100 simulations, shows the result for a different value of ρ.
results. First, we define the terms true positive, false positive, true negative, and false negative. True positive (TP) cells are true clustered cells that are correctly detected as clustered; false positive (FP) cells are non-clustered cells that are incorrectly detected as clustered; true negative (TN) cells are non-clustered cells that are not identified as clustered; false negative
(FN) cells are true clustered cells that are not identified as clustered.
The following values are used to measure the testing performance. First, the power, as usually defined, is the probability of rejecting the null hypothesis. However, as mentioned in the previous section, the power cannot distinguish whether there are local clusters and/or spatial dependence. To check whether the power conveys true information about the local cluster, we define the false alarm as the number of detected clusters which do not include any true positive cell. This measure shows whether the identified clusters are real clusters or not. To check whether the method can identify all clustered cells, the sensitivity, defined as TP/(TP+FN), is used to measure the proportion of identified clustered cells among all true clustered cells. The positive predictive value (PPV), defined as TP/(TP+FP), is used to measure the proportion of true clustered cells among the identified clustered cells. The specificity is not included in this study, since the number of clustered cells is small (9 out of 400) and the specificity is always high no matter which method is applied.
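The measures defined above can be sketched for a single simulation run as follows. This is a minimal illustration, not the code used in the study: the function name, the representation of clusters as sets of (row, column) cells, the cluster location, and the choice to return `None` when a ratio is undefined are all assumptions.

```python
def detection_measures(true_cells, detected_clusters):
    """Sensitivity, PPV, and false-alarm count for one simulation run.

    true_cells: set of (row, col) cells belonging to the true cluster.
    detected_clusters: list of sets of (row, col) cells, one set per
        detected cluster.
    """
    # All cells flagged by any detected cluster.
    detected_cells = set().union(*detected_clusters)

    tp = len(detected_cells & true_cells)   # true cells that were found
    fp = len(detected_cells - true_cells)   # flagged cells that are not true
    fn = len(true_cells - detected_cells)   # true cells that were missed

    # Assumption: undefined ratios (nothing to divide by) are reported as None.
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    ppv = tp / (tp + fp) if (tp + fp) else None

    # False alarm: detected clusters containing no true positive cell.
    false_alarms = sum(1 for cluster in detected_clusters
                       if not (cluster & true_cells))

    return {"sensitivity": sensitivity, "ppv": ppv,
            "false_alarms": false_alarms}


# Hypothetical example: a 3 x 3 true cluster and two detected clusters.
true_cluster = {(r, c) for r in range(9, 12) for c in range(9, 12)}
overlapping = {(r, c) for r in range(9, 12) for c in range(10, 13)}
spurious = {(0, 0), (0, 1)}

measures = detection_measures(true_cluster, [overlapping, spurious])
print(measures)
```

In this toy run, TP = 6, FP = 5, and FN = 3, so the sensitivity is 6/9, the PPV is 6/11, and one detected cluster counts as a false alarm.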
Figure 2.2 shows the preceding measures of the SaTScan under different RR values; the SaTScan performs better on all measures as the RR becomes larger. However, when the global autocorrelation is stronger, these measures reveal different information. Take the case of RR = 1.49 (θc = 0.4) as an example. As the autocorrelation increases, the power becomes higher but the false alarm goes up as well. This indicates that the SaTScan might detect more than one cluster, similar to the results in Table 2.1. Also, the sensitivity and PPV show that the SaTScan does not perform well for a small RR (1.49), and larger autocorrelation worsens the results. In other words, the cluster
detection of the SaTScan is clearly influenced by spatial autocorrelation.
However, if the RR is large (2.01 or 2.72), spatial autocorrelation does not have a large impact on the performance of the SaTScan with respect to the PPV and the false alarm. For example, for ρ = 0.8, the PPV is just 0.1916 for RR = 1.49, rises to 0.6088 for RR = 2.01, and is almost perfect for RR = 2.72.
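The RR values used throughout this section (1.49, 2.01, and 2.72) match exp(θc) for θc = 0.4, 0.7, and 1. Equation (2.2) is not reproduced here, so the log link in the short check below is an assumption inferred from these numbers.

```python
import math

# Cluster means quoted in the text for equation (2.2).
theta_c_values = [0.4, 0.7, 1.0]

# Assumption: the relative risk is exp(theta_c), i.e. a log link.
relative_risks = [round(math.exp(theta), 2) for theta in theta_c_values]

print(relative_risks)  # [1.49, 2.01, 2.72]
```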
Figure 2.2. Detection results of the SaTScan in the case of autocorrelation and a cluster. Each panel shows a different measure for the different combinations of autocorrelation and RR.
[Figure 2.3 image: four panels, RR = 1.49 with ρ = 0, 0.2, 0.5, and 0.8; axes x and y, each ranging from 0 to 20; legend bins: 0, 1−5, 6−10, 11−15, 16−20, >20]
Figure 2.3. Images of cluster detection results of the SaTScan in the case of autocorrelation and a 3 × 3 cluster of RR = 1.49 in the center of the study region. Each panel, in which the labels denote the number of times a cell was identified as a clustered cell in 100 simulations, shows the detection result for a different value of ρ.
Similar to Figure 2.1 (no clusters), the image plots of SaTScan’s detection results (Figure 2.3) show the results for one cluster with RR = 1.49 under different autocorrelation values. The darker cells, gathering
around the real cluster at the center, indicate that cells closer to the real cluster have higher probabilities of being identified as clustered cells. Also, as the autocorrelation becomes higher, the area of darker cells spreads wider.
For the case of ρ = 0.8, the area identified as clustered cells at least 10 times (out of 100) is about half of the study area. In other words, the accuracy of the SaTScan decreases and the false alarm increases as the autocorrelation increases. This suggests that we should be cautious in interpreting SaTScan’s detection results when autocorrelation is present. Next, we investigate the opposite direction: whether a local cluster can affect the estimate of spatial autocorrelation in a CAR model.
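The per-cell frequencies shown in Figures 2.1 and 2.3, and a summary such as the share of the study area flagged at least 10 times, could be tabulated along the following lines. This is a sketch only: the function names and the tiny 2 × 2 example are hypothetical, and only the legend bins are taken from the figure.

```python
def detection_frequency(runs, n_rows, n_cols):
    """Count, for each cell, how many runs flagged it as clustered.

    runs: list of sets of (row, col) cells, one set per simulation run.
    """
    counts = [[0] * n_cols for _ in range(n_rows)]
    for detected_cells in runs:
        for row, col in detected_cells:
            counts[row][col] += 1
    return counts


def share_at_least(counts, threshold):
    """Fraction of cells flagged in at least `threshold` runs."""
    cells = [count for row in counts for count in row]
    return sum(1 for count in cells if count >= threshold) / len(cells)


def legend_bin(count):
    """Map a count to the legend bins 0, 1-5, 6-10, 11-15, 16-20, >20."""
    if count == 0:
        return "0"
    if count > 20:
        return ">20"
    lower = 5 * ((count - 1) // 5) + 1
    return f"{lower}-{lower + 4}"


# Hypothetical toy data: three runs on a 2 x 2 grid.
toy_runs = [{(0, 0), (0, 1)}, {(0, 1), (1, 1)}, {(0, 1)}]
toy_counts = detection_frequency(toy_runs, 2, 2)
print(toy_counts)                     # [[1, 3], [0, 1]]
print(share_at_least(toy_counts, 2))  # 0.25
```

For the study itself, `runs` would hold the 100 sets of detected cells on the 400-cell grid, and `share_at_least(counts, 10)` would give the share of the area discussed above.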
2.5 Performance of a CAR model when a cluster is present
In the previous discussion, we found that the results of SaTScan’s cluster detection can be misleading if there exists autocorrelation. We now show that the autocorrelation estimate can also be influenced by the cluster. Suppose a 3 × 3 cluster is located at the center and there is no spatial autocorrelation.
Table 2.2 lists the probability of finding significant autocorrelation and the average of the autocorrelation estimates for different RR values. The RR values are the same as in the previous section. The estimates are obtained with the “spautolm” function in the R package “spdep”, and the results are again based on 100 simulation runs.
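To give some intuition for why a cluster alone can look like autocorrelation, the sketch below computes Moran's I on a noise-free risk surface containing only a 3 × 3 cluster. This is not the CAR model fitted by “spautolm”; Moran's I is swapped in as a simpler, self-contained statistic. The 20 × 20 grid is inferred from the 400 cells mentioned earlier, while the exact cluster position, the rook neighborhood, and all names are assumptions.

```python
def rook_neighbors(row, col, n_rows, n_cols):
    """Yield the cells sharing an edge with (row, col)."""
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:
            yield r, c


def morans_i(grid):
    """Moran's I with binary rook-adjacency weights."""
    n_rows, n_cols = len(grid), len(grid[0])
    n_cells = n_rows * n_cols
    mean = sum(sum(row) for row in grid) / n_cells

    cross_products = 0.0  # sum of w_ij * (x_i - mean) * (x_j - mean)
    weight_sum = 0        # sum of all w_ij
    for r in range(n_rows):
        for c in range(n_cols):
            for nr, nc in rook_neighbors(r, c, n_rows, n_cols):
                cross_products += (grid[r][c] - mean) * (grid[nr][nc] - mean)
                weight_sum += 1

    squared_devs = sum((v - mean) ** 2 for row in grid for v in row)
    return (n_cells / weight_sum) * cross_products / squared_devs


SIZE = 20  # 20 x 20 = 400 cells
# Assumed position of the central 3 x 3 cluster (rows and columns 9-11).
CLUSTER = {(r, c) for r in range(9, 12) for c in range(9, 12)}
RR = 2.01  # one of the RR values used in the text

# Noise-free surface: RR inside the cluster, 1 elsewhere.
risk = [[RR if (r, c) in CLUSTER else 1.0 for c in range(SIZE)]
        for r in range(SIZE)]

i_cluster = morans_i(risk)
print(f"Moran's I with a cluster and no autocorrelation: {i_cluster:.3f}")
```

The statistic is clearly positive even though no spatial dependence was built into the surface. With sampling noise added, as in the simulations, the value would be smaller, so this only illustrates the direction of the effect.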