建議 - 結論與建議 - 複迴歸係數排列檢定方法探討

第六章結論與建議

第二節建議

國

立政治大學

‧

N a tio na

l C h engchi U ni ve rs it y

第二節建議

在現實的情況中，資料的架構往往較為複雜，因此無法透過模擬完整呈現真實的樣貌。本研究的模擬僅針對給定的條件下進行分析，若需進一步探討係數統計量的可行性，則有以下幾點建議：

(1) 可進一步藉由模擬的方式，尋找各排列方法於樣本數多大的情況下適合使用之基準點。雖於表 5.5 有大致統整適用時機，但僅描述大略概況，並無明確指出適用之樣本數大小基準為何。

(2) 在實際的資料中，解釋變數之間往往存在一定的相關性，於本文中僅考量解釋變數間不相關以及高度相關兩種情況。因此可更深入探討解釋變數間不同相關程度之情形。此外，本文之解釋變數僅以均勻分配隨機產生，亦可改變產生解釋變數之特定分配，並考量不同解釋變數個數下之情形。因此可觀察各種排列方法的檢定結果是否會因為以上所提及的情境而有所差異。

(3) 我們亦可觀察產生誤差分配之其他特定分配，如考量誤差分布呈現較均勻，

抑或是過度集中的情形。

(4) 本文在探討統計量可行性時，是以型一誤差機率值是否介於 95%信賴區間內作為判斷準則。然而使用不同統計量所得之型一誤差機率值本身就有所差異，

在該基準之下判斷統計量之檢定力大小或許有失公允。或許可嘗試將各統計量之型一誤差機率值調整至同一基準後，再進行檢定力大小之比較。

‧ 國

立政治大學

‧

N a tio na

l C h engchi U ni ve rs it y

參考文獻

1. Anderson, M. J. (2001). Permutation tests for univariate or multivariate analysis of variance and regression. Canadian Journal of Fisheries and Aquatic Sciences, 58, 626-639.

2. Anderson, M. J., and Legendre, P. (1999). An empirical comparison of permutation methods for tests of partial regression coefficients in a linear model. Journal of Statistical Computation and Simulation, 62, 271-303.

3. Anderson, M. J., and Robinson, J. (2001). Permutation tests for linear models.

Australian and New Zealand Journal of Statistics, 43, 75-88.

4. Freedman, D., and Lane, D. (1983). A nonstochastic interpretation of reported significance levels. Journal of Business and Economic Statistics, 1, 292-298.

5. Kennedy, P. E. (1995). Randomization tests in econometrics. Journal of Business and Economic Statistics, 13, 85-94.

6. Kennedy, P. E., and Cade, B. S. (1996). Randomization tests for multiple regression.

Communications in Statistics – Simulation and Computation, 25, 923-936.

7. Levin, B., and Robbins, H. (1983). Urn models for regression analysis, with applications to employment discrimination studies. Law and Contemporary Problems, 46, 247-267.

8. Manly, B. F. J. (1991). Randomization, Bootstrap and Monte Carlo methods in biology (First Edition). London: Chapman and Hall.

9. Manly, B. F. J. (1997). Randomization, Bootstrap and Monte Carlo methods in biology (Second Edition). London: Chapman and Hall.

10. Manly, B. F. J. (2006). Randomization, Bootstrap and Monte Carlo methods in biology (Third Edition). London: Chapman and Hall.

‧ 國

立政治大學

‧

N a tio na

l C h engchi U ni ve rs it y

11. Oja, H. (1987). On permutation tests in multiple regression and analysis of covariance problems. Australian Journal of Statistics, 29, 91-100.

12. O’Gorman, T. W. (2005). The performance of randomization tests that use permutations of independent variables. Communication in Statistics – Simulation and Computation, 34, 895-908.

13. ter Braak, C. J. F. (1992). Permutation versus bootstrap significance tests in multiple regression and ANOVA. Bootstrapping and Related Techniques (K.-H. Jockel, G.

Rothe and W. Sendler, Eds.), Berlin: Springer Verlag, 79-86.

14. Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., and Nichols, T. E.

(2014). Permutation inference for the general linear model. NeuroImage, 92, 381-397.

‧

### Suppose 5 Covariable ###

### Set Seed at 806 ###

set.seed(806)

mu=1; num.datasets=10000; permu=999; n=10; B1=1;B2=1;B3=1;B4=1;B5=0; p=6

### Generate Data ###

a=list();

for ( j in 1:num.datasets){

x1=runif(n,0,3) #從 U(0,3)產生資料

y=mu+B1*x1+B2*x2+B3*x3+B4*x4+B5*z+error #residual

resid.reduced=summary(lm(y~x1+x2+x3+x4))$residuals #計算殘差𝑢_𝑖 resid.z=summary(lm(z~x1+x2+x3+x4))$residuals #計算殘差𝑣_𝑖

a[[j]]=cbind(y,x1,x2,x3,x4,z,resid.reduced,resid.z) }

### Permutation for Calculating p-value of Kennedy's Method ###

permutation.kennedy=function(a){

‧

t.ref=summary(lm(y~x1+x2+x3+x4+z))$coefficients[6,3] #計算觀察到的統計量

for(i in 1:permu){

#kennedy 不同自由度的排列分配

resid.permuted=sample(resid.reduced,n,replace=F) lm.k=lm(resid.permuted~resid.z)

s.b2=sqrt((sum((residuals(lm.k))^2)/(n-p))/(sum((resid.z-mean(resid.z))^2))) t.k1[i]=coefficients(lm.k)[2]/s.b2

### Permutation for Permutation Distribution of Kennedy's Method ###

permutation.kennedy=function(a){

t.k1=NULL; t.k2=NULL y=a[,1]

‧ 國

立政治大學

‧

N a tio na

l C h engchi U ni ve rs it y

x1=a[,2]

x2=a[,3]

x3=a[,4]

x4=a[,5]

z=a[,6]

resid.reduced=a[,7]

resid.z=a[,8]

t.ref=summary(lm(y~x1+x2+x3+x4+z))$coefficients[6,3]

for(i in 1:permu){

#kennedy

resid.permuted=sample(resid.reduced,n,replace=F) lm.k=lm(resid.permuted~resid.z)

s.b2=sqrt((sum((residuals(lm.k))^2)/(n-p))/(sum((resid.z-mean(resid.z))^2))) t.k1[i]=coefficients(lm.k)[2]/s.b2

t.k2[i]=summary(lm.k)$coefficients[2,3]

}

cbind(c(t.ref,t.k1),c(t.ref,t.k2)) }

p.kennedy.t1=lapply(a[1],function(x) permutation.kennedy(x)) tvalue2=matrix(unlist(p.kennedy.t1),ncol=2)

hist(tvalue2[,1],prob=T,main="t-value with df=n-p (n=10) ",xlab="",breaks=50) v=seq(-4,4,length=100)

lines(v,dt(v,df=n-p),lwd=2)

legend("topright",c("t(n-p)"),lty=1,lwd=2)

hist(tvalue2[,2],prob=T,main="t-value with df=n-2 (n=10) ",xlab="",breaks=50) v=seq(-4,4,length=100)

lines(v,dt(v,df=n-2),lwd=2)

legend("topright",c("t(n-2)"),lty=1,lwd=2)

‧

### Simulate 1000 Dataset and each Permute 999 times, Do 100 times ###

set.seed(806)

mu=5; num.datasets=1000*100; permu=999 n=20;B1=2;B2=0

### Generate Data 產生資料 ###

a=list();

for ( j in 1:num.datasets){

x=runif(n,0,3) #從 U(0,3)產生亂數

### Permutation for Calculating p-value of Oja's Method ###

permutation=function(a){

t.ref=summary(lm(y~x+z))$coefficients[3,3] #觀察到的統計量 t.ref.scale=summary(lm(y~x+z.scale))$coefficients[3,3]

for(i in 1:permu){

#oja 排列分配

z.o.permuted=sample(z,n,replace=F)

t.o[i]=summary(lm(y~x+z.o.permuted))$coefficients[3,3]

z.o.scale.permuted=sample(z.scale,n,replace=F)

t.o.scale[i]=summary(lm(y~x+z.o.scale.permuted))$coefficients[3,3]

}

p.o.t=sum(abs(t.o)>=abs(t.ref))/(1+permu)

‧ 國

立政治大學

‧

N a tio na

l C h engchi U ni ve rs it y

p.o.t.scale=sum(abs(t.o.scale)>=abs(t.ref.scale))/(1+permu) list(p.o.t,p.o.t.scale)

}

### Calculate p-value of each Dataset ###

pval.oja1=lapply(a[1:10000],function(x) permutation(x)) pval.oja2=lapply(a[10001:20000],function(x) permutation(x)) pval.oja3=lapply(a[20001:30000],function(x) permutation(x)) pval.oja4=lapply(a[30001:40000],function(x) permutation(x)) pval.oja5=lapply(a[40001:50000],function(x) permutation(x)) pval.oja6=lapply(a[50001:60000],function(x) permutation(x)) pval.oja7=lapply(a[60001:70000],function(x) permutation(x)) pval.oja8=lapply(a[70001:80000],function(x) permutation(x)) pval.oja9=lapply(a[80001:90000],function(x) permutation(x)) pval.oja10=lapply(a[90001:100000],function(x) permutation(x))

pval.oja.all=matrix(unlist(c(pval.oja1,pval.oja2,pval.oja3,pval.oja4,pval.oja5,pval.oja 6,pval.oja7,pval.oja8,pval.oja9,pval.oja10)),byrow=T,ncol=2)

### Calculate Type I Error ###

typeI=list();

for (i in 1:100){

typeI[[i]]=apply(pval.oja.all[((1+(i-1)*1000):(i*1000)),],2,function(x) sum(x<0.05)/length(x))

}

typeI.new=matrix(unlist(typeI),byrow=T,ncol=2) apply(typeI.new,2,mean)

apply(typeI.new,2,sd)

‧

### Using seed 806、120、512 for n=10; 807、121、513 for n=30； 808、122、

514 for n=50 ###

R=matrix(c(1,rho.x.z,rho.x.z,1),ncol=2,dimnames=list(c("X","Z"),c("X","Z"))) P=eigen(R)$vectors

‧

resid.reduced.permuted=apply(matrix(residuals(reduced), ncol=permu,nrow=n), 2, sample)

y.permuted.reduced=resid.reduced.permuted+predict(reduced) a[[j]]=cbind(y.permuted.reduced,cbind(y,x,z))

}

else if ( method == "Kennedy" ) { reduced=lm(y~x)

reduced.z=lm(z~x)

resid.reduced.permuted=apply(matrix(residuals(reduced), ncol=permu,nrow=n), 2, sample)

a[[j]]=cbind(resid.reduced.permuted,cbind(residuals(reduced.z),y,x,z)) }

else if ( method == "Levin and Robbins" ) { reduced=lm(y~x)

resid.reduced.permuted=apply(matrix(residuals(reduced), ncol=permu,nrow=n), 2, sample)

a[[j]]=cbind(resid.reduced.permuted,cbind(z,x,y)) }

else if ( method == "ter Braak" ) { full=lm(y~x+z)

resid.full.permuted=apply(matrix(residuals(full), ncol=permu,nrow=n), 2, sample)

##### n=10,mu=1,B1=1,B2=0 #####

n=10 ;num.datasets=1000; permu=999

‧

data.freedman=my.data(n,mu,B1,B2,"Freedman and Lane")

# using t #

‧

data.levin=my.data(n,mu,B1,B2,"Levin and Robbins")

# using t #

‧

### Calculate Committing Type I Error Times ###

p.value.1.0.1=cbind(p.theory.t.10.1.0,p.theory.t.30.1.0,p.theory.t.50.1.0,

‧

(p.value.1.0.2 using seed 806 for n=10; 807 for n=30; 808 for n=50 ) (p.value.1.0.3 using seed 120 for n=10; 121 for n=30; 122 for n=50 ) type1.1=apply(p.value.1.0.1,2,function(x) sum(x<0.05))

Size of Sample, n",ylab="",main="")

axis(side=1,at=c(10,30,50),tck=-0.02,lwd=1)

‧ 國

立政治大學

‧

N a tio na

l C h engchi U ni ve rs it y

x1=c(10,30,50)

plot(x,type1,xlim=c(5,55),ylim=c(0.03,0.06),type="n",xaxt="n",yaxt="n",bty="l",xlab

="Size of Sample, n",ylab="",main="") axis(side=1,at=c(10,30,50),tck=-0.02,lwd=1)

axis(side=2,at=c(0.03,0.035,0.04,0.045,0.05,0.055,0.06),tck=-0.02,lwd=1,las=2) points(x1,type1[1:3],pch=0)

points(x1,type1[4:6],pch=1) points(x1,type1[10:12],pch=2) points(x1,type1[16:18],pch=3) points(x1,type1[22:24],pch=4) points(x1,type1[28:30],pch=5) points(x1,type1[34:36],pch=6)

lower=0.05-1.96*sqrt(0.05*0.95/3000) upper=0.05+1.96*sqrt(0.05*0.95/3000) abline(h=lower,lty=2,col="red")

abline(h=upper,lty=2,col="red") dev.off()

### b2 sign table ###

par(mar=c(0, 0, 0, 0)) plot.new()

legend("center",horiz=T,c("F & L","Kennedy","L & R","Manly","Oja","ter Braak"),text.width=0.1,pch=0:5)

### t sign table ###

par(mar=c(0, 0, 0, 0)) plot.new()

legend("center",horiz=T,c("theory t","F & L","Kennedy","L & R","Manly","Oja","ter Braak"),text.width=0.1,pch=0:6)

在文檔中複迴歸係數排列檢定方法探討 - 政大學術集成 (頁 46-60)

建議

第六章 結論與建議

第二節 建議

國

立 政 治 大 學

‧

N a tio na

l C h engchi U ni ve rs it y

第二節 建議

‧ 國

立 政 治 大 學

‧

N a tio na

l C h engchi U ni ve rs it y

參考文獻

‧ 國

立 政 治 大 學

‧

N a tio na

l C h engchi U ni ve rs it y

‧

‧

‧ 國

立 政 治 大 學

‧

N a tio na

l C h engchi U ni ve rs it y

‧

‧ 國

立 政 治 大 學

‧

N a tio na

l C h engchi U ni ve rs it y

‧

‧

‧

‧

‧

‧

‧ 國

立 政 治 大 學

‧

N a tio na

l C h engchi U ni ve rs it y

第六章結論與建議

第二節建議

立政治大學

第二節建議

立政治大學

立政治大學

立政治大學

立政治大學

立政治大學