
White Noise Regressors

In this section, we consider explanatory variables generated independently from Gaussian white noise processes of (5.7).

Proposition 6 Consider a class of models given by (3.1) with $x_j(s)$'s independently generated from white-noise processes of (5.7) and $\mathrm{cov}(\eta(s), \eta(s')) = \sigma_\eta^2 \exp(-\kappa_\eta |s - s'|)$, where

and A(α; θ) is defined in (3.6).

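(ii) For $\delta = 0$,

$$-2\ell(\theta;\alpha) = n\log(2\pi) - \frac{1-\delta}{2}\log n + \Big(\log\sigma_\epsilon^2 + \frac{\sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2 + \sigma_{\epsilon,0}^2}{\sigma_\epsilon^2}\Big)n + \Big(\frac{2\kappa_\eta\sigma_\eta^2}{\sigma_\epsilon^2}\Big)^{1/2}\Big(1 - \frac{\sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2 + \sigma_{\epsilon,0}^2}{2\sigma_\epsilon^2} + \frac{\kappa_{\eta,0}\sigma_{\eta,0}^2}{2\kappa_\eta\sigma_\eta^2}\Big)n^{1/2} + \xi^{(3)}(\alpha;\theta) + O_p(1). \tag{5.126}$$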

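By (5.124), for $\delta \in (0,1)$ and any $\alpha \in A \setminus A_c$,

$$\begin{aligned} -2\ell(\theta;\alpha) + 2\ell(\theta;\alpha_c) &= \sigma_\epsilon^{-2}\sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2\, n + \xi^{(3)}(\alpha;\theta) - \xi^{(3)}(\alpha_c;\theta) + o_p(n) \\ &= \sigma_\epsilon^{-2}\sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2\, n + o_p(n), \end{aligned} \tag{5.127}$$

where the last equality holds because, by (5.125), $\xi^{(3)}(\alpha;\theta) - \xi^{(3)}(\alpha_c;\theta) = o_p(n)$. Similarly, by (5.126), for $\delta = 0$,

$$-2\ell(\theta;\alpha) + 2\ell(\theta;\alpha_c) = \sigma_\epsilon^{-2}\sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2\, n + o_p(n).$$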

As will be demonstrated in Theorem 12, (5.127) can be used to find an appropriate penalty $\lambda$ that leads to selection consistency.
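The linear-in-$n$ growth of the likelihood gap can be checked numerically. The following Python sketch is illustrative only. It assumes that $\ell(\theta;\alpha)$ is the Gaussian log-likelihood with the regression coefficients profiled out by generalized least squares, evaluates it at the true parameter value, and uses hypothetical settings (one omitted regressor with $\beta_j = 1$, $\sigma_j = 1$, and $\sigma_\eta^2 = \kappa_\eta = \sigma_\epsilon^2 = 1$). The function names are introduced here for illustration and are not part of the original text.

```python
import numpy as np


def exp_cov(n, delta, sigma2_eta, kappa_eta, sigma2_eps):
    """Covariance of eta + epsilon at the sites s_i = i * n^{-(1 - delta)}."""
    s = np.arange(1, n + 1) * n ** (-(1.0 - delta))
    dist = np.abs(s[:, None] - s[None, :])
    return sigma2_eta * np.exp(-kappa_eta * dist) + sigma2_eps * np.eye(n)


def neg2_loglik(z, X, Sigma):
    """-2 * Gaussian log-likelihood, beta profiled out by GLS (assumed form)."""
    n = z.shape[0]
    Si_z = np.linalg.solve(Sigma, z)
    Si_X = np.linalg.solve(Sigma, X)
    beta_hat = np.linalg.solve(X.T @ Si_X, X.T @ Si_z)
    quad = z @ Si_z - (X.T @ Si_z) @ beta_hat  # GLS residual quadratic form
    _, logdet = np.linalg.slogdet(Sigma)
    return n * np.log(2.0 * np.pi) + logdet + quad


def likelihood_gap_ratio(n, delta=0.0, seed=0):
    """Ratio of the simulated gap to the leading term sigma_eps^{-2} beta^2 sigma_j^2 n."""
    rng = np.random.default_rng(seed)
    sigma2_eps = 1.0
    Sigma = exp_cov(n, delta, 1.0, 1.0, sigma2_eps)
    X = rng.standard_normal((n, 2))          # white-noise regressors, sigma_j = 1
    beta = np.array([1.0, 1.0])
    noise = np.linalg.cholesky(Sigma) @ rng.standard_normal(n)  # eta + epsilon
    z = X @ beta + noise
    # Underfitted model keeps column 0 only; the correct model uses both columns.
    gap = neg2_loglik(z, X[:, :1], Sigma) - neg2_loglik(z, X, Sigma)
    leading = beta[1] ** 2 * n / sigma2_eps
    return gap / leading


print(likelihood_gap_ratio(800))
```

Under these assumptions the printed ratio should be roughly of order one, with the remainder reflecting the $o_p(n)$ term; a penalty $\lambda = o(n)$ is therefore eventually dominated by the gap.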

The following lemma shows that $\sigma_\epsilon^2$ is asymptotically overestimated by ML when $\alpha \in A \setminus A_c$, under both the fixed domain and the increasing domain asymptotic frameworks.

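Lemma 13 Under the setup of Proposition 6, let $\Theta \subset (0,\infty)^3$ be a compact set and let $\hat\theta(\alpha) = (\hat\sigma_\eta^2(\alpha), \hat\kappa_\eta(\alpha), \hat\sigma_\epsilon^2(\alpha))'$ be the ML estimate of $\theta$ based on model $\alpha$. Then for $\delta \in [0,1)$ and $\alpha \in A$,

$$\hat\sigma_\epsilon^2(\alpha) = \sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2 + \sigma_{\epsilon,0}^2 + o_p(1), \tag{5.128}$$

$$\hat\kappa_\eta(\alpha)\hat\sigma_\eta^2(\alpha) = \kappa_{\eta,0}\sigma_{\eta,0}^2 + o_p(1). \tag{5.129}$$

The following theorem further provides the convergence rates of the ML estimates of $\kappa_\eta$, $\sigma_\eta^2$ and $\sigma_\epsilon^2$. These results are key to establishing the asymptotic properties of GIC in Theorem 13.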

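Theorem 11 Under the setup of Proposition 6, let $\Theta \subset (0,\infty)^3$ be a compact set and let $\hat\theta(\alpha) = (\hat\sigma_\eta^2(\alpha), \hat\kappa_\eta(\alpha), \hat\sigma_\epsilon^2(\alpha))'$ be the ML estimate of $\theta$ based on model $\alpha$. Then

(i) For $\delta \in (0,1)$,

$$\hat\sigma_\epsilon^2(\alpha) = \begin{cases} \sigma_{\epsilon,0}^2 + o_p(n^{-(1-\delta)/2}); & \text{if } \alpha \in A_c, \\ \sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2 + \sigma_{\epsilon,0}^2 + o_p(n^{-(1-\delta)/2}); & \text{if } \alpha \in A \setminus A_c, \end{cases} \tag{5.130}$$

$$\hat\kappa_\eta(\alpha)\hat\sigma_\eta^2(\alpha) = \kappa_{\eta,0}\sigma_{\eta,0}^2 + o_p(n^{-(1-\delta)/4}), \tag{5.131}$$

$$\hat\sigma_\eta^2(\alpha) = \sigma_{\eta,0}^2 + o_p(1); \quad \alpha \in A, \tag{5.132}$$

$$\hat\kappa_\eta(\alpha) = \kappa_{\eta,0} + o_p(1); \quad \alpha \in A. \tag{5.133}$$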

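(ii) For $\delta = 0$,

Then, it can be obtained in a way similar to (5.99) that

$$\xi^{(3)}(\theta;\alpha) = \xi_1(\theta;\alpha) + \xi_2(\theta;\alpha) + \xi(\theta) + O_p(1). \tag{5.138}$$

First, we prove (5.130). By (5.128) and (5.129), it suffices to show that for $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = o(1)$, $|\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2| = o(1)$, and any $\varepsilon > 0$,

$$\inf_{|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| \ge \varepsilon n^{-(1-\delta)/2}} \big(-2\ell(\theta;\alpha) + 2\ell((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha)\big) > 0, \tag{5.139}$$

as $n \to \infty$ with probability tending to 1. By (5.124),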

−2`(θ; α) = n log(2π) − 1 − δ

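where the second equality follows from $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = o(1)$, (5.39) and (5.40), and the last equality follows from (5.67) and

$$\xi_1(\theta;\alpha) - \xi_1((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) = o_p\big(\max((\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2)^2 n,\ n^\delta)\big), \tag{5.141}$$

$$\xi_2(\theta;\alpha) - \xi_2((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) = o_p\big(\max((\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2)^2 n,\ n^\delta)\big), \tag{5.142}$$

which can be obtained in a way similar to (5.105)-(5.106), where the moment conditions are given by (5.43)-(5.44) in this case. Thus, (5.139) is obtained. This completes the proof of (5.130).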

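Second, we prove (5.131). By (5.129) and (5.130), it suffices to show that for $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = o(n^{-(1-\delta)/2})$, $|\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2| = o(1)$ and any $\varepsilon > 0$,

$$\inf_{|\sigma_\eta^2\kappa_\eta - \kappa_{\eta,0}\sigma_{\eta,0}^2| \ge \varepsilon n^{-(1-\delta)/4}} \big(-2\ell(\theta;\alpha) + 2\ell((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha)\big) > 0, \tag{5.143}$$

as $n \to \infty$ with probability tending to 1. By (5.140), we have for $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = o(n^{-(1-\delta)/2})$ and $|\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2| = o(1)$,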

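$$\begin{aligned}
&-2\ell(\theta;\alpha) + 2\ell((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&= \Big\{\Big(\frac{2\kappa_\eta\sigma_\eta^2}{\sigma_{\epsilon,\alpha}^2}\Big)^{1/2}\Big(\frac{1}{2} + \frac{\kappa_{\eta,0}\sigma_{\eta,0}^2}{2\kappa_\eta\sigma_\eta^2}\Big) - \frac{(2\kappa_{\eta,0}\sigma_{\eta,0}^2)^{1/2}}{\sigma_{\epsilon,\alpha}}\Big\} n^{(1+\delta)/2} + \frac{1}{2\kappa_{\eta,0}}(\kappa_\eta - \kappa_{\eta,0})^2 n^\delta \\
&\quad + \xi_1(\theta;\alpha) - \xi_1((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) + \xi_2(\theta;\alpha) - \xi_2((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&\quad + \xi(\theta) - \xi((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)') + o_p(n^\delta) \\
&= \frac{(\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{(1+\delta)/2}}{2^{5/2}\sigma_{\epsilon,\alpha}(\kappa_{\eta,0}\sigma_{\eta,0}^2)^{3/2}} + \frac{1}{2\kappa_{\eta,0}}(\kappa_\eta - \kappa_{\eta,0})^2 n^\delta + \xi_1(\theta;\alpha) - \xi_1((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&\quad + \xi_2(\theta;\alpha) - \xi_2((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) + \xi(\theta) - \xi((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)') \\
&\quad + o\big((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{(1+\delta)/2}\big) + o_p(n^\delta) \\
&= \frac{(\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{(1+\delta)/2}}{2^{5/2}\sigma_{\epsilon,\alpha}(\kappa_{\eta,0}\sigma_{\eta,0}^2)^{3/2}} + \frac{(\kappa_\eta - \kappa_{\eta,0})^2 n^\delta}{2\kappa_{\eta,0}} + o\big((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{(1+\delta)/2}\big) + o_p(n^\delta),
\end{aligned} \tag{5.144}$$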

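where the first equality follows from $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = o(n^{-(1-\delta)/2})$, the second equality follows from (5.41), and the last equality follows from

$$\xi_1(\theta;\alpha) - \xi_1((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) = o_p\big(\max((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{(1+\delta)/2},\ n^\delta)\big), \tag{5.145}$$

$$\xi_2(\theta;\alpha) - \xi_2((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) = o_p\big(\max((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{(1+\delta)/2},\ n^\delta)\big), \tag{5.146}$$

$$\xi(\theta) - \xi((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)') = o_p\big(\max((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{(1+\delta)/2},\ n^\delta)\big), \tag{5.147}$$

which can be obtained in a way similar to (5.109)-(5.111). Thus, (5.143) is obtained. This completes the proof of (5.131).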
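The quadratic leading term in (5.144) can be sanity-checked numerically. The sketch below is an illustration under stated assumptions: it writes $x = \kappa_\eta\sigma_\eta^2$ and $x_0 = \kappa_{\eta,0}\sigma_{\eta,0}^2$, takes the bracketed coefficient of $n^{(1+\delta)/2}$ as read from the derivation, and compares it with the second-order Taylor approximation $(x - x_0)^2 / (2^{5/2}\sigma_{\epsilon,\alpha}x_0^{3/2})$. The function names and parameter values are hypothetical.

```python
import math


def exact_coefficient(x, x0, sigma_eps):
    """Bracketed coefficient of n^{(1+delta)/2}, with x = kappa_eta * sigma_eta^2."""
    first = (math.sqrt(2.0 * x) / sigma_eps) * (0.5 + x0 / (2.0 * x))
    return first - math.sqrt(2.0 * x0) / sigma_eps


def quadratic_approx(x, x0, sigma_eps):
    """Second-order Taylor term around x0 (the first-order term vanishes)."""
    return (x - x0) ** 2 / (2.0 ** 2.5 * sigma_eps * x0 ** 1.5)


x0, sigma_eps = 2.0, 1.3
for x in (2.2, 2.05, 2.01):
    ratio = exact_coefficient(x, x0, sigma_eps) / quadratic_approx(x, x0, sigma_eps)
    print(x, ratio)  # the ratio approaches 1 as x approaches x0
```

Because the first derivative vanishes at $x_0$, deviations of $\kappa_\eta\sigma_\eta^2$ from its true value are penalized quadratically, which is what produces the $n^{-(1-\delta)/4}$ rate in (5.131).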

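Third, we prove (5.132) and (5.133). By (5.144), we have for $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = o(n^{-(1-\delta)/2})$, $|\sigma_\eta^2\kappa_\eta - \kappa_{\eta,0}\sigma_{\eta,0}^2| = o(n^{-(1-\delta)/4})$ and any $\varepsilon > 0$,

$$\inf_{|\kappa_\eta - \kappa_{\eta,0}| \ge \varepsilon} \big(-2\ell(\theta;\alpha) + 2\ell((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha)\big) = \frac{1}{2\kappa_{\eta,0}}\varepsilon^2 n^\delta + o_p(n^\delta) > 0,$$

as $n \to \infty$ with probability tending to 1, which gives (5.133). This together with (5.131) gives (5.132).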

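Fourth, we prove (5.134). By (5.128) and (5.129), it suffices to show that for $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = o(1)$ and $|\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2| = o(1)$, there exists $M > 0$ such that

$$\inf_{|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| \ge M n^{-1/2}} \big(-2\ell(\theta;\alpha) + 2\ell((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha)\big) > 0, \tag{5.148}$$

as $n \to \infty$ with probability tending to 1. By (5.126) and (5.138),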

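$$-2\ell(\theta;\alpha) = n\log(2\pi) - \frac{1-\delta}{2}\log n + \Big(\log\sigma_\epsilon^2 + \frac{\sigma_{\epsilon,\alpha}^2}{\sigma_\epsilon^2}\Big)n + \Big(\frac{2\kappa_\eta\sigma_\eta^2}{\sigma_\epsilon^2}\Big)^{1/2}\Big(1 - \frac{\sigma_{\epsilon,\alpha}^2}{2\sigma_\epsilon^2} + \frac{\kappa_{\eta,0}\sigma_{\eta,0}^2}{2\kappa_\eta\sigma_\eta^2}\Big)n^{1/2} + \xi_1(\theta;\alpha) + \xi_2(\theta;\alpha) + \xi(\theta) + O_p(1). \tag{5.149}$$

Then, for $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = o(1)$ and $|\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2| = o(1)$, we have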

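$$\begin{aligned}
&-2\ell(\theta;\alpha) + 2\ell((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&= \Big(\log\sigma_\epsilon^2 + \frac{\sigma_{\epsilon,\alpha}^2}{\sigma_\epsilon^2} - \log\sigma_{\epsilon,\alpha}^2 - 1\Big)n \\
&\quad + \Big\{\Big(\frac{2\kappa_\eta\sigma_\eta^2}{\sigma_\epsilon^2}\Big)^{1/2}\Big(1 - \frac{\sigma_{\epsilon,\alpha}^2}{2\sigma_\epsilon^2} + \frac{\kappa_{\eta,0}\sigma_{\eta,0}^2}{2\kappa_\eta\sigma_\eta^2}\Big) - \Big(\frac{2\kappa_\eta\sigma_\eta^2}{\sigma_{\epsilon,\alpha}^2}\Big)^{1/2}\Big(\frac{1}{2} + \frac{\kappa_{\eta,0}\sigma_{\eta,0}^2}{2\kappa_\eta\sigma_\eta^2}\Big)\Big\}n^{1/2} \\
&\quad + \xi_1(\theta;\alpha) - \xi_1((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) + \xi_2(\theta;\alpha) - \xi_2((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&\quad + \xi(\theta) - \xi((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)') + O_p(1) \\
&= \frac{1}{2\sigma_{\epsilon,\alpha}^4}(\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2)^2 n + \xi_1(\theta;\alpha) - \xi_1((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) + \xi_2(\theta;\alpha) - \xi_2((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&\quad + \xi(\theta) - \xi((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)') + o_p\big((\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2)^2 n\big) + O_p(1) \\
&= \frac{1}{2\sigma_{\epsilon,\alpha}^4}(\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2)^2 n + o_p\big((\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2)^2 n\big) + O_p(1),
\end{aligned}$$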

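where the second equality follows from (5.39) and (5.40) with $\sigma = \sigma_{\epsilon,\alpha}$ and $\tau = \kappa_{\eta,0}\sigma_{\eta,0}^2$, and the last equality follows from (5.73) and

$$\xi_1(\theta;\alpha) - \xi_1((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) = o_p\big((\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2)^2 n\big) + O_p(1),$$

$$\xi_2(\theta;\alpha) - \xi_2((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha) = o_p\big((\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2)^2 n\big) + O_p(1),$$

which can be obtained in a way similar to (5.105)-(5.106). Consequently, there exists $M > 0$ such that

$$\inf_{|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| \ge M n^{-1/2}} \big(-2\ell(\theta;\alpha) + 2\ell((\sigma_\eta^2, \kappa_\eta, \sigma_{\epsilon,\alpha}^2)';\alpha)\big) = \frac{M^2}{2\sigma_{\epsilon,\alpha}^4} + O_p(1) > 0,$$

as $n \to \infty$ with probability tending to 1. Thus, (5.148) is obtained. This completes the proof of (5.134).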
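The role of the $M n^{-1/2}$ threshold can be illustrated with a short numerical check. The sketch below is an illustration, not part of the original argument: it evaluates the order-$n$ term $(\log\sigma_\epsilon^2 + \sigma_{\epsilon,\alpha}^2/\sigma_\epsilon^2 - \log\sigma_{\epsilon,\alpha}^2 - 1)\,n$ at $\sigma_\epsilon^2 = \sigma_{\epsilon,\alpha}^2 + M n^{-1/2}$ and compares it with $M^2/(2\sigma_{\epsilon,\alpha}^4)$. The values of $M$ and $\sigma_{\epsilon,\alpha}^2$ are hypothetical.

```python
import math


def variance_term(sigma2, sigma2_alpha, n):
    """Order-n term of the likelihood difference in the proof of (5.134)."""
    g = math.log(sigma2) + sigma2_alpha / sigma2 - math.log(sigma2_alpha) - 1.0
    return g * n


def boundary_value(M, sigma2_alpha, n):
    """The term evaluated on the boundary sigma2 = sigma2_alpha + M / sqrt(n)."""
    return variance_term(sigma2_alpha + M / math.sqrt(n), sigma2_alpha, n)


M, sigma2_alpha = 2.0, 1.5
limit = M ** 2 / (2.0 * sigma2_alpha ** 2)
for n in (10 ** 2, 10 ** 4, 10 ** 6):
    print(n, boundary_value(M, sigma2_alpha, n), limit)
```

On the boundary the term stays bounded and close to $M^2/(2\sigma_{\epsilon,\alpha}^4)$ for large $n$, so choosing $M$ large enough makes it dominate the $O_p(1)$ remainder.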

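Finally, we prove (5.135). By (5.129) and (5.134), it suffices to show that for $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = O(n^{-1/2})$ and $|\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2| = o(1)$, there exists $M > 0$ such that

$$\inf_{|\sigma_\eta^2\kappa_\eta - \kappa_{\eta,0}\sigma_{\eta,0}^2| \ge M n^{-1/4}} \big(-2\ell(\theta;\alpha) + 2\ell((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha)\big) > 0, \tag{5.150}$$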

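as $n \to \infty$ with probability tending to 1. By (5.149), for $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = O(n^{-1/2})$ and $|\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2| = o(1)$, we have

$$\begin{aligned}
&-2\ell(\theta;\alpha) + 2\ell((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&= \Big\{\Big(\frac{2\kappa_\eta\sigma_\eta^2}{\sigma_{\epsilon,\alpha}^2}\Big)^{1/2}\Big(\frac{1}{2} + \frac{\kappa_{\eta,0}\sigma_{\eta,0}^2}{2\kappa_\eta\sigma_\eta^2}\Big) - \Big(\frac{2\kappa_{\eta,0}\sigma_{\eta,0}^2}{\sigma_{\epsilon,\alpha}^2}\Big)^{1/2}\Big\}n^{1/2} \\
&\quad + \xi_1(\theta;\alpha) - \xi_1((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) + \xi_2(\theta;\alpha) - \xi_2((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&\quad + \xi(\theta) - \xi((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)') + O_p(1) \\
&= \frac{(\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{1/2}}{2^{5/2}\sigma_{\epsilon,\alpha}(\kappa_{\eta,0}\sigma_{\eta,0}^2)^{3/2}} + \xi_1(\theta;\alpha) - \xi_1((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) \\
&\quad + \xi_2(\theta;\alpha) - \xi_2((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) + \xi(\theta) - \xi((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)') \\
&\quad + o\big((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{1/2}\big) + O_p(1) \\
&= \frac{(\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{1/2}}{2^{5/2}\sigma_{\epsilon,\alpha}(\kappa_{\eta,0}\sigma_{\eta,0}^2)^{3/2}} + o\big((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{1/2}\big) + O_p(1),
\end{aligned} \tag{5.151}$$

where the first equality follows from $|\sigma_\epsilon^2 - \sigma_{\epsilon,\alpha}^2| = O(n^{-1/2})$, the second equality follows from (5.41), and the last equality follows from

$$\xi_1(\theta;\alpha) - \xi_1((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) = o_p\big((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{1/2}\big) + O_p(1),$$

$$\xi_2(\theta;\alpha) - \xi_2((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)';\alpha) = o_p\big((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{1/2}\big) + O_p(1), \tag{5.152}$$

$$\xi(\theta) - \xi((\sigma_{\eta,0}^2, \kappa_{\eta,0}, \sigma_{\epsilon,\alpha}^2)') = o_p\big((\kappa_\eta\sigma_\eta^2 - \kappa_{\eta,0}\sigma_{\eta,0}^2)^2 n^{1/2}\big) + O_p(1), \tag{5.153}$$

which can be obtained in a way similar to (5.109)-(5.111). Thus, (5.150) is obtained. This completes the proof of (5.135). □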

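Corollary 6 Under the setup of Theorem 11, let

$$\theta_\alpha^{(3)} = \Big(\sigma_{\eta,0}^2,\ \kappa_{\eta,0},\ \sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2 + \sigma_{\epsilon,0}^2\Big)'. \tag{5.154}$$

Then for $\ell(\theta;\alpha)$ defined in (2.9),

$$\operatorname*{plim}_{n\to\infty} \frac{1}{n^\delta}\big(-2\ell(\hat\theta(\alpha);\alpha) + 2\ell(\theta_\alpha^{(3)};\alpha)\big) = 0; \quad \text{if } \delta \in (0,1), \tag{5.155}$$

$$-2\ell(\hat\theta(\alpha);\alpha) + 2\ell(\theta_\alpha^{(3)};\alpha) = O_p(1); \quad \text{if } \delta = 0. \tag{5.156}$$

In addition, for $L_{KL}(\theta;\alpha)$ defined in (3.3),

$$\operatorname*{plim}_{n\to\infty} L_{KL}(\hat\theta(\alpha);\alpha)\big/L_{KL}(\theta_\alpha^{(3)};\alpha) = 0; \quad \text{if } \delta \in [0,1). \tag{5.157}$$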

Note that from Theorem 11, $\operatorname*{plim}_{n\to\infty} \hat\theta(\alpha) = \theta_\alpha^{(3)}$ for $\delta \in (0,1)$, which immediately implies (5.155). On the other hand, (5.156) is somewhat surprising, because $\hat\theta(\alpha)$ generally does not converge to $\theta_\alpha^{(3)}$ for $\delta = 0$. However, selection consistency and asymptotic loss efficiency are possible for geostatistical model selection even if some covariance parameters cannot be consistently estimated under the fixed domain asymptotic framework (see Theorem 13).

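Theorem 12 Consider a class of models given by (3.1) with $x_j(s)$'s independently generated from white-noise processes of (5.7) and $\mathrm{cov}(\eta(s), \eta(s')) = \sigma_\eta^2\exp(-\kappa_\eta|s - s'|)$, where $A_c \ne \emptyset$ and $p$ is fixed. Suppose that $\sigma_\eta^2 > 0$, $\kappa_\eta > 0$ and $\sigma_\epsilon^2 > 0$ are known. In addition, suppose that the data are collected at $s_i = i n^{-(1-\delta)} \in [0, n^\delta]$; $i = 1, \ldots, n$ for some $\delta \in [0,1)$. If $\lambda \to \infty$ and $\lambda/n \to 0$, then

$$\frac{L_{KL}(\hat\alpha_{GIC(\lambda)})}{\min_{\alpha\in A} L_{KL}(\alpha)} \xrightarrow{p} 1, \quad \text{as } n \to \infty.$$

In addition,

$$\lim_{n\to\infty} P\big(\hat\alpha_{GIC(\lambda)} = \alpha_c\big) = 1.$$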

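Proof. By Corollary 2, it suffices to show that

$$\lim_{n\to\infty} \mathrm{tr}(\Sigma^{-1})/\lambda = \infty, \tag{5.158}$$

which follows from (5.31) and $\lambda = o(n)$. This completes the proof. □

Theorem 13 Under the setup of Theorem 12, suppose that $\theta = (\sigma_\eta^2, \kappa_\eta, \sigma_\epsilon^2)'$ is unknown, where $\Theta \subset (0,\infty)^3$ is a compact set such that $\theta_0 \in \Theta$. Let $\hat\theta(\alpha)$ be the ML estimate of $\theta$ based on model $\alpha$. For $\delta \in [0,1)$, if $\lambda \to \infty$ and $\lambda/n \to 0$, then

$$\lim_{n\to\infty} P\big(\hat\alpha_{GIC(\lambda)} = \alpha_c\big) = 1.$$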

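Proof. For consistency, it suffices to show that the conditions in Corollary 3 are satisfied with $\tau_n = n$. First, for $\delta \in [0,1)$,

$$\begin{aligned}
\mu' A(\alpha;\theta)'\Sigma^{-1}(\theta)A(\alpha;\theta)\mu &= \beta(\alpha_c\setminus\alpha)'X(\alpha_c\setminus\alpha)'\Sigma^{-1}(\theta)X(\alpha_c\setminus\alpha)\beta(\alpha_c\setminus\alpha) \\
&\quad - \beta(\alpha_c\setminus\alpha)'X(\alpha_c\setminus\alpha)'\Sigma^{-1}(\theta)M(\alpha;\theta)X(\alpha_c\setminus\alpha)\beta(\alpha_c\setminus\alpha) \\
&= \beta(\alpha_c\setminus\alpha)'X(\alpha_c\setminus\alpha)'\Sigma^{-1}(\theta)X(\alpha_c\setminus\alpha)\beta(\alpha_c\setminus\alpha) + O_p(1) \\
&= \sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2\,\mathrm{tr}(\Sigma^{-1}(\theta)) + o_p(n) \\
&= \sum_{j\in\alpha_c\setminus\alpha}\frac{\beta_j^2\sigma_j^2}{\sigma_\epsilon^2}\,n + o_p(n),
\end{aligned} \tag{5.159}$$

where the second equality is obtained in a way similar to (5.100), the third equality follows from

$$\beta(\alpha_c\setminus\alpha)'X(\alpha_c\setminus\alpha)'\Sigma^{-1}(\theta)X(\alpha_c\setminus\alpha)\beta(\alpha_c\setminus\alpha) = \sum_{j\in\alpha_c\setminus\alpha}\beta_j^2\sigma_j^2\,\mathrm{tr}(\Sigma^{-1}(\theta)) + o_p(n),$$

which can be obtained by (5.35), Chebyshev's inequality and the following moment condition:

$$\mathrm{var}\big(X_j'\Sigma^{-1}(\theta)X_{j'}\big) = \sigma_j^2\sigma_{j'}^2\,\mathrm{tr}(\Sigma^{-2}(\theta)) = O(n),$$

and the last equality follows from (5.31). Hence, (A.1') is satisfied. Second, by (5.31) and (5.35), we have for any $\theta \in \Theta$,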

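$$\operatorname*{plim}_{n\to\infty} \frac{1}{n}X'\Sigma^{-1}(\theta)X = D(\theta),$$

where $D(\theta)$ is a $p \times p$ diagonal matrix with diagonal entries $\sigma_j^2/\sigma_\epsilon^2$, $j = 1, \ldots, p$. Hence, (A.2') holds. Third, by (5.34) and (5.35), (A.3') holds. Fourth, by (5.155)-(5.157), (A.4) and (A.5) hold trivially for $\tau_n = n$ and $\theta_\alpha^{(3)}$ defined in (5.154). Fifth, for $\xi(\theta)$ defined in (5.53), by (5.147) and (5.153), we have for $\delta \in [0,1)$,

$$\operatorname*{plim}_{n\to\infty} \frac{1}{n}\big(\xi(\theta_0) - \xi(\theta_\alpha^{(3)})\big) = 0.$$

Hence, (4.12) holds. Last, since $\theta_\alpha^{(3)} = \theta_0$ for $\alpha \in A_c$, (4.14) holds trivially. Then, for $A_c \ne \emptyset$, $\lambda \to \infty$ and $\lambda = o(n)$, we have

$$\lim_{n\to\infty} P\big(\hat\alpha_{GIC(\lambda)} = \alpha_c\big) = 1,$$

which completes the proof. □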
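The limit $n^{-1}X'\Sigma^{-1}(\theta)X \to D(\theta)$ used for (A.2') can be illustrated by simulation. The following Python sketch is an illustration under stated assumptions: the white-noise regressors are taken to be independent Gaussian with standard deviations $\sigma_j$, the sampling sites are $s_i = i n^{-(1-\delta)}$, and all parameter values (including the function name `gram_over_n`) are hypothetical.

```python
import numpy as np


def gram_over_n(n, delta, sigma_j, sigma2_eta=1.0, kappa_eta=1.0,
                sigma2_eps=1.0, seed=0):
    """Return X' Sigma^{-1} X / n for simulated white-noise regressors."""
    rng = np.random.default_rng(seed)
    s = np.arange(1, n + 1) * n ** (-(1.0 - delta))       # sampling sites
    dist = np.abs(s[:, None] - s[None, :])
    Sigma = sigma2_eta * np.exp(-kappa_eta * dist) + sigma2_eps * np.eye(n)
    # Column j is Gaussian white noise with standard deviation sigma_j[j].
    X = rng.standard_normal((n, len(sigma_j))) * np.asarray(sigma_j)
    return X.T @ np.linalg.solve(Sigma, X) / n


sigma_j = [1.0, 2.0]
G = gram_over_n(1000, 0.0, sigma_j)
D = np.diag(np.square(sigma_j) / 1.0)   # diagonal entries sigma_j^2 / sigma_eps^2
print(np.round(G, 3))
print(D)
```

For moderate $n$ the diagonal of the simulated matrix sits slightly below that of $D(\theta)$, which is in line with the lower-order terms of the expansion, and the off-diagonal entries are close to zero.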

Comparing Theorems 7, 10 and 13, we see that GIC attains consistency most easily when the variables to be selected are generated from white-noise processes, and with the most difficulty when they are polynomials.

Chapter 6

Conditional Generalized Information Criterion

If we are interested in the asymptotic optimality of a selection procedure with respect to the loss (3.14), it is difficult to establish such properties directly from the GIC introduced above, so another criterion is needed. Vaida and Blanchard (2005) proposed a suitable criterion, named conditional AIC (CAIC), for the case where spatial process prediction is of interest. Here we also propose a conditional generalized information criterion (CGIC), which includes CAIC as a special case. In the following sections, we introduce the asymptotic theory of CGIC for geostatistical model selection problems.

6.1 Conditional Akaike’s Information Criterion

Consider the loss $L(\alpha)$ defined in (3.14) with the estimator $\hat S(\alpha)$ defined in (3.15). It is difficult to establish optimality with respect to $L(\alpha)$ directly from the criterion (4.6). Vaida and Blanchard (2005) proposed a conditional AIC (CAIC) selection procedure for linear mixed models, which is an unbiased estimator of $E(L(\alpha))$ shown in (3.17). They suggested that when the focus is on estimating the mean function, the AIC in (4.3) is a suitable selection procedure, whereas when the focus is on both mean function estimation and spatial process prediction, CAIC is more appropriate than AIC. That is, for $\alpha \in A$,

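$$\Gamma_{CAIC}(\alpha) = \|Z - \hat S(\alpha)\|^2 + 2\,\mathrm{tr}(H(\alpha))\,\sigma_\epsilon^2, \tag{6.1}$$

where $\hat S(\alpha) = H(\alpha)Z$ with $H(\alpha)$ defined in (3.16). Let

$$\hat\alpha_{CAIC} = \arg\min_{\alpha\in A}\Gamma_{CAIC}(\alpha). \tag{6.2}$$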
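The criterion (6.1)-(6.2) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the original work. It assumes that $H(\alpha) = \Sigma_\eta\Sigma^{-1} + \sigma_\epsilon^2\Sigma^{-1}M(\alpha)$ with the generalized least squares projection $M(\alpha) = X(\alpha)(X(\alpha)'\Sigma^{-1}X(\alpha))^{-1}X(\alpha)'\Sigma^{-1}$, that the covariance parameters are known, and that the simulated data (coefficient 5 on the first regressor, exponential covariance on $[0,1]$) are hypothetical. The function names are introduced for illustration.

```python
import numpy as np


def hat_matrix(X, Sigma_eta, sigma2_eps):
    """H(alpha) = Sigma_eta Sigma^{-1} + sigma_eps^2 Sigma^{-1} M(alpha) (assumed form)."""
    n = Sigma_eta.shape[0]
    Sigma_inv = np.linalg.inv(Sigma_eta + sigma2_eps * np.eye(n))
    # Assumed GLS projection M(alpha) = X (X' Sigma^{-1} X)^{-1} X' Sigma^{-1}.
    M = X @ np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv)
    return Sigma_eta @ Sigma_inv + sigma2_eps * Sigma_inv @ M


def caic(z, X, Sigma_eta, sigma2_eps):
    """Gamma_CAIC(alpha) = ||Z - H Z||^2 + 2 tr(H) sigma_eps^2, as in (6.1)."""
    H = hat_matrix(X, Sigma_eta, sigma2_eps)
    resid = z - H @ z
    return float(resid @ resid + 2.0 * sigma2_eps * np.trace(H))


def demo_scores(n=200, seed=0):
    """CAIC values for three candidate models on simulated data."""
    rng = np.random.default_rng(seed)
    s = np.arange(1, n + 1) / n                      # sites for delta = 0
    Sigma_eta = np.exp(-np.abs(s[:, None] - s[None, :]))
    sigma2_eps = 1.0
    X_full = rng.standard_normal((n, 2))             # white-noise regressors
    eta = np.linalg.cholesky(Sigma_eta) @ rng.standard_normal(n)
    z = 5.0 * X_full[:, 0] + eta + rng.standard_normal(n)
    candidates = [(0,), (1,), (0, 1)]                # column index sets
    return {a: caic(z, X_full[:, list(a)], Sigma_eta, sigma2_eps)
            for a in candidates}


scores = demo_scores()
alpha_hat = min(scores, key=scores.get)              # minimizer, as in (6.2)
print(alpha_hat, scores)
```

In this toy setting the model that omits the active regressor receives a much larger criterion value than the models that contain it, because its residual sum of squares absorbs the missing mean component.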

Then we have the following theorem.

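Theorem 14 Consider a class of models given by (3.1). Suppose that

$$\lim_{n\to\infty}\sum_{\alpha\in A}\frac{1}{E(L(\alpha))} = 0, \tag{6.3}$$

where $L(\alpha)$ is defined in (3.14). Then the criterion $\Gamma_{CAIC}(\alpha)$ defined in (6.1) is asymptotically loss efficient:

$$\operatorname*{plim}_{n\to\infty} L(\hat\alpha_{CAIC})\Big/\inf_{\alpha\in A}L(\alpha) = 1.$$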

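Proof. We first expand the CAIC defined in (6.1):

$$\begin{aligned}
\Gamma_{CAIC}(\alpha) &= (Z - \hat S(\alpha))'(Z - \hat S(\alpha)) + 2\sigma_\epsilon^2\,\mathrm{tr}(H(\alpha)) \\
&= (S - \hat S(\alpha) + \epsilon)'(S - \hat S(\alpha) + \epsilon) + 2\sigma_\epsilon^2\,\mathrm{tr}(H(\alpha)) \\
&= L(\alpha) + 2\epsilon'(S - \hat S(\alpha)) + \epsilon'\epsilon + 2\sigma_\epsilon^2\,\mathrm{tr}(H(\alpha)) \\
&= L(\alpha) + 2\epsilon'(I - H(\alpha))S - 2\epsilon'H(\alpha)\epsilon + \epsilon'\epsilon + 2\sigma_\epsilon^2\,\mathrm{tr}(H(\alpha)) \\
&= L(\alpha) + 2\sigma_\epsilon^2\epsilon'(\Sigma^{-1}A(\alpha))S - 2\epsilon'H(\alpha)\epsilon + \epsilon'\epsilon + 2\sigma_\epsilon^2\,\mathrm{tr}(H(\alpha)) \\
&= L(\alpha) + 2\sigma_\epsilon^2\epsilon'(\Sigma^{-1}A(\alpha))\mu + 2\sigma_\epsilon^2\epsilon'(\Sigma^{-1}A(\alpha))\eta + \epsilon'\epsilon - 2\big(\epsilon'H(\alpha)\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(H(\alpha))\big),
\end{aligned} \tag{6.4}$$

where the third equality follows from (3.14) and the second-to-last equality follows from $I - H(\alpha) = A(\alpha) - \Sigma_\eta\Sigma^{-1}A(\alpha) = \sigma_\epsilon^2\Sigma^{-1}A(\alpha)$.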
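The expansion (6.4) is a purely algebraic identity, so it can be verified numerically on arbitrary inputs. The sketch below is illustrative and rests on assumptions not stated in this section: $A(\alpha) = I - M(\alpha)$ with the generalized least squares projection $M(\alpha)$, and $H(\alpha) = \Sigma_\eta\Sigma^{-1} + \sigma_\epsilon^2\Sigma^{-1}M(\alpha)$. All parameter values and the function name `caic_decomposition` are hypothetical.

```python
import numpy as np


def caic_decomposition(n=60, sigma2_eps=0.5, seed=0):
    """Return (lhs, rhs, err): both sides of (6.4) and the error in I - H = s2 * Sigma^{-1} A."""
    rng = np.random.default_rng(seed)
    s = np.arange(1, n + 1) / n
    Sigma_eta = np.exp(-np.abs(s[:, None] - s[None, :]))
    Sigma_inv = np.linalg.inv(Sigma_eta + sigma2_eps * np.eye(n))

    X_full = rng.standard_normal((n, 2))
    mu = X_full @ np.array([1.0, -0.5])        # true mean uses both regressors
    X = X_full[:, :1]                          # candidate model omits the second one
    M = X @ np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv)
    A = np.eye(n) - M                          # assumed A(alpha) = I - M(alpha)
    H = Sigma_eta @ Sigma_inv + sigma2_eps * Sigma_inv @ M

    eta = np.linalg.cholesky(Sigma_eta) @ rng.standard_normal(n)
    eps = np.sqrt(sigma2_eps) * rng.standard_normal(n)
    S = mu + eta
    Z = S + eps
    S_hat = H @ Z

    lhs = np.sum((Z - S_hat) ** 2) + 2.0 * sigma2_eps * np.trace(H)   # (6.1)
    L = np.sum((S - S_hat) ** 2)                                      # loss L(alpha)
    rhs = (L
           + 2.0 * sigma2_eps * eps @ Sigma_inv @ A @ mu
           + 2.0 * sigma2_eps * eps @ Sigma_inv @ A @ eta
           + eps @ eps
           - 2.0 * (eps @ H @ eps - sigma2_eps * np.trace(H)))
    err = np.max(np.abs((np.eye(n) - H) - sigma2_eps * Sigma_inv @ A))
    return lhs, rhs, err


lhs, rhs, err = caic_decomposition()
print(lhs, rhs, err)
```

The two sides agree up to floating-point error, and the same run confirms the identity $I - H(\alpha) = \sigma_\epsilon^2\Sigma^{-1}A(\alpha)$ under the assumed forms of $H(\alpha)$ and $A(\alpha)$.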

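It remains to show that for $\alpha \in A$,

$$\Gamma_{CAIC}(\alpha) = \epsilon'\epsilon + L(\alpha) + o_p(L(\alpha)), \tag{6.5}$$

for which it suffices to show that

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\frac{|\epsilon'(\Sigma^{-1}A(\alpha))\mu|}{E(L(\alpha))} = 0, \tag{6.6}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\frac{|\epsilon'(\Sigma^{-1}A(\alpha))\eta|}{E(L(\alpha))} = 0, \tag{6.7}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\frac{|\epsilon'H(\alpha)\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(H(\alpha))|}{E(L(\alpha))} = 0, \tag{6.8}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\Big|\frac{L(\alpha)}{E(L(\alpha))} - 1\Big| = 0. \tag{6.9}$$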

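Hence, by (6.5), for $\hat\alpha_{CAIC}$ defined in (6.2) and $\alpha_L = \arg\min_{\alpha\in A}L(\alpha)$, we conclude that

$$\Gamma_{CAIC}(\hat\alpha_{CAIC}) = \epsilon'\epsilon + L(\hat\alpha_{CAIC}) + o_p(L(\hat\alpha_{CAIC})), \qquad \Gamma_{CAIC}(\alpha_L) = \epsilon'\epsilon + L(\alpha_L) + o_p(L(\alpha_L)).$$

It follows that

$$0 \le \frac{\Gamma_{CAIC}(\alpha_L) - \Gamma_{CAIC}(\hat\alpha_{CAIC})}{L(\hat\alpha_{CAIC})} = \frac{L(\alpha_L) - L(\hat\alpha_{CAIC})}{L(\hat\alpha_{CAIC})} + o_p(1),$$

and then

$$\operatorname*{plim}_{n\to\infty}\frac{L(\alpha_L) - L(\hat\alpha_{CAIC})}{L(\hat\alpha_{CAIC})} = 0,$$

which gives

$$\operatorname*{plim}_{n\to\infty} L(\hat\alpha_{CAIC})\Big/\inf_{\alpha\in A}L(\alpha) = 1.$$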

We now prove (6.6)-(6.9) one by one. First, for any $\varepsilon > 0$,

which gives (6.6), where the second-to-last inequality follows from $\sigma_\epsilon^4\mu'A(\alpha)'\Sigma^{-2}A(\alpha)\mu \le E(L(\alpha))$ by (3.17), and the last equality follows from (6.3).

Second, for any $\varepsilon > 0$,

where the third inequality follows from $\sigma_\epsilon^2\Sigma^{-1} \le I$, the second-to-last inequality follows from

$$\sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}A(\alpha)\Sigma_\eta) \le \sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1}) \le E(L(\alpha)),$$

by (3.17), and the last equality follows from (6.3).

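Third, for any $\varepsilon > 0$,

where the second inequality is an application of Theorem 2 of Whittle (1960) for some $c_1 > 0$, and the third equality follows from

$$\begin{aligned}
\mathrm{tr}(H(\alpha)H(\alpha)') &= \mathrm{tr}\big((\Sigma_\eta\Sigma^{-1} + \sigma_\epsilon^2\Sigma^{-1}M(\alpha))(\Sigma_\eta\Sigma^{-1} + \sigma_\epsilon^2\Sigma^{-1}M(\alpha))'\big) \\
&= \mathrm{tr}\big(\Sigma_\eta\Sigma^{-2}\Sigma_\eta + \sigma_\epsilon^2\Sigma_\eta\Sigma^{-1}M(\alpha)'\Sigma^{-1} + \sigma_\epsilon^2\Sigma^{-1}M(\alpha)\Sigma^{-1}\Sigma_\eta + \sigma_\epsilon^4 M(\alpha)'\Sigma^{-2}M(\alpha)\big) \\
&\le \mathrm{tr}(\Sigma_\eta\Sigma^{-1}) + 3\sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}M(\alpha)) \\
&\le 3\sigma_\epsilon^{-2}E(L(\alpha)),
\end{aligned}$$

by

$$\mathrm{tr}(\Sigma_\eta\Sigma^{-2}\Sigma_\eta) = \mathrm{tr}(\Sigma_\eta\Sigma^{-1} - \sigma_\epsilon^2\Sigma^{-2}\Sigma_\eta) \le \mathrm{tr}(\Sigma_\eta\Sigma^{-1}),$$

$$\mathrm{tr}(\sigma_\epsilon^2\Sigma_\eta\Sigma^{-1}M(\alpha)'\Sigma^{-1}) = \sigma_\epsilon^2\,\mathrm{tr}\big(M(\alpha)'\Sigma^{-1} - \sigma_\epsilon^2\Sigma^{-1}M(\alpha)'\Sigma^{-1}\big) \le \sigma_\epsilon^2\,\mathrm{tr}(M(\alpha)'\Sigma^{-1}),$$

and

$$\sigma_\epsilon^4\,\mathrm{tr}(M(\alpha)'\Sigma^{-2}M(\alpha)) \le \sigma_\epsilon^2\,\mathrm{tr}(M(\alpha)'\Sigma^{-1}M(\alpha)) = \sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}M(\alpha)).$$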

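Last, it remains to show (6.9). We first expand $L(\alpha)$ defined in (3.14):

$$\begin{aligned}
L(\alpha) &= (S - \hat S(\alpha))'(S - \hat S(\alpha)) \\
&= \|(I - H(\alpha))\mu + (\eta - \Sigma_\eta\Sigma^{-1}(\eta + \epsilon)) - \sigma_\epsilon^2\Sigma^{-1}M(\alpha)(\eta + \epsilon)\|^2 \\
&= \|\sigma_\epsilon^2\Sigma^{-1}A(\alpha)\mu + (\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon) - \sigma_\epsilon^2\Sigma^{-1}M(\alpha)(\eta + \epsilon)\|^2 \\
&= \sigma_\epsilon^4\mu'A(\alpha)'\Sigma^{-2}A(\alpha)\mu + \|\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon\|^2 - 2\sigma_\epsilon^4\mu'A(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) \\
&\quad + \sigma_\epsilon^4(\eta + \epsilon)'M(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) + 2\sigma_\epsilon^2\mu'A(\alpha)'\Sigma^{-1}(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon) \\
&\quad - 2\sigma_\epsilon^2(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon)'\Sigma^{-1}M(\alpha)(\eta + \epsilon).
\end{aligned} \tag{6.10}$$

Together with (3.17), it follows that

$$\begin{aligned}
L(\alpha) - E(L(\alpha)) &= \|\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon\|^2 - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1}) \\
&\quad + \sigma_\epsilon^4(\eta + \epsilon)'M(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) - \sigma_\epsilon^4\,\mathrm{tr}(\Sigma^{-1}M(\alpha)) \\
&\quad + 2\sigma_\epsilon^2\mu'A(\alpha)'\Sigma^{-1}(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon) - 2\sigma_\epsilon^4\mu'A(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) \\
&\quad - 2\sigma_\epsilon^2(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon)'\Sigma^{-1}M(\alpha)(\eta + \epsilon).
\end{aligned}$$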

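Then, to show (6.9), it suffices to show that

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\frac{\big|\|\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon\|^2 - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1})\big|}{E(L(\alpha))} = 0, \tag{6.11}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\frac{|(\eta + \epsilon)'M(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) - \mathrm{tr}(\Sigma^{-1}M(\alpha))|}{E(L(\alpha))} = 0, \tag{6.12}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\frac{|\mu'A(\alpha)'\Sigma^{-1}(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon)|}{E(L(\alpha))} = 0, \tag{6.13}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\frac{|\mu'A(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon)|}{E(L(\alpha))} = 0, \tag{6.14}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A}\frac{|(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon)'\Sigma^{-1}M(\alpha)(\eta + \epsilon)|}{E(L(\alpha))} = 0. \tag{6.15}$$

We now prove (6.11)-(6.15) one by one.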

Hence, to show (6.11), it suffices to show plim

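First, (6.16) can be established by Theorem 2 of Whittle (1960). That is, for any $\varepsilon > 0$,

for some $c_2 > 0$, where the third inequality follows from

$$\mathrm{tr}(\Sigma_\eta\Sigma^{-2}\Sigma_\eta\Sigma^{-2}) \le \sigma_\epsilon^{-4}\,\mathrm{tr}(\Sigma_\eta^2\Sigma^{-2}) \le \sigma_\epsilon^{-4}\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1}), \tag{6.19}$$

by $\sigma_\epsilon^2\Sigma^{-1} \le I$ and $\Sigma_\eta^{1/2}\Sigma^{-1}\Sigma_\eta^{1/2} \le I$, which holds since $\Sigma_\eta \le \Sigma$, and the last equality follows from (4.5).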

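Second, (6.17) is also established by Theorem 2 of Whittle (1960). That is, for any $\varepsilon > 0$,

$\lim_{n\to\infty} P$

for some $c_3 > 0$, where the third inequality follows from

$$\begin{aligned}
\sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}\Sigma_\eta\Sigma^{-1}\Sigma_\eta\Sigma^{-1}\Sigma_\eta\Sigma^{-1}) &\le \sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}\Sigma_\eta\Sigma^{-1}\Sigma_\eta\Sigma^{-1}) \\
&\le \sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}\Sigma_\eta\Sigma^{-1}) \\
&\le \mathrm{tr}(\Sigma_\eta\Sigma^{-1}),
\end{aligned}$$

by $\sigma_\epsilon^2\Sigma^{-1} \le I$ and $\Sigma^{-1/2}\Sigma_\eta\Sigma^{-1/2} \le I$, which holds since $\Sigma_\eta < \Sigma$, and the last equality follows from (6.3). This gives (6.11).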

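Equation (6.12) can be established by Theorem 2 of Whittle (1960). That is, for any $\varepsilon > 0$,

$\lim_{n\to\infty} P$

for some $c_4 > 0$, where the third inequality follows from

$$\begin{aligned}
\mathrm{tr}(\Sigma M(\alpha)'\Sigma^{-2}M(\alpha)\Sigma M(\alpha)'\Sigma^{-2}M(\alpha)) &= \mathrm{tr}(M(\alpha)\Sigma^{-1}M(\alpha)\Sigma^{-1}) \\
&\le \mathrm{tr}(\Sigma^{-1}M(\alpha)\Sigma^{-1}) \\
&\le \sigma_\epsilon^{-2}\,\mathrm{tr}(\Sigma^{-1}M(\alpha)),
\end{aligned}$$

by $M(\alpha)\Sigma M(\alpha)'\Sigma^{-1} = M(\alpha)$, $\Sigma^{-1}M(\alpha) \le \Sigma^{-1}$ and $\sigma_\epsilon^2\Sigma^{-1} \le I$, and the last equality follows from (6.3).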

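For (6.13), we have

$$\begin{aligned}
|\mu'A(\alpha)'\Sigma^{-1}(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon)| &= |\sigma_\epsilon^2\mu'A(\alpha)'\Sigma^{-2}\eta - \mu'A(\alpha)'\Sigma^{-1}\Sigma_\eta\Sigma^{-1}\epsilon| \\
&\le |\sigma_\epsilon^2\mu'A(\alpha)'\Sigma^{-2}\eta| + |\mu'A(\alpha)'\Sigma^{-1}\Sigma_\eta\Sigma^{-1}\epsilon|.
\end{aligned}$$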

Hence, to show (6.13), it suffices to show that plim

First, (6.20) can be shown similarly to (6.6). That is, for any $\varepsilon > 0$,

n→∞lim P where the third inequality follows from

µ0A(α)0Σ−2ΣηΣ−2A(α)µ ≤ µ0A(α)0Σ−3A(α)µ follows from (6.3). It then gives (6.13).

For (6.14), we have for any ε > 0, follows from (6.3). It then gives (6.14).

For (6.15), we have

Then, to show (6.15), it suffices to show that plim

We now show (6.22)-(6.25) one by one. First, (6.22) can be established by Theorem 2 of Whittle (1960). That is, for any $\varepsilon > 0$,

for some $c_6 > 0$, where the third inequality follows from

$$\Sigma^{-1}M(\alpha)\Sigma_\eta M(\alpha)'\Sigma^{-1} \le \Sigma^{-1}M(\alpha)\Sigma M(\alpha)'\Sigma^{-1} = \Sigma^{-1}M(\alpha),$$

the fourth inequality follows from $\sigma_\epsilon^2\Sigma^{-1}\Sigma_\eta\Sigma^{-1} \le I$, which holds since $\Sigma_\eta \le \Sigma$ and $\sigma_\epsilon^2\Sigma^{-1} \le I$, and the last equality follows from (6.3). Second, (6.23) is similar to (6.18). That is,

$\lim_{n\to\infty} P$

equality follows from (6.3). Third, (6.24) is similar to (6.23). That is, for any $\varepsilon > 0$,

where the third inequality follows from

$$\Sigma^{-1}M(\alpha)\Sigma_\eta M(\alpha)'\Sigma^{-1} \le \Sigma^{-1}M(\alpha)\Sigma M(\alpha)'\Sigma^{-1} = \Sigma^{-1}M(\alpha),$$

the fourth inequality follows from $\Sigma_\eta\Sigma^{-2}\Sigma_\eta \le I$, which holds since $\Sigma_\eta^2 \le \Sigma^2$, and the last equality follows from (6.3). Last, (6.25) can be established by Theorem 2 of Whittle (1960). That is, for any $\varepsilon > 0$,

for some $c_7 > 0$, where the third inequality follows from

$$\sigma_\epsilon^2\Sigma^{-1}M(\alpha)M(\alpha)'\Sigma^{-1} \le \Sigma^{-1}M(\alpha)\Sigma M(\alpha)'\Sigma^{-1} = \Sigma^{-1}M(\alpha),$$

the fourth inequality follows from $\Sigma_\eta\Sigma^{-2}\Sigma_\eta \le I$, which holds since $\Sigma_\eta^2 \le \Sigma^2$, and the last equality follows from (6.3). This proves (6.9) and completes the proof. □

Note that (6.3) holds in general. Here, we consider an example where (6.3) is satisfied.

Corollary 7 Consider a class of models given by (3.1) with $p$ fixed and arbitrary explanatory variables. Suppose that the data are collected at $s_i = i n^{-(1-\delta)} \in [0, n^\delta]$; $i = 1, \ldots, n$ for some $\delta \in [0,1)$. Consider the exponential covariance model of (5.1) for $\eta(\cdot)$. Let $\hat\alpha_{CAIC}$ be the model selected by CAIC as defined in (6.2). Then,

$$\operatorname*{plim}_{n\to\infty} L(\hat\alpha_{CAIC})\Big/\inf_{\alpha\in A}L(\alpha) = 1.$$

Further, if $A_c \ne \emptyset$, then for any model selection procedure $\hat\alpha$ such that $\lim_{n\to\infty}P(\hat\alpha \in A_c) = 1$,

$$\operatorname*{plim}_{n\to\infty} L(\hat\alpha)\Big/\inf_{\alpha\in A}L(\alpha) = 1.$$

It is shown in (3.17) that, for $\alpha \in A$, $E(L(\alpha))$ is bounded below by $\sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1})$, which is often the dominant term of $E(L(\alpha))$. In addition, for $\alpha \in A_c$, $\sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1})$ is the dominant term of $E(L(\alpha))$. This suggests that whichever correct model is selected, asymptotic loss efficiency is attained. Further, in the following example, $E(L(\alpha))$ is dominated by $\sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1})$ for every $\alpha \in A$. In that case, every candidate model achieves asymptotic loss efficiency.

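Corollary 8 Consider a class of models given by (3.1) with $x_j(s) = (s n^{-\delta})^j$; $j = 1, \ldots, p$, and $\mathrm{cov}(\eta(s), \eta(s')) = \sigma_\eta^2\exp(-\kappa_\eta|s - s'|)$, where $p$ is fixed and $A_c \ne \emptyset$. Suppose that the data are collected at $s_i = i n^{-(1-\delta)} \in [0, n^\delta]$; $i = 1, \ldots, n$ for some $\delta \in [0,1)$. Let $\hat\alpha_{CAIC}$ be the model selected by CAIC as defined in (6.2). Then

$$\operatorname*{plim}_{n\to\infty} L(\hat\alpha_{CAIC})\Big/\inf_{\alpha\in A}L(\alpha) = 1.$$

Further, for any model selection procedure $\hat\alpha$,

$$\operatorname*{plim}_{n\to\infty} L(\hat\alpha)\Big/\inf_{\alpha\in A}L(\alpha) = 1.$$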

Corollaries 7 and 8 suggest that variable selection is unnecessary for the asymptotic loss efficiency of $L(\alpha)$ in those cases. We therefore consider the strong asymptotic loss efficiency of $L(\alpha)$ defined in (3.21).

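Theorem 15 Consider a class of models given by (3.1) and the universal kriging predictor $\hat S(\alpha)$ of $S$ defined in (3.15). Suppose

$$\lim_{n\to\infty}\sum_{\alpha\in A\setminus A_c}\frac{1}{E(L(\alpha)) - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1})} = 0, \tag{6.26}$$

where $L(\alpha)$ is defined in (3.14). If $|A_c| \le 1$ and $\alpha_c$ is fixed, then $\hat\alpha_{CAIC}$ of (6.2) is strongly asymptotically loss efficient:

$$\operatorname*{plim}_{n\to\infty}\frac{L(\hat\alpha_{CAIC}) - \|S - E(S|Z)\|^2}{\inf_{\alpha\in A}L(\alpha) - \|S - E(S|Z)\|^2} = 1.$$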

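Proof. We first suppose that $A_c = \emptyset$. Starting from (6.4), we expand the CAIC defined in (6.1) as

$$\begin{aligned}
\Gamma_{CAIC}(\alpha) &= L(\alpha) + 2\sigma_\epsilon^2\epsilon'(\Sigma^{-1}A(\alpha))\mu + 2\sigma_\epsilon^2\epsilon'(\Sigma^{-1}A(\alpha))\eta + \epsilon'\epsilon - 2\big(\epsilon'H(\alpha)\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(H(\alpha))\big) \\
&= L(\alpha) + 2\sigma_\epsilon^2\epsilon'(\Sigma^{-1}A(\alpha))\mu + 2\sigma_\epsilon^2\epsilon'\Sigma^{-1}\eta - 2\sigma_\epsilon^2\epsilon'\Sigma^{-1}M(\alpha)\eta + \epsilon'\epsilon \\
&\quad - 2\big(\epsilon'\Sigma_\eta\Sigma^{-1}\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1})\big) - 2\sigma_\epsilon^2\big(\epsilon'\Sigma^{-1}M(\alpha)\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}M(\alpha))\big),
\end{aligned} \tag{6.27}$$

where the last equality follows from $H(\alpha) = \Sigma_\eta\Sigma^{-1} + \sigma_\epsilon^2\Sigma^{-1}M(\alpha)$ by (3.16). Note that $2\sigma_\epsilon^2\epsilon'\Sigma^{-1}\eta + \epsilon'\epsilon - 2\big(\epsilon'\Sigma_\eta\Sigma^{-1}\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1})\big)$ does not depend on $\alpha$ and is therefore constant in variable selection. It remains to show that for $\alpha \in A \setminus A_c$,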

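$$\Gamma_{CAIC}(\alpha) = \text{constant} + \tilde L(\alpha) + o_p(\tilde L(\alpha)), \tag{6.28}$$

where $\tilde L(\alpha) = L(\alpha) - \|S - E(S|Z)\|^2$, for which it suffices to show that

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A\setminus A_c}\frac{|\epsilon'(\Sigma^{-1}A(\alpha))\mu|}{E(\tilde L(\alpha))} = 0, \tag{6.29}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A\setminus A_c}\frac{|\epsilon'(\Sigma^{-1}M(\alpha))\eta|}{E(\tilde L(\alpha))} = 0, \tag{6.30}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A\setminus A_c}\frac{|\epsilon'\Sigma^{-1}M(\alpha)\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}M(\alpha))|}{E(\tilde L(\alpha))} = 0, \tag{6.31}$$

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A\setminus A_c}\Big|\frac{\tilde L(\alpha)}{E(\tilde L(\alpha))} - 1\Big| = 0. \tag{6.32}$$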

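Hence, by (6.28), for $\hat\alpha_{CAIC}$ defined in (6.2) and $\alpha_L = \arg\min_{\alpha\in A}L(\alpha)$, we conclude that

$$\Gamma_{CAIC}(\hat\alpha_{CAIC}) = \text{constant} + \tilde L(\hat\alpha_{CAIC}) + o_p(\tilde L(\hat\alpha_{CAIC})), \qquad \Gamma_{CAIC}(\alpha_L) = \text{constant} + \tilde L(\alpha_L) + o_p(\tilde L(\alpha_L)).$$

It follows that

$$0 \le \frac{\Gamma_{CAIC}(\alpha_L) - \Gamma_{CAIC}(\hat\alpha_{CAIC})}{\tilde L(\hat\alpha_{CAIC})} = \frac{\tilde L(\alpha_L) - \tilde L(\hat\alpha_{CAIC})}{\tilde L(\hat\alpha_{CAIC})} + o_p(1),$$

and then

$$\operatorname*{plim}_{n\to\infty}\frac{\tilde L(\alpha_L) - \tilde L(\hat\alpha_{CAIC})}{\tilde L(\hat\alpha_{CAIC})} = 0,$$

which gives

$$\operatorname*{plim}_{n\to\infty}\tilde L(\hat\alpha_{CAIC})\Big/\inf_{\alpha\in A}\tilde L(\alpha) = 1 \quad \text{when } A_c = \emptyset.$$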

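We first calculate $E(\tilde L(\alpha))$. By (9.1), we have

$$\begin{aligned}
E(\tilde L(\alpha)) &= E(L(\alpha)) - E\|S - E(S|Z)\|^2 \\
&= E(L(\alpha)) - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma_\eta\Sigma^{-1}) \\
&= \sigma_\epsilon^4\mu'A(\alpha)'\Sigma^{-2}A(\alpha)\mu + \sigma_\epsilon^4\,\mathrm{tr}(\Sigma^{-1}M(\alpha)),
\end{aligned} \tag{6.33}$$

by (3.17). We now prove (6.29)-(6.32) one by one. For (6.29), the proof follows that of (6.6), with $E(L(\alpha))$ replaced by (6.33).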

For (6.30), we have for any $\varepsilon > 0$,

where the third inequality follows from $\Sigma^{-1/2}\Sigma_\eta\Sigma^{-1/2} \le I$, the second-to-last inequality follows from (6.33), and the last equality follows from (6.26).

For (6.31), we have for any $\varepsilon > 0$,

where the second inequality is an application of Theorem 2 of Whittle (1960) for some $c_1 > 0$, the third and fourth inequalities follow from

$$\sigma_\epsilon^4\,\mathrm{tr}(M(\alpha)'\Sigma^{-2}M(\alpha)) \le \sigma_\epsilon^2\,\mathrm{tr}(M(\alpha)'\Sigma^{-1}M(\alpha)) = \sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}M(\alpha)) \le \sigma_\epsilon^{-2}E(\tilde L(\alpha)),$$

and the last equality follows from (6.26).

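It remains to show (6.32). We first expand $\tilde L(\alpha)$ using (6.10):

$$\begin{aligned}
\tilde L(\alpha) &= L(\alpha) - \|S - E(S|Z)\|^2 \\
&= L(\alpha) - \|\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon\|^2 \\
&= \sigma_\epsilon^4\mu'A(\alpha)'\Sigma^{-2}A(\alpha)\mu + \sigma_\epsilon^4(\eta + \epsilon)'M(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) \\
&\quad + 2\sigma_\epsilon^2\mu'A(\alpha)'\Sigma^{-1}(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon) - 2\sigma_\epsilon^4\mu'A(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) \\
&\quad - 2\sigma_\epsilon^2(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon)'\Sigma^{-1}M(\alpha)(\eta + \epsilon),
\end{aligned} \tag{6.34}$$

where the second equality follows from (9.1). Together with (3.17), it follows that

$$\begin{aligned}
\tilde L(\alpha) - E(\tilde L(\alpha)) &= \sigma_\epsilon^4(\eta + \epsilon)'M(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) - \sigma_\epsilon^4\,\mathrm{tr}(\Sigma^{-1}M(\alpha)) \\
&\quad + 2\sigma_\epsilon^2\mu'A(\alpha)'\Sigma^{-1}(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon) - 2\sigma_\epsilon^4\mu'A(\alpha)'\Sigma^{-2}M(\alpha)(\eta + \epsilon) \\
&\quad - 2\sigma_\epsilon^2(\sigma_\epsilon^2\Sigma^{-1}\eta - \Sigma_\eta\Sigma^{-1}\epsilon)'\Sigma^{-1}M(\alpha)(\eta + \epsilon).
\end{aligned}$$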

Equation (6.32) can then be followed by plim

Note that the proofs of (6.35)-(6.38) follow from the proofs of (6.12)-(6.15) with $E(L(\alpha))$ replaced by $E(\tilde L(\alpha))$. Hence, (6.32) follows, which completes the proof when $A_c = \emptyset$.

Now, we suppose that $A_c = \{\alpha_c\}$. To show that the CAIC is still asymptotically loss efficient, it remains to show that for fixed $\alpha_c$,

$$\tilde L(\alpha_c) = o_p(\tilde L(\alpha)); \quad \text{if } \alpha \in A \setminus A_c, \tag{6.39}$$

$$\Gamma_{CAIC}(\alpha_c) = \text{constant} + \tilde L(\alpha_c) + o_p(\tilde L(\alpha)). \tag{6.40}$$

Hence, by (6.39) and (6.40), we conclude that

n→∞lim P¡

Now, we start to prove (6.39). Equation (6.39) can be followed by (6.32) and plim

Equations (6.41) can then be followed by plim

Note that (6.42) follows similarly from the proof of (6.35), (6.43) is trivial since $\sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}M(\alpha_c)) \le p(\alpha_c) < \infty$, and (6.44) follows similarly from the proof of (6.38). This gives (6.39).

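We now prove (6.40). By (6.27), we have

$$\Gamma_{CAIC}(\alpha_c) = \text{constant} + \tilde L(\alpha_c) - 2\sigma_\epsilon^2\epsilon'\Sigma^{-1}M(\alpha_c)\eta - 2\sigma_\epsilon^2\big(\epsilon'\Sigma^{-1}M(\alpha_c)\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}M(\alpha_c))\big).$$

Equation (6.40) then follows from

$$\operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A\setminus A_c}\frac{|\epsilon'\Sigma^{-1}M(\alpha_c)\eta|}{E(\tilde L(\alpha))} = 0, \qquad \operatorname*{plim}_{n\to\infty}\sup_{\alpha\in A\setminus A_c}\frac{|\epsilon'\Sigma^{-1}M(\alpha_c)\epsilon - \sigma_\epsilon^2\,\mathrm{tr}(\Sigma^{-1}M(\alpha_c))|}{E(\tilde L(\alpha))} = 0,$$

which follow easily from (6.30) and (6.31). This gives (6.40) and completes the proof. □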

The following corollary gives an example for Theorem 15.

Corollary 9 Consider a class of models given by (3.1) with $x_j(s)$'s independently generated from white-noise processes of (5.7), where $p$ is fixed and $A_c = \{\alpha_c\}$. If $\lim_{n\to\infty}\mathrm{tr}(\Sigma^{-2}) = \infty$, then

$$\operatorname*{plim}_{n\to\infty}\frac{L(\hat\alpha_{CAIC}) - \|S - E(S|Z)\|^2}{\inf_{\alpha\in A}L(\alpha) - \|S - E(S|Z)\|^2} = 1.$$

A model with the smallest value of $L(\alpha)$ may not exist when $|A_c| \ge 2$. If $A_c$ contains at least two correct models of fixed dimension, no asymptotic optimality holds at the level of loss comparison. We are then interested in whether the model selection procedure still has some optimality in terms of $E(L(\alpha))$ when $|A_c| \ge 2$.

Hence, a much heavier penalty on model dimension is needed to select $\alpha_c$ among $A_c$.
