Large Sample Theory

Homework 5: Maximum Likelihood Estimate, Testing, Asymptotic Distribution

Due Date: January 12th

1. Consider the classical Gaussian linear model $Y_i = \mu_i + \varepsilon_i$, $1 \le i \le n$, where $\mu_i = z_i^T \beta$ and the $\varepsilon_i$ are i.i.d. Gaussian with mean 0 and variance $\sigma^2$. Here the $z_i$ are $d$-dimensional vectors of covariate values. Suppose that the covariates are ranked in order of importance, so that the first covariate is the most important, the second is the next most important, and so on.

To entertain the possibility that the last $d - p$ covariates do not matter, set $\beta_{p+1} = \cdots = \beta_d = 0$. Let $\hat\beta^{(p)}$ be the least-squares estimate under the constraint $\beta_{p+1} = \cdots = \beta_d = 0$, and let $\hat Y_i^{(p)}$ be the corresponding fitted value.

In this fashion, we end up with $d$ possible regression models. The problem is which one to use. A natural approach is to imagine obtaining new values $Y_1^*, \ldots, Y_n^*$ at $z_1, \ldots, z_n$ and to evaluate the performance of $\hat Y_1^{(p)}, \ldots, \hat Y_n^{(p)}$ as estimates of $Y_1^*, \ldots, Y_n^*$, and hence of the model with $\beta_{p+1} = \cdots = \beta_d = 0$, by the (average) expected prediction error

\[ \mathrm{EPE}(p) = n^{-1} E \sum_{i=1}^{n} \bigl(Y_i^* - \hat Y_i^{(p)}\bigr)^2. \]

Here $Y_1^*, \ldots, Y_n^*$ are independent of $Y_1, \ldots, Y_n$, and $Y_i^*$ is distributed as $Y_i$, $i = 1, \ldots, n$.

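Let $\mathrm{RSS}(p) = n^{-1} \sum_{i=1}^{n} \bigl(Y_i - \hat Y_i^{(p)}\bigr)^2$ be the residual sum of squares, scaled by $1/n$ so that it is on the same per-observation scale as $\mathrm{EPE}(p)$. Suppose that $\sigma^2$ is known.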

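(a) Show that

\[ \mathrm{EPE}(p) = \sigma^2 \Bigl(1 + \frac{p}{n}\Bigr) + \frac{1}{n} \sum_{i=1}^{n} \bigl(\mu_i - \mu_i^{(p)}\bigr)^2, \]

where $\mu_i^{(p)} = z_i^T \beta^{(p)}$ and $\beta^{(p)} = (\beta_1, \ldots, \beta_p, 0, \ldots, 0)^T$ is the true coefficient vector with its last $d - p$ entries set to zero.

(b) Show that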

\[ E[\mathrm{RSS}(p)] = \sigma^2 \Bigl(1 - \frac{p}{n}\Bigr) + \frac{1}{n} \sum_{i=1}^{n} \bigl(\mu_i - \mu_i^{(p)}\bigr)^2. \]

(c) Show that $\mathrm{RSS}(p) + (2p/n)\sigma^2$ is an unbiased estimate of $\mathrm{EPE}(p)$.

(d) Mallows (1973, Technometrics) suggested a model selection rule in which $\hat p$ is selected to be the value of $p$ that minimizes $\mathrm{RSS}(p) + (2p/n)\sigma^2$, and $\hat Y^{(\hat p)}$ is then used as a predictor. Suppose that $p = 2$ and $d = 3$. Find $P(\hat p = 3)$ and $P(\hat p \le 1)$ as $n$ goes to infinity. (You can assume that the covariates are realized values of 3 independent UNIF(0, 1) random variables. For example, $\mu_i = \beta_1 z_{i1} + \beta_2 z_{i2} + \beta_3 z_{i3}$, where $z_{i1}$, $z_{i2}$, and $z_{i3}$ are independent UNIF(0, 1) random variables.)

2. Consider the model $Y = X\beta + \varepsilon$, where $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 J_n$. Let $\hat Y_i = X_i \hat\beta$ and $h_{ii} = X_i (X^T X)^{-1} X_i^T$.

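(a) Show that for any $\epsilon > 0$,

\[ P\bigl(|\hat Y_i - E(\hat Y_i)| \ge \epsilon\bigr) \ge \min\bigl[ P(\varepsilon_i \ge \epsilon / h_{ii}),\; P(\varepsilon_i \le -\epsilon / h_{ii}) \bigr], \]

where $\varepsilon_i$ denotes the $i$th component of the error vector.

(Hint: for independent random variables $X$ and $Y$, $P(|X + Y| \ge \epsilon) \ge P(X \ge \epsilon) P(Y \ge 0) + P(X \le -\epsilon) P(Y < 0)$.)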

(b) Show that $\hat Y_i - E(\hat Y_i) \xrightarrow{P} 0$ if and only if $h_{ii} \to 0$.

3. Let $(X_i, Y_i)$, $1 \le i \le n$, be i.i.d. with $X_i$ and $Y_i$ independent, distributed as $N(\theta_1, 1)$ and $N(\theta_2, 1)$, respectively. Suppose $\theta_1 \ge 0$ and $\theta_1 \ge \theta_2 \ge 0$. Consider testing $H_0: \theta_1 = \theta_2 = 0$ versus $H_1: \theta_1 > 0$ or $\theta_2 > 0$. Show that, whatever $n$ may be, under $H_0$, $\lambda_n$ is distributed as a mixture of a point mass at 0, $\chi^2_1$, and $\chi^2_2$ with probabilities 3/8, 1/2, and 1/8, respectively.
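
The mixture weights in Problem 3 can be checked numerically before attempting a proof. The sketch below rests on an assumption that the problem does not spell out: it takes $\lambda_n$ to be twice the log likelihood ratio, which under $H_0$ is the squared length of the projection of $Z = \sqrt{n}(\bar X, \bar Y) \sim N(0, I_2)$ onto the cone $\{\theta_1 \ge \theta_2 \ge 0\}$. The function names are illustrative only.

```python
import math
import random

def lr_statistic(z1, z2):
    """Squared length of the projection of (z1, z2) onto the cone
    {theta1 >= theta2 >= 0}; assumed here to play the role of lambda_n."""
    if z2 >= 0 and z1 >= z2:  # the point is already inside the cone
        return z1 * z1 + z2 * z2
    # Otherwise the nearest cone point lies on one of the two edge rays,
    # spanned by the unit vectors (1, 0) and (1, 1)/sqrt(2).
    along_axis = max(0.0, z1)
    along_diagonal = max(0.0, (z1 + z2) / math.sqrt(2))
    return max(along_axis, along_diagonal) ** 2

def mixture_weights(reps=200_000, seed=0):
    """Monte Carlo frequencies of the three cases: statistic equal to 0,
    projection on an edge ray, and point inside the cone."""
    rng = random.Random(seed)
    zero = interior = 0
    for _ in range(reps):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        if z2 >= 0 and z1 >= z2:
            interior += 1
        elif lr_statistic(z1, z2) == 0.0:
            zero += 1
    edge = reps - zero - interior
    return zero / reps, edge / reps, interior / reps

w_zero, w_edge, w_interior = mixture_weights()
print(w_zero, w_edge, w_interior)
```

Because $Z$ is exactly $N(0, I_2)$ under $H_0$ for every $n$, the simulation does not need to generate the individual observations. The three frequencies can be compared with the weights 3/8, 1/2, and 1/8 stated in the problem.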

4. Let $(X_{11}, X_{12}), \ldots, (X_{n1}, X_{n2})$ be i.i.d. from a bivariate normal distribution with unknown mean and covariance matrix. For testing $H_0: \rho = 0$ versus $H_1: \rho \ne 0$, where $\rho$ is the correlation coefficient, show that the test rejecting $H_0$ when $|W| > c$, for a suitable constant $c$, is an LR test, where

\[ W = \sum_{i=1}^{n} (X_{i1} - \bar X_1)(X_{i2} - \bar X_2) \Big/ \Bigl[ \sum_{i=1}^{n} (X_{i1} - \bar X_1)^2 + \sum_{i=1}^{n} (X_{i2} - \bar X_2)^2 \Bigr]. \]

Find the distribution of $W$ under $H_0$.

5. Suppose you are studying the number of visits of a pollinator to a flower. Your hypothesis is that yellow flowers are better than red flowers (in terms of pollinator attraction). Previous studies have found that the number of visits to red flowers follows a normal distribution with a mean of 200 visits per flower and a variance of 50. Suppose that, in a sample of 20 yellow flowers, the mean number of visits is 202 with a known variance (of visits per flower) of 50. Again, assume that the number of visits is normally distributed.

a. What is the probability of this data under the null hypothesis (yellow and red flowers are equivalent)?

b. What is the critical value for a (one-sided) test of the null hypothesis at the α = 0.05 level?

c. What are the values for (a) and (b) when the variance for yellow flowers (50) is instead a SAMPLE variance (i.e., an estimate of the true variance)? Hint: Would you now use a normal or a t distribution?

d. Suppose that yellow flowers are indeed better. Given the sample size (20) and assuming the variance (50) is the true value, how small an effect can we detect using a (one-sided) test of significance of α = 0.05 with 80% power?

e. Repeat the calculation in (d) assuming that the variance (50) is now an estimated value, not necessarily the true value.

f. Suppose the true mean and variance for yellow flowers are 201 and 10. How large a sample size is required to have a power of 80 percent of detecting a difference between red and yellow using a test of significance with level α = 0.05? Compute this for both the normal (variance assumed known) and t (variance estimated) settings.

g. If the true variance for yellow is 35, what is the probability that we observe a sample variance of 50 (or larger) given our sample size of 20?
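
For the known-variance parts of Problem 5, the normal-theory quantities can be computed with Python's standard library. This is a minimal sketch, not a worked solution: it reads part (a) as asking for the one-sided tail probability of the observed sample mean, which is an assumption about the intended meaning, and it covers only the z-based calculations. The t-based parts need Student t quantiles, which the standard library does not provide.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in the problem
mu0, var, n, xbar = 200.0, 50.0, 20, 202.0
alpha, power = 0.05, 0.80
std_normal = NormalDist()  # standard normal N(0, 1)

se = sqrt(var / n)                 # standard error of the sample mean
z = (xbar - mu0) / se              # observed z statistic
p_value = 1.0 - std_normal.cdf(z)  # one-sided tail probability (assumed reading of part a)

z_alpha = std_normal.inv_cdf(1.0 - alpha)  # one-sided critical value on the z scale
xbar_crit = mu0 + z_alpha * se             # the same cutoff on the scale of the sample mean

z_power = std_normal.inv_cdf(power)
min_effect = (z_alpha + z_power) * se  # smallest mean shift detectable with the stated power

print(f"z = {z:.4f}, one-sided p = {p_value:.4f}")
print(f"critical z = {z_alpha:.4f}, critical sample mean = {xbar_crit:.4f}")
print(f"minimum detectable shift = {min_effect:.4f}")
```

The last quantity uses the usual relation that a one-sided level-α z-test has the stated power when the true shift equals $(z_{1-\alpha} + z_{\text{power}})\,\sigma/\sqrt{n}$.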

6. Let $X_1, X_2, \ldots, X_n$ be a random sample from the unif$(0, \theta)$ distribution for some $\theta > 0$. Suppose we wish to test

\[ H_0: \theta = \theta_0 \quad \text{versus} \quad H_a: \theta < \theta_0 \]

at level (size) $\alpha$. Suppose that we use the test statistic $X_{(n)}$.

a. Derive the test whose probability of a Type I error is $\alpha$.

b. What is the probability of a Type II error for any particular $\theta = \theta_1$, where $\theta_1$ is some fixed number less than $\theta_0$?

c. What is the power function of this test?

d. What sample size is necessary in order to get $\beta(\theta_1) = \beta$, where $\beta$ is a fixed number between 0 and 1 and $\theta_1$ is a fixed value between 0 and $\theta_0$?
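
A derived power function for Problem 6 can be sanity-checked by simulation. The sketch below assumes the test rejects $H_0$ for small values of $X_{(n)}$, with the cutoff chosen so that the rejection probability at $\theta_0$ equals $\alpha$. It uses the fact that $P_\theta(X_{(n)} \le c) = (c/\theta)^n$ for $0 \le c \le \theta$. The function names and the numerical settings are illustrative, not part of the assignment.

```python
import random

def critical_value(theta0, alpha, n):
    """Cutoff c with P(X_(n) <= c) = alpha when theta = theta0."""
    return theta0 * alpha ** (1.0 / n)

def power(theta, theta0, alpha, n):
    """Rejection probability P_theta(X_(n) <= c), capped at 1."""
    c = critical_value(theta0, alpha, n)
    return min(1.0, (c / theta) ** n)

def simulated_power(theta, theta0, alpha, n, reps=20_000, seed=1):
    """Monte Carlo estimate of the same rejection probability."""
    rng = random.Random(seed)
    c = critical_value(theta0, alpha, n)
    rejections = sum(
        max(rng.uniform(0.0, theta) for _ in range(n)) <= c
        for _ in range(reps)
    )
    return rejections / reps

theta0, alpha, n, theta1 = 1.0, 0.05, 10, 0.8
exact = power(theta1, theta0, alpha, n)
approx = simulated_power(theta1, theta0, alpha, n)
print(f"exact = {exact:.4f}, simulated = {approx:.4f}")
```

The Type II error probability at $\theta_1$ is one minus the rejection probability computed here, so the same functions can be used to explore how the required sample size in part (d) changes with $\theta_1$.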

7. Let $X_1, \ldots, X_n$ be the times in months until failure of $n$ similar pieces of equipment. Since the equipment is subject to wear, we often model $X_1, \ldots, X_n$ as a random sample of size $n$ from a Weibull distribution with density

\[ f(x, \lambda) = \lambda c x^{c-1} \exp(-\lambda x^c), \quad x > 0. \]

Here $c$ is a known positive constant and $\lambda > 0$.

a. Find an optimal test for testing $H_0: 1/\lambda \le 1/\lambda_0$ versus $H_a: 1/\lambda > 1/\lambda_0$.

b. Suppose that the only table you have is a normal probability table. Can you use this table to carry out the test derived in (a)? Give reasons to justify your answer.

8. Let $X_n$ be a random variable having the Poisson distribution $P(n\theta)$, where $\theta > 0$, $n = 1, 2, \ldots$. Show that $(X_n - n\theta)/\sqrt{n\theta} \xrightarrow{d} N(0, 1)$.
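
The convergence in Problem 8 can be illustrated without simulation, since the Poisson cdf is available in closed form. The sketch below compares the exact cdf of the standardized variable with the standard normal cdf at a few points. The choice $\theta = 2$, the grid of $x$ values, and the function names are arbitrary illustrative choices.

```python
from math import exp, floor, lgamma, log, sqrt
from statistics import NormalDist

def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean), summing the pmf computed on the log scale."""
    if k < 0:
        return 0.0
    total = sum(exp(j * log(mean) - mean - lgamma(j + 1)) for j in range(int(k) + 1))
    return min(1.0, total)

def standardized_cdf(x, n, theta):
    """P((X_n - n*theta) / sqrt(n*theta) <= x) for X_n ~ Poisson(n*theta)."""
    mean = n * theta
    return poisson_cdf(floor(mean + x * sqrt(mean)), mean)

def max_gap(n, theta, points=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Largest absolute difference from the N(0, 1) cdf over the chosen points."""
    normal = NormalDist()
    return max(abs(standardized_cdf(x, n, theta) - normal.cdf(x)) for x in points)

theta = 2.0
for n in (5, 50, 500, 5000):
    print(n, round(max_gap(n, theta), 4))
```

The printed gaps should shrink as $n$ grows, which is consistent with (but of course does not prove) the limit stated in the problem.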

9. Let $U_1, \ldots, U_n$ be i.i.d. random variables having the uniform distribution on $[0, 1]$ and $Y_n = \bigl(\prod_{i=1}^{n} U_i\bigr)^{-1/n}$. Show that $\sqrt{n}(Y_n - e) \xrightarrow{d} N(0, e^2)$.
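
A quick Monte Carlo check of the limit in Problem 9 is sketched below. It works on the log scale, using $\log Y_n = -n^{-1} \sum_{i} \log U_i$, so that the product never underflows. The sample size, number of replications, and seed are arbitrary illustrative choices.

```python
import math
import random
import statistics

def draw_scaled_error(n, rng):
    """One draw of sqrt(n) * (Y_n - e), with Y_n = (prod U_i)^(-1/n)."""
    # 1 - rng.random() lies in (0, 1], which avoids log(0).
    mean_neg_log = -sum(math.log(1.0 - rng.random()) for _ in range(n)) / n
    return math.sqrt(n) * (math.exp(mean_neg_log) - math.e)

rng = random.Random(2)
draws = [draw_scaled_error(400, rng) for _ in range(3000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
print("sample mean:", sample_mean)
print("sample variance:", sample_var, "target e^2:", math.e ** 2)
```

At a finite $n$ the sample mean and variance will differ slightly from 0 and $e^2$; the comparison is only meant as a plausibility check on the delta-method calculation.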

10. Set $\hat\sigma = \sqrt{n^{-1} \sum_{i=1}^{n} (X_i - \bar X)^2}$. Show that $\sqrt{n}(\hat\sigma - \sigma) \xrightarrow{d} N(0, \sigma^2/2)$.

11. Let $X_1, \ldots, X_n$ be i.i.d. $N(\theta, 1)$ with $\theta \ge 0$.

(a) Show that the MLE of $\theta$, $\hat\theta_n$, is $\bar X$ if $\bar X > 0$ and 0 otherwise.

(b) If $\theta > 0$, show that $\sqrt{n}(\hat\theta_n - \theta) \xrightarrow{L} N(0, 1)$.

(c) If $\theta = 0$, show that the probability is 1/2 that $\hat\theta_n = 0$ and 1/2 that $\sqrt{n}(\hat\theta_n - \theta) \xrightarrow{L} N(0, 1)$.

12. If $X_1, \ldots, X_n$ are i.i.d. according to $U(0, \theta)$ and $T_n = X_{(n)}$, the limiting distribution of $n(\theta - T_n)$ is exponential with density $\theta^{-1} \exp(-x/\theta)$. Use this result to determine the limit distribution of

(a) $n[f(\theta) - f(T_n)]$, where $f$ is any function with $f^{(1)}(\theta) \ne 0$;

(b) $[f(\theta) - f(T_n)]$, suitably normalized, if $f^{(1)}(\theta) = 0$ but $f^{(2)}(\theta) \ne 0$.
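
The limiting result quoted in Problem 12 can be checked numerically before it is used. The sketch below samples $T_n$ directly from its exact distribution and compares the empirical mean and one tail probability of $n(\theta - T_n)$ with those of the exponential limit. The values of $\theta$, $n$, and $x$ are arbitrary illustrative choices.

```python
import math
import random

def scaled_gap(n, theta, rng):
    """One draw of n * (theta - T_n), where T_n is the maximum of n U(0, theta) draws."""
    # The maximum of n independent U(0, theta) variables has cdf (t/theta)^n,
    # so it can be sampled directly as theta * V**(1/n) with V ~ U(0, 1).
    t_n = theta * rng.random() ** (1.0 / n)
    return n * (theta - t_n)

theta, n, reps = 2.0, 1000, 50_000
rng = random.Random(3)
draws = [scaled_gap(n, theta, rng) for _ in range(reps)]

x = 1.5
empirical_mean = sum(draws) / reps
empirical_tail = sum(d > x for d in draws) / reps
print("mean of draws:", empirical_mean, "limit mean:", theta)
print("P(gap > x):", empirical_tail, "limit:", math.exp(-x / theta))
```

The same draws can be pushed through a smooth function $f$ to explore parts (a) and (b) empirically, for example by looking at $n[f(\theta) - f(T_n)]$ for a chosen $f$.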

13. Let $X_1, \ldots, X_n$ be i.i.d. $N(\theta, \sigma^2)$ and consider the estimation of $\theta^2$.

(a) Find the maximum likelihood estimator.

(b) Obtain the limit distribution of the estimator obtained in (a). (Hint: You may need to consider $\theta \ne 0$ and $\theta = 0$ separately.)

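14. Let $X_1, \ldots, X_n$ be i.i.d. with $E(X_i) = \theta$, $\mathrm{Var}(X_i) = \sigma^2 < \infty$, and let $\delta_n = \bar X$ with probability $1 - \varepsilon_n$ and $\delta_n = A_n$ with probability $\varepsilon_n$. If $\varepsilon_n$ and $A_n$ are constants satisfying

\[ \varepsilon_n \to 0 \quad \text{and} \quad \varepsilon_n A_n \to \infty, \]

then $\delta_n$ is consistent for estimating $\theta$, but $E(\delta_n - \theta)^2$ does not tend to zero.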
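
The contrast in Problem 14 between consistency and mean squared error can be seen in a small numerical example. The sketch below makes two assumptions that are not in the problem: the choice between $\bar X$ and $A_n$ is made independently of the sample, and the sequences are the hypothetical $\varepsilon_n = 1/n$ and $A_n = n^2$. Under the independence assumption, $E(\delta_n - \theta)^2 = (1 - \varepsilon_n)\,\sigma^2/n + \varepsilon_n (A_n - \theta)^2$.

```python
def mse(n, theta=1.0, sigma2=1.0):
    """E(delta_n - theta)^2 under the assumed independent randomization."""
    eps_n = 1.0 / n      # hypothetical choice with eps_n -> 0
    a_n = float(n) ** 2  # hypothetical choice with eps_n * a_n = n -> infinity
    return (1.0 - eps_n) * sigma2 / n + eps_n * (a_n - theta) ** 2

for n in (10, 100, 1000):
    # 1/n bounds the probability that delta_n differs from the sample mean
    print(n, 1.0 / n, mse(n))
```

The probability of using $A_n$ shrinks to zero while the mean squared error grows, which is the behavior the problem asks to be established in general.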

15. Suppose that $X_n$ is a random variable having the binomial distribution $\mathrm{Bin}(n, p)$, where $0 < p < 1$, $n = 1, 2, \ldots$. Define

\[ Y_n = \begin{cases} \log(X_n / n), & X_n \ge 1, \\ 1, & X_n = 0. \end{cases} \]

Show that $Y_n \xrightarrow{a.s.} \log p$ and $\sqrt{n}(Y_n - \log p) \xrightarrow{d} N(0, (1 - p)/p)$.

16. Let $X_1, \ldots, X_n$ be i.i.d. random variables with $\mathrm{Var}(X_1) < \infty$. Show that

\[ \frac{2}{n(n+1)} \sum_{j=1}^{n} j X_j \xrightarrow{P} E X_1. \]

17. Let $(Y_1, Z_1), \ldots, (Y_n, Z_n)$ be i.i.d. with the Lebesgue pdf $\lambda^{-1} \mu^{-1} e^{-y/\lambda} e^{-z/\mu} I_{(0,\infty)}(y) I_{(0,\infty)}(z)$, where $\lambda > 0$ and $\mu > 0$.

(a) Find the MLE of (λ, µ).

(b) Suppose that we only observe $X_i = \min(Y_i, Z_i)$ and $\delta_i = 1$ if $X_i = Y_i$ and $\delta_i = 0$ if $X_i = Z_i$. Find the MLE of $(\lambda, \mu)$.

18. Let $X$ be $N(0, \theta)$, $0 < \theta < \infty$. Find the Fisher information $I(\theta)$.
