Econ 7933: Financial Time Series: AR Models


Sheng-Kai Chang (NTU)

Fall, 2015


AR(1) model:

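Form: $r_t = \phi_0 + \phi_1 r_{t-1} + a_t$, where $\phi_0$ and $\phi_1$ are real numbers (parameters to be estimated from the data in an application) and $a_t$ is a white noise series with mean zero and variance $\sigma_a^2$.

Stationarity: the necessary and sufficient condition is $|\phi_1| < 1$.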

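Mean: $E(r_t) = \dfrac{\phi_0}{1 - \phi_1}$.

Alternative AR(1) representation: let $\mu = E(r_t) = \dfrac{\phi_0}{1 - \phi_1}$, thus $\phi_0 = \mu(1 - \phi_1)$. Plugging this into the model, we have

$(r_t - \mu) = \phi_1 (r_{t-1} - \mu) + a_t$.

Variance: $\mathrm{Var}(r_t) = \dfrac{\sigma_a^2}{1 - \phi_1^2}$.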

Autocorrelations: $\rho_1 = \phi_1$, $\rho_2 = \phi_1^2$, etc. In general, $\rho_k = \phi_1^k$; thus, the ACF $\rho_k$ decays exponentially as $k$ increases.

Forecast: suppose the forecast origin is $t$. Let $x_t = r_t - \mu$; then the model becomes $x_t = \phi_1 x_{t-1} + a_t$. Note that the forecast of $r_t$ can be obtained as the forecast of $x_t$ plus $\mu$.
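The mean, variance and ACF formulas above can be checked numerically. The following is a minimal sketch in base R; the parameter values `phi0`, `phi1` and `sigma2_a` are illustrative assumptions, not estimates from any data set, and `ARMAacf` from the stats package is used only as a cross-check.

```r
# Theoretical moments of a stationary AR(1).
# phi0, phi1 and sigma2_a are illustrative values (assumptions).
phi0 <- 0.5
phi1 <- 0.6
sigma2_a <- 1

stopifnot(abs(phi1) < 1)            # stationarity condition |phi1| < 1

mu    <- phi0 / (1 - phi1)          # E(r_t) = phi0 / (1 - phi1)
var_r <- sigma2_a / (1 - phi1^2)    # Var(r_t) = sigma_a^2 / (1 - phi1^2)

# ACF at lags 0..5 from the closed form rho_k = phi1^k
rho_closed <- phi1^(0:5)
# The same ACF computed by stats::ARMAacf, as a cross-check
rho_stats <- unname(ARMAacf(ar = phi1, lag.max = 5))

print(c(mean = mu, variance = var_r))
print(rbind(rho_closed, rho_stats))
```

The two rows of the printed matrix should agree, which illustrates the exponential decay of the ACF.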


Forecast of AR(1) models

1-step ahead forecast at time $t$: $\hat{x}_t(1) = \phi_1 x_t$.

1-step ahead forecast error: $e_t(1) = x_{t+1} - \hat{x}_t(1) = a_{t+1}$. Thus $a_{t+1}$ is the unpredictable part of $x_{t+1}$; it is the shock at time $t + 1$.

Variance of 1-step ahead forecast error: $\mathrm{Var}[e_t(1)] = \mathrm{Var}(a_{t+1}) = \sigma_a^2$.

2-step ahead forecast at time $t$: $\hat{x}_t(2) = \phi_1 \hat{x}_t(1) = \phi_1^2 x_t$.

2-step ahead forecast error: $e_t(2) = x_{t+2} - \hat{x}_t(2) = a_{t+2} + \phi_1 a_{t+1}$.

Variance of 2-step ahead forecast error: $\mathrm{Var}[e_t(2)] = (1 + \phi_1^2)\sigma_a^2$. For $\phi_1 \neq 0$, $\mathrm{Var}[e_t(2)]$ is greater than $\mathrm{Var}[e_t(1)]$, indicating that uncertainty in forecasts increases as the number of steps increases.
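The 2-step results follow from substituting the model into itself once. A short derivation, assuming as usual that the shocks $a_t$ are serially uncorrelated with mean zero and variance $\sigma_a^2$:

```latex
\begin{aligned}
x_{t+2} &= \phi_1 x_{t+1} + a_{t+2}
         = \phi_1(\phi_1 x_t + a_{t+1}) + a_{t+2}
         = \phi_1^2 x_t + \phi_1 a_{t+1} + a_{t+2}, \\
\hat{x}_t(2) &= E(x_{t+2} \mid x_t, x_{t-1}, \ldots) = \phi_1^2 x_t, \\
e_t(2) &= x_{t+2} - \hat{x}_t(2) = a_{t+2} + \phi_1 a_{t+1}, \\
\mathrm{Var}[e_t(2)] &= \sigma_a^2 + \phi_1^2 \sigma_a^2 = (1 + \phi_1^2)\sigma_a^2 .
\end{aligned}
```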


Multi-step ahead forecast

$l$-step ahead forecast at time $t$: $\hat{x}_t(l) = \phi_1^l x_t$.

$l$-step ahead forecast error: $e_t(l) = a_{t+l} + \phi_1 a_{t+l-1} + \cdots + \phi_1^{l-1} a_{t+1}$.

Variance of $l$-step ahead forecast error:

$\mathrm{Var}[e_t(l)] = (1 + \phi_1^2 + \cdots + \phi_1^{2(l-1)})\sigma_a^2$.

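Mean reversion of the AR(1) process: as $l \to \infty$, $\hat{x}_t(l) \to 0$ and $\hat{r}_t(l) \to \mu$.

As $l \to \infty$, the variance of the forecast error approaches the unconditional variance of the series:

$\mathrm{Var}[e_t(l)] \to \mathrm{Var}(r_t) = \dfrac{\sigma_a^2}{1 - \phi_1^2}$.

Therefore, for long-term forecasts, serial dependence is not important. The forecast is just the mean of the series (estimated in practice by the sample mean), and the forecast uncertainty is the uncertainty about the series itself, that is, its unconditional variance.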
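The multi-step formulas and their limits can be tabulated numerically. This is a minimal sketch in base R: `ar1_forecast` is a hypothetical helper (not from any package), and the values of `x_t`, `phi1` and `sigma2_a` are illustrative assumptions.

```r
# l-step ahead forecasts and forecast-error variances for a mean-adjusted AR(1).
# ar1_forecast is a hypothetical helper name, not part of any package.
ar1_forecast <- function(x_t, phi1, sigma2_a, max_l) {
  l <- seq_len(max_l)
  data.frame(
    l        = l,
    forecast = phi1^l * x_t,                          # x_hat_t(l) = phi1^l x_t
    variance = sigma2_a * cumsum(phi1^(2 * (l - 1)))  # (1 + phi1^2 + ... + phi1^(2(l-1))) sigma_a^2
  )
}

fc <- ar1_forecast(x_t = 2, phi1 = 0.6, sigma2_a = 1, max_l = 50)
print(head(fc, 3))

# For large l the forecast is near 0 and the variance is near sigma_a^2 / (1 - phi1^2)
print(tail(fc, 1))
print(1 / (1 - 0.6^2))
```

The last two printed values illustrate mean reversion: at a long horizon the forecast-error variance is essentially the unconditional variance of the series.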

Alternative AR(1) form: $(1 - \phi_1 L) r_t = \phi_0 + a_t$, where $L$ is the lag operator, $L r_t = r_{t-1}$.


Half-life: a common way to quantify the speed of mean reversion is the half-life, which is defined as the number of periods needed so that the magnitude of the forecast becomes half of that of the forecast origin.

For an AR(1) model, the half-life $k$ satisfies $\hat{x}_t(k) = \phi_1^k x_t = \tfrac{1}{2} x_t$. Thus, $k = \dfrac{\ln(0.5)}{\ln(|\phi_1|)}$.
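To see how quickly the half-life grows as $\phi_1$ approaches one, here is a small sketch in base R. The function name `half_life` and the three values of `phi1` are illustrative assumptions.

```r
# Half-life of a stationary AR(1): k = ln(0.5) / ln(|phi1|).
# half_life is a hypothetical helper name.
half_life <- function(phi1) {
  stopifnot(abs(phi1) < 1, phi1 != 0)
  log(0.5) / log(abs(phi1))
}

phi1_values <- c(0.5, 0.9, 0.99)      # illustrative values
hl <- sapply(phi1_values, half_life)
print(data.frame(phi1 = phi1_values, half_life = hl))

# Check the definition: after k periods the forecast weight phi1^k equals 1/2
print(phi1_values^hl)
```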

AR(2) model: $r_t = \phi_0 + \phi_1 r_{t-1} + \phi_2 r_{t-2} + a_t$, or $(1 - \phi_1 L - \phi_2 L^2) r_t = \phi_0 + a_t$.

Characteristic equation: $1 - \phi_1 x - \phi_2 x^2 = 0$.

Characteristic roots: the inverses of the solutions of the characteristic equation.

Stationarity: the absolute values (moduli) of all characteristic roots are less than one.
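The stationarity condition can be checked with base R's `polyroot`, which solves a polynomial given its coefficients in increasing order. In this sketch, `ar_char_roots` and `is_stationary` are hypothetical helper names; the coefficient pairs (1.3, -0.4) and (0.8, -0.7) are the ones used for the simulated series in these slides, and (0.5, 0.6) is an assumed non-stationary example.

```r
# Characteristic roots of an AR(p) model via base R's polyroot().
# ar_char_roots and is_stationary are hypothetical helper names.
ar_char_roots <- function(phi) {
  # Coefficients in increasing order: 1 - phi_1 x - ... - phi_p x^p
  solutions <- polyroot(c(1, -phi))
  1 / solutions                      # characteristic roots = inverses of the solutions
}

is_stationary <- function(phi) all(Mod(ar_char_roots(phi)) < 1)

print(Mod(ar_char_roots(c(1.3, -0.4))))   # two real characteristic roots
print(Mod(ar_char_roots(c(0.8, -0.7))))   # a complex-conjugate pair
print(is_stationary(c(1.3, -0.4)))
print(is_stationary(c(0.5, 0.6)))         # phi1 + phi2 > 1, so not stationary
```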


Mean: $\mu = E(r_t) = \dfrac{\phi_0}{1 - \phi_1 - \phi_2}$.

Mean-adjusted format of AR(2) model: using $\phi_0 = \mu - \phi_1\mu - \phi_2\mu$, the AR(2) model can be written as

$(r_t - \mu) = \phi_1 (r_{t-1} - \mu) + \phi_2 (r_{t-2} - \mu) + a_t$.

The mean-adjusted format is used to highlight the mean-reverting property of a stationary AR model.

ACF: $\rho_0 = 1$, $\rho_1 = \dfrac{\phi_1}{1 - \phi_2}$, and $\rho_l = \phi_1 \rho_{l-1} + \phi_2 \rho_{l-2}$ for $l \geq 2$.
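The ACF recursion is easy to iterate directly. The sketch below, in base R, uses a hypothetical helper `ar2_acf` and the illustrative coefficients (0.8, -0.7), and compares the result with `ARMAacf` from the stats package as a cross-check.

```r
# ACF of a stationary AR(2) from the recursion
# rho_l = phi1 * rho_{l-1} + phi2 * rho_{l-2}, l >= 2.
# ar2_acf is a hypothetical helper name.
ar2_acf <- function(phi1, phi2, max_lag) {
  stopifnot(max_lag >= 2)
  rho <- numeric(max_lag + 1)        # rho[l + 1] stores rho_l
  rho[1] <- 1                        # rho_0 = 1
  rho[2] <- phi1 / (1 - phi2)        # rho_1 = phi1 / (1 - phi2)
  for (l in 2:max_lag) {
    rho[l + 1] <- phi1 * rho[l] + phi2 * rho[l - 1]
  }
  rho
}

rho <- ar2_acf(0.8, -0.7, max_lag = 10)
print(round(rho, 4))

# Cross-check against stats::ARMAacf
rho_stats <- unname(ARMAacf(ar = c(0.8, -0.7), lag.max = 10))
print(all.equal(rho, rho_stats))
```

With these coefficients the ACF alternates in sign while decaying, the damped-wave pattern associated with complex characteristic roots.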

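Stochastic business cycle: if $\phi_1^2 + 4\phi_2 < 0$, then $r_t$ shows characteristics of business cycles with average length

$k = \dfrac{2\pi}{\cos^{-1}\left[\phi_1 / (2\sqrt{-\phi_2})\right]}$,

where the inverse cosine is stated in radians.

If we denote the (complex) characteristic roots as $a \pm bi$, where $i = \sqrt{-1}$, then we have $\phi_1 = 2a$ and $\phi_2 = -(a^2 + b^2)$, so that

$k = \dfrac{2\pi}{\cos^{-1}\left(a / \sqrt{a^2 + b^2}\right)}$.

The solutions of the characteristic equation are the inverses of these roots and have the same ratio $a / \sqrt{a^2 + b^2}$, so either pair gives the same $k$. The modulus $\sqrt{a^2 + b^2}$ can be obtained using the command Mod in R.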
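Both expressions for the average cycle length can be evaluated in a few lines of base R. This sketch assumes the coefficients (0.8, -0.7), the values used for the simulated series y2 in these slides; the variable names are illustrative.

```r
# Average length of stochastic cycles for an AR(2) with complex roots.
phi1 <- 0.8
phi2 <- -0.7
stopifnot(phi1^2 + 4 * phi2 < 0)          # complex roots, so cycles are present

# (a) Directly from the coefficients
k_coef <- 2 * pi / acos(phi1 / (2 * sqrt(-phi2)))

# (b) From a complex solution of 1 - phi1 x - phi2 x^2 = 0
z  <- polyroot(c(1, -phi1, -phi2))
z1 <- z[Im(z) > 0]                        # the solution with positive imaginary part
k_root <- 2 * pi / acos(Re(z1) / Mod(z1))

print(c(k_coef = k_coef, k_root = k_root))
```

The two numbers should coincide up to rounding, because the ratio of the real part to the modulus is the same for a complex number and its inverse.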

y1=arima.sim(model=list(ar=c(1.3, -0.4)),n=1000)
acf(y1,lag.max=20) # specify the number of ACF lags to compute

[Figure: sample ACF of series y1, lags 0 to 20]

pacf(y1,lag.max=20)

[Figure: sample partial ACF of series y1, lags 1 to 20]

y2=arima.sim(model=list(ar=c(0.8, -0.7)),n=1000)
acf(y2,lag.max=20) # specify the number of ACF lags to compute

[Figure: sample ACF of series y2, lags 0 to 20]

pacf(y2,lag.max=20)

[Figure: sample partial ACF of series y2, lags 1 to 20]


Order specification of AR model

Partial ACF: the PACF of a stationary time series is a function of its ACF and is a useful tool for determining the order $p$ of an AR model.

For a stationary Gaussian AR($p$) model, it can be shown that the sample PACF has the following properties:

- $\hat{\phi}_{p,p}$ converges to $\phi_p$ as the sample size $T$ goes to infinity.

- $\hat{\phi}_{l,l}$ converges to zero for all $l > p$.

- The asymptotic variance of $\hat{\phi}_{l,l}$ is $1/T$ for $l > p$.

Hence the sample PACF cuts off at lag $p$ for an AR($p$) model.

Akaike information criterion: $\mathrm{AIC}(l) = \ln(\tilde{\sigma}_l^2) + \dfrac{2l}{T}$ for an AR($l$) model, where $\tilde{\sigma}_l^2$ is the MLE of the residual variance.

Find the AR order with minimum AIC for $l \in \{0, \ldots, P\}$.

BIC criterion: $\mathrm{BIC}(l) = \ln(\tilde{\sigma}_l^2) + \dfrac{l \ln(T)}{T}$.
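The two criteria above can be computed by fitting AR($l$) models for $l = 0, \ldots, P$ and comparing. The following is a sketch under stated assumptions: the data are simulated from an AR(2) with coefficients (0.8, -0.7), the seed and the maximum order `P` are arbitrary, and the criteria are computed by hand from the `sigma2` component of `arima` fits so that they match the formulas as written (R's built-in `AIC` uses a different scaling).

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
T_len <- 500
y <- arima.sim(model = list(ar = c(0.8, -0.7)), n = T_len)

P <- 6                                    # illustrative maximum order
crit <- t(sapply(0:P, function(l) {
  # include.mean defaults to TRUE; sigma2 is the ML estimate of the residual variance
  fit <- arima(y, order = c(l, 0, 0), method = "ML")
  c(order = l,
    AIC = log(fit$sigma2) + 2 * l / T_len,
    BIC = log(fit$sigma2) + l * log(T_len) / T_len)
}))
print(round(crit, 4))

best_aic <- crit[which.min(crit[, "AIC"]), "order"]
best_bic <- crit[which.min(crit[, "BIC"]), "order"]
cat("AIC selects order", best_aic, "and BIC selects order", best_bic, "\n")
```

Because the BIC penalty per parameter, $\ln(T)/T$, exceeds the AIC penalty $2/T$ for samples of this size, BIC never selects a higher order than AIC here.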


We can check the sample mean to determine whether or not we need a constant term.

Model checking: residuals are the 1-step ahead forecast errors at each time point (observations minus fitted values). If the model is adequate, the residuals should behave like white noise. Use the Ljung-Box statistic $Q(m)$ of the residuals, with degrees of freedom $m - g$, where $m$ is the number of lags used in the statistic and $g$ is the number of AR coefficients used in the model.
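In R, the degrees-of-freedom adjustment is available through the `fitdf` argument of `Box.test`. The sketch below uses a simulated AR(2) series as a stand-in for real data (an assumption, as are the seed and the choice `m = 10`) and compares the unadjusted and adjusted p-values.

```r
# Ljung-Box test on AR residuals, with and without the df adjustment.
set.seed(42)                              # arbitrary seed
y   <- arima.sim(model = list(ar = c(0.8, -0.7)), n = 500)
fit <- arima(y, order = c(2, 0, 0))
m <- 10                                   # number of lags in Q(m)
g <- 2                                    # number of fitted AR coefficients

unadj <- Box.test(fit$residuals, lag = m, type = "Ljung-Box")             # df = m
adj   <- Box.test(fit$residuals, lag = m, type = "Ljung-Box", fitdf = g)  # df = m - g

# Same statistic, different reference distribution
print(c(Q = unname(adj$statistic), df = unname(adj$parameter)))
print(c(p_unadjusted = unadj$p.value, p_adjusted = adj$p.value))

# Equivalent manual computation of the adjusted p-value
p_manual <- pchisq(unname(adj$statistic), df = m - g, lower.tail = FALSE)
print(p_manual)
```

The adjusted p-value is never larger than the unadjusted one, so ignoring the adjustment makes a fitted model look slightly more adequate than it is.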

#setwd("D:/R/FTS/Data")
library(downloader)
library(fBasics)
da=read.table("dgnp82.txt")
x=da[,1]
par(mfcol=c(2,1))
plot(x,type='l')
#plot(x[1:175],x[2:176])
#plot(x[1:174],x[3:176])
acf(x,lag.max=12)

[Figure: top panel, time plot of x against Index (0 to 150); bottom panel, sample ACF of series x, lags 0 to 12]

pacf(x,lag.max=12)

[Figure: sample partial ACF of series x, lags 1 to 12]

Box.test(x,lag=10,type='Ljung')

##

## Box-Ljung test

##

## data: x

## X-squared = 43.234, df = 10, p-value = 4.515e-06

m1=ar(x, method='mle')
m1

##

## Call:

## ar(x = x, method = "mle")

##

## Coefficients:

## 1 2 3

## 0.3480 0.1793 -0.1423

##

## Order selected 3 sigma^2 estimated as 9.427e-05

Box.test(m1$resid,lag=10,type='Ljung')

##

## Box-Ljung test

##

## data: m1$resid

## X-squared = 7.0808, df = 10, p-value = 0.7178

m2=arima(x,order=c(3,0,0))
m2

##

## Call:

## arima(x = x, order = c(3, 0, 0))

##

## Coefficients:

## ar1 ar2 ar3 intercept

## 0.3480 0.1793 -0.1423 0.0077

## s.e. 0.0745 0.0778 0.0745 0.0012

##

## sigma^2 estimated as 9.427e-05: log likelihood = 565.84, aic = -1121.68

Box.test(m2$residuals,lag=10,type='Ljung')

##

## Box-Ljung test

##

## data: m2$residuals

## X-squared = 7.0169, df = 10, p-value = 0.7239

tsdiag(m2)

[Figure: tsdiag output for m2 in three panels: standardized residuals against time; ACF of residuals; p values for the Ljung-Box statistic by lag]

p1=c(1,-m2$coef[1:3])
roots=polyroot(p1)
roots

## [1] 1.590253+1.063882i -1.920152+0.000000i 1.590253-1.063882i

Mod(roots)

## [1] 1.913308 1.920152 1.913308

k=2*pi/acos(1.590253/1.913308)
k

## [1] 10.65638

predict(m2,8)

## $pred

## Time Series:

## Start = 177

## End = 184

## Frequency = 1

## [1] 0.001236254 0.004555519 0.007454906 0.007958518 0.008181442 0.007936845

## [7] 0.007820046 0.007703826

##

## $se

## Time Series:

## Start = 177

## End = 184

## Frequency = 1

## [1] 0.009709322 0.010280510 0.010686305 0.010688994 0.010689733 0.010694771

## [7] 0.010695511 0.010696190
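The point forecasts and standard errors printed above can be turned into approximate 95% interval forecasts. This sketch copies the numbers from the output; the Gaussian assumption behind the 1.96 multiplier and the variable names are assumptions of the example, and 0.0077 is the fitted mean reported as `intercept` for m2.

```r
# Point forecasts and standard errors copied from the predict(m2, 8) output
pred <- c(0.001236254, 0.004555519, 0.007454906, 0.007958518,
          0.008181442, 0.007936845, 0.007820046, 0.007703826)
se   <- c(0.009709322, 0.010280510, 0.010686305, 0.010688994,
          0.010689733, 0.010694771, 0.010695511, 0.010696190)

z <- qnorm(0.975)                         # about 1.96, assumes Gaussian forecast errors
intervals <- data.frame(step     = 1:8,
                        lower    = pred - z * se,
                        forecast = pred,
                        upper    = pred + z * se)
print(round(intervals, 4))

# Mean reversion: the 8-step forecast is closer to the fitted mean than the
# 1-step forecast, and the standard errors rise but level off
print(abs(pred[8] - 0.0077) < abs(pred[1] - 0.0077))
print(round(diff(se), 6))
```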


require(quantmod)

getSymbols("UNRATE",src="FRED")

## [1] "UNRATE"

chartSeries(UNRATE,theme="white")

[Figure: chartSeries plot of UNRATE, 1948-01-01 to 2015-08-01; last value 5.1]

rate<- as.numeric(UNRATE[,1])
ts.plot(rate)

[Figure: time plot of rate against Time (0 to 800)]

acf(rate)

[Figure: sample ACF of series rate, lags 0 to 30]

zt=diff(rate)
acf(zt)

[Figure: sample ACF of series zt, lags 0 to 30]

pacf(zt)

[Figure: sample partial ACF of series zt, lags 1 to 30]

m1=ar(zt,lag.max=20,method="mle")
names(m1)

## [1] "order" "ar" "var.pred" "x.mean"

## [5] "aic" "n.used" "order.max" "partialacf"

## [9] "resid" "method" "series" "frequency"

## [13] "call" "asy.var.coef"

m1$order

## [1] 12

t.test(zt)

##

## One Sample t-test

##

## data: zt

## t = 0.28142, df = 810, p-value = 0.7785

## alternative hypothesis: true mean is not equal to 0

## 95 percent confidence interval:

## -0.01252466 0.01671701

## sample estimates:

## mean of x

## 0.002096178

m2=arima(rate,order=c(12,1,0))
m2

##

## Call:

## arima(x = rate, order = c(12, 1, 0))

##

## Coefficients:

## ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8

## 0.0161 0.2192 0.1466 0.0988 0.1329 0.0038 -0.0365 0.0157

## s.e. 0.0349 0.0349 0.0357 0.0362 0.0363 0.0366 0.0366 0.0364

## ar9 ar10 ar11 ar12

## 0.0035 -0.0872 0.0277 -0.1288

## s.e. 0.0363 0.0358 0.0351 0.0351

##

## sigma^2 estimated as 0.03732: log likelihood = 182.29, aic = -338.59

tsdiag(m2,gof=24)

[Figure: tsdiag output for m2 in three panels: standardized residuals against time; ACF of residuals; p values for the Ljung-Box statistic by lag]

c1=c(0,NA,NA,NA,NA,0,0,0,0,NA,0,NA)
m2a=arima(rate,order=c(12,1,0),fixed=c1)
m2a

##

## Call:

## arima(x = rate, order = c(12, 1, 0), fixed = c1)

##

## Coefficients:

## ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8 ar9 ar10

## 0 0.2159 0.1520 0.1010 0.1309 0 0 0 0 -0.0854

## s.e. 0 0.0346 0.0338 0.0341 0.0351 0 0 0 0 0.0345

## ar11 ar12

## 0 -0.1295

## s.e. 0 0.0340

##

## sigma^2 estimated as 0.03742: log likelihood = 181.21, aic = -348.43

predict(m2a,4)

## $pred

## Time Series:

## Start = 813

## End = 816

## Frequency = 1

## [1] 5.083977 5.076662 4.995143 5.013901

##

## $se

## Time Series:

## Start = 813

## End = 816

## Frequency = 1

## [1] 0.1934315 0.2735534 0.3607559 0.4473817

m3=arima(rate,order=c(2,1,1),seasonal=list(order=c(1,0,1),period=12))
m3

##

## Call:

## arima(x = rate, order = c(2, 1, 1), seasonal = list(order = c(1, 0, 1), period = 12))

##

## Coefficients:

## ar1 ar2 ma1 sar1 sma1

## 0.6024 0.2337 -0.6010 0.5496 -0.8170

## s.e. 0.0601 0.0382 0.0552 0.0669 0.0481

##

## sigma^2 estimated as 0.03572: log likelihood = 198.68, aic = -385.36

tsdiag(m3,gof=24)

[Figure: tsdiag output for m3 in three panels: standardized residuals against time; ACF of residuals; p values for the Ljung-Box statistic by lag]

predict(m3,4)

## $pred

## Time Series:

## Start = 813

## End = 816

## Frequency = 1

## [1] 5.131360 5.095855 5.035646 5.086119

##

## $se

## Time Series:

## Start = 813

## End = 816

## Frequency = 1

## [1] 0.1889930 0.2674675 0.3551090 0.4403252

source('backtest.R')
length(rate)

## [1] 812

backtest(m2a,rate,760,1)

## [1] "RMSE of out-of-sample forecasts"

## [1] 0.1457141

## [1] "Mean absolute error of out-of-sample forecasts"

## [1] 0.1192522

backtest(m3,rate,760,1)

## [1] "RMSE of out-of-sample forecasts"

## [1] 0.1305156

## [1] "Mean absolute error of out-of-sample forecasts"

## [1] 0.1089834
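The script backtest.R is not reproduced in these slides. As a rough illustration of the idea behind an out-of-sample comparison like the one above, here is a minimal rolling-origin sketch in base R. The function `rolling_backtest` is a hypothetical stand-in and is not the course's `backtest`; the simulated AR(1) series, the seed and the forecast origin are likewise assumptions.

```r
# rolling_backtest is a hypothetical helper, NOT the backtest() from backtest.R.
# It refits an ARIMA of the given order on y[1:t0], forecasts h steps ahead,
# then moves the forecast origin t0 forward by one observation.
rolling_backtest <- function(y, order, origin, h = 1) {
  n <- length(y)
  stopifnot(origin + h <= n)
  errors <- numeric(0)
  for (t0 in origin:(n - h)) {
    fit <- arima(y[1:t0], order = order)
    fc  <- as.numeric(predict(fit, n.ahead = h)$pred[h])
    errors <- c(errors, y[t0 + h] - fc)   # out-of-sample forecast error
  }
  c(RMSE = sqrt(mean(errors^2)), MAE = mean(abs(errors)))
}

set.seed(123)                             # arbitrary seed
# Simulated stand-in for a real series such as rate
y <- as.numeric(arima.sim(model = list(ar = 0.6), n = 300))
bt <- rolling_backtest(y, order = c(1, 0, 0), origin = 250, h = 1)
print(bt)
```

Running the same function for two candidate orders on the same series and origin gives RMSE and MAE values that can be compared, with the lower values preferred.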