Econ 7933: Financial Time Series: AR Models
Sheng-Kai Chang (NTU)
Fall, 2015
AR(1) model:
Form: $r_t = \phi_0 + \phi_1 r_{t-1} + a_t$, where $\phi_0$ and $\phi_1$ are real numbers (parameters to be estimated from the data in an application).
Stationarity: the necessary and sufficient condition is $|\phi_1| < 1$.
Mean: $E(r_t) = \frac{\phi_0}{1-\phi_1}$.
Alternative AR(1) representation: let $\mu = E(r_t) = \frac{\phi_0}{1-\phi_1}$, thus $\phi_0 = \mu(1-\phi_1)$. Plugging this into the model, we have $(r_t - \mu) = \phi_1 (r_{t-1} - \mu) + a_t$.
Variance: $\mathrm{Var}(r_t) = \frac{\sigma_a^2}{1-\phi_1^2}$.
Autocorrelations: $\rho_1 = \phi_1$, $\rho_2 = \phi_1^2$, etc. In general, $\rho_k = \phi_1^k$; thus the ACF $\rho_k$ decays exponentially as $k$ increases.
Forecast: Suppose the forecast origin is $t$. Let $x_t = r_t - \mu$; then the model becomes $x_t = \phi_1 x_{t-1} + a_t$. Note that the forecast of $r_t$ can be obtained as the forecast of $x_t$ plus $\mu$.
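The AR(1) formulas above can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values (phi0 = 0.5, phi1 = 0.8, sigma_a^2 = 1) that do not come from any data set in these notes; the helper name ar1_moments is hypothetical.

```r
# Theoretical moments of a stationary AR(1): r_t = phi0 + phi1 * r_{t-1} + a_t.
# phi0, phi1 and sigma2_a are illustrative values, not estimates from data.
ar1_moments <- function(phi0, phi1, sigma2_a, max_lag = 5) {
  stopifnot(abs(phi1) < 1)                  # stationarity condition
  list(
    mean     = phi0 / (1 - phi1),           # E(r_t)
    variance = sigma2_a / (1 - phi1^2),     # Var(r_t)
    acf      = phi1^(0:max_lag)             # rho_k = phi1^k for k = 0, ..., max_lag
  )
}

theory <- ar1_moments(phi0 = 0.5, phi1 = 0.8, sigma2_a = 1)

# Compare with a long simulated path; sample values only approximate the theory.
set.seed(1)
r <- as.numeric(theory$mean + arima.sim(model = list(ar = 0.8), n = 5000))
c(theory = theory$mean, sample = mean(r))
c(theory = theory$variance, sample = var(r))
rbind(theory = theory$acf,
      sample = acf(r, lag.max = 5, plot = FALSE)$acf[, 1, 1])
```

The simulated path is built as the mean plus a zero-mean AR(1), which is the mean-adjusted representation given above.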
Forecast of AR(1) models
1-step ahead forecast at time t: $\hat{x}_t(1) = \phi_1 x_t$.
1-step ahead forecast error: $e_t(1) = x_{t+1} - \hat{x}_t(1) = a_{t+1}$. Here $a_{t+1}$ is the unpredictable part of $x_{t+1}$; it is the shock at time $t+1$.
Variance of 1-step ahead forecast error: $\mathrm{Var}[e_t(1)] = \mathrm{Var}(a_{t+1}) = \sigma_a^2$.
2-step ahead forecast at time t: $\hat{x}_t(2) = \phi_1 \hat{x}_t(1) = \phi_1^2 x_t$.
2-step ahead forecast error: $e_t(2) = x_{t+2} - \hat{x}_t(2) = a_{t+2} + \phi_1 a_{t+1}$.
Variance of 2-step ahead forecast error: $\mathrm{Var}[e_t(2)] = (1 + \phi_1^2)\sigma_a^2$. $\mathrm{Var}[e_t(2)]$ is greater than $\mathrm{Var}[e_t(1)]$, indicating that uncertainty in forecasts increases as the number of steps increases.
Multi-step ahead forecast
l-step ahead forecast at time t: $\hat{x}_t(l) = \phi_1^l x_t$.
l-step ahead forecast error: $e_t(l) = a_{t+l} + \phi_1 a_{t+l-1} + \cdots + \phi_1^{l-1} a_{t+1}$.
Variance of l-step ahead forecast error: $\mathrm{Var}[e_t(l)] = (1 + \phi_1^2 + \cdots + \phi_1^{2(l-1)})\sigma_a^2$.
Mean reversion of the AR(1) process: as $l \to \infty$, $\hat{x}_t(l) \to 0$ and $\hat{r}_t(l) \to \mu$.
As $l \to \infty$, the variance of the forecast error approaches the variance of the series: $\mathrm{Var}[e_t(l)] \to \mathrm{Var}(r_t) = \frac{\sigma_a^2}{1-\phi_1^2}$.
Therefore, for long-term forecasts, serial dependence is not important: the forecast is just the sample mean, and the forecast uncertainty is the uncertainty about the series itself.
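The multi-step formulas can be tabulated directly. The sketch below assumes illustrative inputs (last deviation x_t = 2, phi1 = 0.8, sigma_a^2 = 1, mu = 5); the function name ar1_forecast is hypothetical and is not part of any package used in these notes.

```r
# l-step ahead forecasts and forecast-error standard errors for an AR(1).
# x_t is the last observed deviation from the mean; all inputs are illustrative.
ar1_forecast <- function(x_t, phi1, sigma2_a, mu = 0, max_l = 10) {
  l <- 1:max_l
  pred <- mu + phi1^l * x_t                          # r_hat_t(l) = mu + phi1^l * x_t
  err_var <- sigma2_a * cumsum(phi1^(2 * (l - 1)))   # (1 + phi1^2 + ... + phi1^(2(l-1))) * sigma_a^2
  data.frame(l = l, pred = pred, se = sqrt(err_var))
}

fc <- ar1_forecast(x_t = 2, phi1 = 0.8, sigma2_a = 1, mu = 5, max_l = 50)
head(fc, 3)              # forecasts shrink toward mu, standard errors grow
tail(fc, 1)              # at long horizons: pred is close to mu
sqrt(1 / (1 - 0.8^2))    # limiting standard error, sqrt(Var(r_t))
```

At horizon 50 the forecast is essentially the mean and the standard error is essentially the unconditional standard deviation, which is the mean-reversion statement above.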
Alternative AR(1) form: $(1 - \phi_1 L) r_t = \phi_0 + a_t$, where $L$ is the lag operator ($L r_t = r_{t-1}$).
Half-life: a common way to quantify the speed of mean reversion is the half-life, defined as the number of periods needed for the magnitude of the forecast to fall to half of that at the forecast origin.
For an AR(1) model, the half-life $k$ satisfies $\hat{x}_t(k) = \phi_1^k x_t = \frac{1}{2} x_t$ in magnitude. Thus, $k = \frac{\ln(0.5)}{\ln(|\phi_1|)}$.
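As a quick illustration of the half-life formula, here is a small sketch; the function name half_life and the two coefficient values are chosen only for illustration.

```r
# Half-life of a stationary AR(1): number of periods k with |phi1|^k = 0.5.
half_life <- function(phi1) {
  stopifnot(abs(phi1) < 1, phi1 != 0)
  log(0.5) / log(abs(phi1))
}

half_life(0.5)   # 1: the forecast deviation halves every period
half_life(0.9)   # about 6.58 periods: slower mean reversion
```

The closer |phi1| is to one, the longer the half-life, so persistence and slow mean reversion go together.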
AR(2) model: $r_t = \phi_0 + \phi_1 r_{t-1} + \phi_2 r_{t-2} + a_t$ or $(1 - \phi_1 L - \phi_2 L^2) r_t = \phi_0 + a_t$.
Characteristic equation: $1 - \phi_1 x - \phi_2 x^2 = 0$.
Characteristic roots: inverses of the solutions of characteristic equation.
Stationarity: the absolute values of all characteristic roots are less than one.
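The stationarity condition can be checked numerically with polyroot(). This is a minimal sketch; the helper names ar_char_roots and is_stationary are hypothetical, and the third coefficient vector is an illustrative non-stationary case not taken from these notes.

```r
# polyroot() solves 1 - phi_1 x - ... - phi_p x^p = 0; the characteristic
# roots are the inverses of these solutions.
ar_char_roots <- function(phi) 1 / polyroot(c(1, -phi))
is_stationary <- function(phi) all(Mod(ar_char_roots(phi)) < 1)

is_stationary(c(1.3, -0.4))   # TRUE: solutions 1.25 and 2, characteristic roots 0.8 and 0.5
is_stationary(c(0.8, -0.7))   # TRUE: complex pair with modulus sqrt(0.7)
is_stationary(c(0.5, 0.6))    # FALSE: one characteristic root exceeds 1 in modulus
```

Equivalently, one can require that all solutions of the characteristic equation lie outside the unit circle, i.e. all(Mod(polyroot(c(1, -phi))) > 1).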
Mean: $\mu = E(r_t) = \frac{\phi_0}{1-\phi_1-\phi_2}$.
Mean-adjusted format of AR(2) model: using $\phi_0 = \mu - \phi_1\mu - \phi_2\mu$, the AR(2) model can be written as
$(r_t - \mu) = \phi_1 (r_{t-1} - \mu) + \phi_2 (r_{t-2} - \mu) + a_t$.
Mean-adjusted format is used to highlight the mean-reverting property of a stationary AR model.
ACF: $\rho_0 = 1$, $\rho_1 = \frac{\phi_1}{1-\phi_2}$, $\rho_l = \phi_1 \rho_{l-1} + \phi_2 \rho_{l-2}$ for $l \ge 2$.
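The ACF recursion is easy to iterate. The sketch below uses a hypothetical helper ar2_acf and cross-checks it against ARMAacf() from the stats package, using the coefficients (0.8, -0.7) that also appear in the simulation examples of these notes.

```r
# ACF of a stationary AR(2) from the recursion rho_l = phi1*rho_{l-1} + phi2*rho_{l-2}.
ar2_acf <- function(phi1, phi2, max_lag = 10) {
  stopifnot(max_lag >= 2)
  rho <- numeric(max_lag + 1)          # rho[k + 1] stores rho_k
  rho[1] <- 1                          # rho_0
  rho[2] <- phi1 / (1 - phi2)          # rho_1
  for (l in 2:max_lag) {
    rho[l + 1] <- phi1 * rho[l] + phi2 * rho[l - 1]
  }
  rho
}

rho <- ar2_acf(0.8, -0.7)
round(rho, 4)
# Cross-check against the theoretical ACF computed by stats::ARMAacf().
all.equal(rho, unname(ARMAacf(ar = c(0.8, -0.7), lag.max = 10)))
```

With these coefficients the ACF alternates in sign while decaying, the damped sine-cosine pattern associated with complex characteristic roots.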
Stochastic business cycle: if $\phi_1^2 + 4\phi_2 < 0$, then $r_t$ shows characteristics of business cycles with average length $k = \frac{2\pi}{\cos^{-1}(\phi_1/(2\sqrt{-\phi_2}))}$, where the inverse cosine is stated in radians.
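If we denote the characteristic roots as $a \pm bi$, where $i = \sqrt{-1}$, then we have $\phi_1 = 2a$ and $\phi_2 = -(a^2+b^2)$ so that $k = \frac{2\pi}{\cos^{-1}(a/\sqrt{a^2+b^2})}$. The ratio $a/\sqrt{a^2+b^2}$ is unchanged if $a \pm bi$ are instead the solutions of the characteristic equation, since inversion only rescales the modulus and flips the sign of the angle, so either pair gives the same $k$.
$\sqrt{a^2+b^2}$ can be obtained using the command Mod in R.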
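The two expressions for the average cycle length can be compared numerically. This sketch assumes the AR(2) coefficients (0.8, -0.7) and (1.3, -0.4) used in the simulation examples of these notes; the function name ar2_cycle_length is hypothetical.

```r
# Average length of stochastic cycles for an AR(2) with complex roots.
ar2_cycle_length <- function(phi1, phi2) {
  if (phi1^2 + 4 * phi2 >= 0) return(NA_real_)   # real roots: no stochastic cycle
  2 * pi / acos(phi1 / (2 * sqrt(-phi2)))
}

k_formula <- ar2_cycle_length(0.8, -0.7)

# Same quantity from a complex solution a + bi of 1 - phi1*x - phi2*x^2 = 0.
z <- polyroot(c(1, -0.8, 0.7))
k_roots <- 2 * pi / acos(Re(z[1]) / Mod(z[1]))

c(k_formula, k_roots)          # the two values agree
ar2_cycle_length(1.3, -0.4)    # NA: phi1^2 + 4*phi2 > 0, so the roots are real
```

Either member of the conjugate pair can be used in the second computation, because both share the same real part and modulus.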
y1=arima.sim(model=list(ar=c(1.3, -0.4)),1000)
acf(y1,lag=20) # specify the number of ACF to compute
[Figure: sample ACF of series y1, lags 0 to 20]
pacf(y1,lag=20)
[Figure: sample PACF of series y1, lags 1 to 20]
y2=arima.sim(model=list(ar=c(0.8, -0.7)),1000)
acf(y2,lag=20) # specify the number of ACF to compute
[Figure: sample ACF of series y2, lags 0 to 20]
pacf(y2,lag=20)
[Figure: sample PACF of series y2, lags 1 to 20]
Order specification of AR model
Partial ACF: the PACF of a stationary time series is a function of its ACF and is a useful tool for determining the order p of an AR model.
For a stationary Gaussian AR(p) model, it can be shown that the sample PACF has the following properties:
- $\hat{\phi}_{p,p}$ converges to $\phi_p$ as the sample size $T$ goes to infinity.
- $\hat{\phi}_{l,l}$ converges to zero for all $l > p$.
- The asymptotic variance of $\hat{\phi}_{l,l}$ is $1/T$ for $l > p$.
PACF cuts off at lag p for an AR(p) model.
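The cut-off property can be seen in both the theoretical and the sample PACF. The sketch below assumes the AR(2) coefficients (0.8, -0.7) and an arbitrary seed and sample size; it uses ARMAacf(..., pacf = TRUE) and pacf() from the stats package.

```r
# Theoretical PACF of an AR(2) with phi = (0.8, -0.7).
phi <- c(0.8, -0.7)
theo_pacf <- ARMAacf(ar = phi, lag.max = 8, pacf = TRUE)
round(theo_pacf, 4)        # lag 2 equals phi_2 = -0.7; lags 3 to 8 are zero

# Sample PACF of a simulated path, with the approximate 95% band +/- 2/sqrt(T)
# implied by the asymptotic variance 1/T for lags l > p.
set.seed(123)
T_len <- 1000
y <- arima.sim(model = list(ar = phi), n = T_len)
samp_pacf <- pacf(y, lag.max = 8, plot = FALSE)$acf[, 1, 1]
data.frame(lag = 1:8,
           sample = round(samp_pacf, 3),
           outside_band = abs(samp_pacf) > 2 / sqrt(T_len))
```

In a given sample an occasional lag beyond p can fall outside the band purely by chance, so the band is a guide rather than a strict rule.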
Akaike information criterion: $\mathrm{AIC}(l) = \ln(\tilde{\sigma}_l^2) + \frac{2l}{T}$ for an AR(l) model, where $\tilde{\sigma}_l^2$ is the MLE of the residual variance.
Find the AR order with minimum AIC for $l \in \{0, \ldots, P\}$.
BIC criterion: $\mathrm{BIC}(l) = \ln(\tilde{\sigma}_l^2) + \frac{l \ln(T)}{T}$.
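The two criteria can be computed by fitting AR(l) models for l = 0, ..., P. This is a rough sketch on simulated data, assuming an arbitrary seed, sample size 500 and maximum order P = 6; it uses the normalized formulas stated above rather than the aic value reported by arima(), which is on a different scale.

```r
set.seed(42)
y <- arima.sim(model = list(ar = c(0.8, -0.7)), n = 500)   # simulated AR(2) data
T_len <- length(y)
P <- 6                                     # largest order considered (illustrative)

crit <- t(sapply(0:P, function(l) {
  fit <- arima(y, order = c(l, 0, 0), method = "ML")
  s2 <- fit$sigma2                         # MLE of the residual variance
  c(order = l,
    AIC = log(s2) + 2 * l / T_len,
    BIC = log(s2) + l * log(T_len) / T_len)
}))
crit
crit[which.min(crit[, "AIC"]), "order"]    # order chosen by AIC
crit[which.min(crit[, "BIC"]), "order"]    # order chosen by BIC
```

Because ln(T) > 2 for this sample size, BIC penalizes each extra lag more heavily than AIC and never selects a larger order than AIC does.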
We can check the sample mean to determine whether or not we need a constant term.
Model checking: residuals are the 1-step ahead forecast errors at each time point (observations minus fitted values). If the model is adequate, the residuals should be close to white noise. Use the Ljung-Box statistic Q(m) of the residuals, but with m − g degrees of freedom, where m is the number of lags used in the test and g is the number of AR coefficients used in the model.
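In R, the degrees-of-freedom adjustment can be requested through the fitdf argument of Box.test(). The sketch below runs on simulated AR(3) data with an arbitrary seed; the coefficient values are illustrative and the data are not the GNP series analyzed in these notes.

```r
set.seed(7)
y <- arima.sim(model = list(ar = c(0.348, 0.1793, -0.1423)), n = 400)  # simulated AR(3)
fit <- arima(y, order = c(3, 0, 0))

m <- 10                                   # number of lags in the test
g <- 3                                    # number of AR coefficients estimated
lb <- Box.test(residuals(fit), lag = m, type = "Ljung-Box", fitdf = g)
lb$parameter                              # degrees of freedom: m - g = 7
lb$p.value                                # a large p-value is consistent with white-noise residuals
```

Without fitdf, Box.test() reports df = m, which makes the test slightly too lenient when it is applied to residuals of a fitted model.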
#setwd("D:/R/FTS/Data")
library(downloader)
library(fBasics)
da=read.table("dgnp82.txt")
x=da[,1]
par(mfcol=c(2,1))
plot(x,type='l')
#plot(x[1:175],x[2:176])
#plot(x[1:174],x[3:176])
acf(x,lag=12)
[Figure: time plot of x against Index (top) and sample ACF of series x, lags 0 to 12 (bottom)]
pacf(x,lag.max=12)
[Figure: sample PACF of series x, lags 1 to 12]
Box.test(x,lag=10,type='Ljung')
##
## Box-Ljung test
##
## data: x
## X-squared = 43.234, df = 10, p-value = 4.515e-06
m1=ar(x, method='mle')
m1
##
## Call:
## ar(x = x, method = "mle")
##
## Coefficients:
## 1 2 3
## 0.3480 0.1793 -0.1423
##
## Order selected 3 sigma^2 estimated as 9.427e-05
Box.test(m1$resid,lag=10,type='Ljung')
##
## Box-Ljung test
##
## data: m1$resid
## X-squared = 7.0808, df = 10, p-value = 0.7178
m2=arima(x,order=c(3,0,0))
m2
##
## Call:
## arima(x = x, order = c(3, 0, 0))
##
## Coefficients:
## ar1 ar2 ar3 intercept
## 0.3480 0.1793 -0.1423 0.0077
## s.e. 0.0745 0.0778 0.0745 0.0012
##
## sigma^2 estimated as 9.427e-05: log likelihood = 565.84, aic = -1121.68
Box.test(m2$residuals,lag=10,type='Ljung')
##
## Box-Ljung test
##
## data: m2$residuals
## X-squared = 7.0169, df = 10, p-value = 0.7239
tsdiag(m2)
[Figure: tsdiag(m2) output with three panels: Standardized Residuals against Time, ACF of Residuals (lags 0 to 20), and p values for Ljung-Box statistic (lags 1 to 10)]
p1=c(1,-m2$coef[1:3])
roots=polyroot(p1)
roots
## [1] 1.590253+1.063882i -1.920152+0.000000i 1.590253-1.063882i
Mod(roots)
## [1] 1.913308 1.920152 1.913308
k=2*pi/acos(1.590253/1.913308)
k
## [1] 10.65638
predict(m2,8)
## $pred
## Time Series:
## Start = 177
## End = 184
## Frequency = 1
## [1] 0.001236254 0.004555519 0.007454906 0.007958518 0.008181442 0.007936845
## [7] 0.007820046 0.007703826
##
## $se
## Time Series:
## Start = 177
## End = 184
## Frequency = 1
## [1] 0.009709322 0.010280510 0.010686305 0.010688994 0.010689733 0.010694771
## [7] 0.010695511 0.010696190
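The point forecasts and standard errors printed above can be turned into interval forecasts. The sketch below copies the numbers from that output and assumes Gaussian shocks for the 95% level; no new estimation is performed.

```r
# Point forecasts and standard errors copied from the predict(m2, 8) output.
pred <- c(0.001236254, 0.004555519, 0.007454906, 0.007958518,
          0.008181442, 0.007936845, 0.007820046, 0.007703826)
se   <- c(0.009709322, 0.010280510, 0.010686305, 0.010688994,
          0.010689733, 0.010694771, 0.010695511, 0.010696190)

# Approximate 95% interval forecasts, assuming Gaussian shocks.
z <- qnorm(0.975)
data.frame(step = 1:8, lower = pred - z * se, pred = pred, upper = pred + z * se)

# The 1-step standard error is the square root of the estimated sigma^2.
sqrt(9.427e-05)
```

The forecasts settle near the estimated intercept 0.0077 and the standard errors level off after a few steps, which is the mean-reversion behavior described for stationary AR models.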
require(quantmod)
getSymbols("UNRATE",src="FRED")
## [1] "UNRATE"
chartSeries(UNRATE,theme="white")
[Figure: chartSeries plot of UNRATE, 1948-01-01 to 2015-08-01; last value 5.1]
rate<- as.numeric(UNRATE[,1])
ts.plot(rate)
[Figure: time plot of rate against Time]
acf(rate)
[Figure: sample ACF of series rate, lags 0 to 30]
zt=diff(rate)
acf(zt)
[Figure: sample ACF of series zt, lags 0 to 30]
pacf(zt)
[Figure: sample PACF of series zt, lags up to 30]
m1=ar(zt,lag.max=20,method="mle")
names(m1)
## [1] "order" "ar" "var.pred" "x.mean"
## [5] "aic" "n.used" "order.max" "partialacf"
## [9] "resid" "method" "series" "frequency"
## [13] "call" "asy.var.coef"
m1$order
## [1] 12
t.test(zt)
##
## One Sample t-test
##
## data: zt
## t = 0.28142, df = 810, p-value = 0.7785
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.01252466 0.01671701
## sample estimates:
## mean of x
## 0.002096178
m2=arima(rate,order=c(12,1,0))
m2
##
## Call:
## arima(x = rate, order = c(12, 1, 0))
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8
## 0.0161 0.2192 0.1466 0.0988 0.1329 0.0038 -0.0365 0.0157
## s.e. 0.0349 0.0349 0.0357 0.0362 0.0363 0.0366 0.0366 0.0364
## ar9 ar10 ar11 ar12
## 0.0035 -0.0872 0.0277 -0.1288
## s.e. 0.0363 0.0358 0.0351 0.0351
##
## sigma^2 estimated as 0.03732: log likelihood = 182.29, aic = -338.59
tsdiag(m2,gof=24)
[Figure: tsdiag(m2,gof=24) output with three panels: Standardized Residuals against Time, ACF of Residuals (lags 0 to 30), and p values for Ljung-Box statistic (lags 1 to 24)]
c1=c(0,NA,NA,NA,NA,0,0,0,0,NA,0,NA)
m2a=arima(rate,order=c(12,1,0),fixed=c1)
m2a
##
## Call:
## arima(x = rate, order = c(12, 1, 0), fixed = c1)
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8 ar9 ar10
## 0 0.2159 0.1520 0.1010 0.1309 0 0 0 0 -0.0854
## s.e. 0 0.0346 0.0338 0.0341 0.0351 0 0 0 0 0.0345
## ar11 ar12
## 0 -0.1295
## s.e. 0 0.0340
##
## sigma^2 estimated as 0.03742: log likelihood = 181.21, aic = -348.43
predict(m2a,4)
## $pred
## Time Series:
## Start = 813
## End = 816
## Frequency = 1
## [1] 5.083977 5.076662 4.995143 5.013901
##
## $se
## Time Series:
## Start = 813
## End = 816
## Frequency = 1
## [1] 0.1934315 0.2735534 0.3607559 0.4473817
m3=arima(rate,order=c(2,1,1),seasonal=list(order=c(1,0,1),period=12))
m3
##
## Call:
## arima(x = rate, order = c(2, 1, 1), seasonal = list(order = c(1, 0, 1), period = 12))
##
## Coefficients:
## ar1 ar2 ma1 sar1 sma1
## 0.6024 0.2337 -0.6010 0.5496 -0.8170
## s.e. 0.0601 0.0382 0.0552 0.0669 0.0481
##
## sigma^2 estimated as 0.03572: log likelihood = 198.68, aic = -385.36
tsdiag(m3,gof=24)
[Figure: tsdiag(m3,gof=24) output with three panels: Standardized Residuals against Time, ACF of Residuals (lags 0 to 30), and p values for Ljung-Box statistic (lags 1 to 24)]
predict(m3,4)
## $pred
## Time Series:
## Start = 813
## End = 816
## Frequency = 1
## [1] 5.131360 5.095855 5.035646 5.086119
##
## $se
## Time Series:
## Start = 813
## End = 816
## Frequency = 1
## [1] 0.1889930 0.2674675 0.3551090 0.4403252
source('backtest.R')
length(rate)
## [1] 812
backtest(m2a,rate,760,1)
## [1] "RMSE of out-of-sample forecasts"
## [1] 0.1457141
## [1] "Mean absolute error of out-of-sample forecasts"
## [1] 0.1192522
backtest(m3,rate,760,1)
## [1] "RMSE of out-of-sample forecasts"
## [1] 0.1305156
## [1] "Mean absolute error of out-of-sample forecasts"
## [1] 0.1089834
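The backtest.R script is not reproduced in these notes. The following is a rough sketch of what a rolling-origin backtest can look like, assuming 1-step ahead forecasts and a plain ARIMA order; the helper rolling_backtest is hypothetical and is not the function sourced above, and the data are simulated with an arbitrary seed.

```r
# Hypothetical helper (not the backtest.R script): refit an ARIMA model of a
# given order at each forecast origin and collect 1-step ahead forecast errors.
rolling_backtest <- function(y, order, origin) {
  n <- length(y)
  stopifnot(origin < n)
  err <- numeric(n - origin)
  for (t0 in origin:(n - 1)) {
    fit <- arima(y[1:t0], order = order, method = "ML")   # use data up to t0 only
    fc <- predict(fit, n.ahead = 1)$pred[1]               # forecast of y[t0 + 1]
    err[t0 - origin + 1] <- y[t0 + 1] - fc
  }
  c(RMSE = sqrt(mean(err^2)), MAE = mean(abs(err)))
}

set.seed(2015)
y <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.2)), n = 300))   # simulated data
res_ar2  <- rolling_backtest(y, order = c(2, 0, 0), origin = 280)
res_mean <- rolling_backtest(y, order = c(0, 0, 0), origin = 280)     # mean-only benchmark
rbind(AR2 = res_ar2, mean_only = res_mean)
```

As in the comparison of m2a and m3 above, the model with the smaller out-of-sample RMSE and MAE is preferred on forecasting grounds; with few forecast origins the ranking can be noisy.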