
Chapter 2: On the Forecasting Performance of HAR and MIDAS

2.4 Evaluation measures

Patton (2006) showed that the mean square error (MSE) loss function is robust with respect to the volatility proxy used. In this chapter, we use the MSE as the comparison criterion. For the realized return-based volatility, let RV_{t,t+H} denote the true value of the RV over the H days, and let \widehat{RV}_{t,t+H} denote the predicted value of the dependent variable. For the realized range-based volatility, RV is replaced by RRV. Then the MSE is given by

\[ MSE = \frac{1}{N} \sum_{t=1}^{N} \left( RV_{t,t+H} - \widehat{RV}_{t,t+H} \right)^2, \]

where N is the number of forecasts. As noted by Forsberg and Ghysels (2007), in order to compare the MSE across the regressions, we undo the transformation; i.e., for the models in standard deviation and log form, the forecasts are mapped back to the variance scale before the MSE of the RV is computed.

Using these measures, we can compare the models with different transformations of the dependent variable.
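The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `mse` and `mse_undo_transform` are hypothetical, the back-transformations (squaring a standard-deviation forecast, exponentiating a log forecast) are the simplest way to undo the transformation and apply no bias correction, and the numbers are toy values rather than the chapter's data.

```python
import math


def mse(actual, predicted):
    """Mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mse_undo_transform(actual_rv, predicted, form="level"):
    """MSE on the variance scale after mapping forecasts back from `form`.

    form = "level": forecasts are already variances.
    form = "sqrt":  forecasts are standard deviations, so they are squared.
    form = "log":   forecasts are log variances, so they are exponentiated.
    (Assumed back-transformations; no bias correction is applied.)
    """
    if form == "level":
        back = list(predicted)
    elif form == "sqrt":
        back = [p ** 2 for p in predicted]
    elif form == "log":
        back = [math.exp(p) for p in predicted]
    else:
        raise ValueError("unknown form: " + form)
    return mse(actual_rv, back)


# Toy values: a perfect standard-deviation forecast of RV = 1 and RV = 4.
rv = [1.0, 4.0]
print(mse_undo_transform(rv, [1.0, 2.0], form="sqrt"))
```

Because every model is scored against the same untransformed RV (or RRV), the resulting MSE values are on a common scale regardless of the form in which the regression was estimated.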

Moreover, we compare the HAR and MIDAS models. Because the dependent variables are not identical, we compare their relative decreasing ratios of MSE.

3 Data and empirical results

3.1 Data descriptions

This study employs 5-minute intraday data for the S&P 500 index. The intraday data are obtained from Tick Data Inc. and cover the period from January 1, 1995 to March 31, 2005, consisting of 2535 days with 78 intraday 5-minute observations per day. Table 1 shows the descriptive statistics of the data. Panel A reports the descriptive statistics of the realized return-based variation, and Panel B reports those of the realized range-based variation. LB_10 reports the Ljung-Box test statistic for up to tenth-order serial correlation. RV_t denotes the realized variance and RRV_t the realized range-based variance; C_t is the continuous part and J_t is the jump part of RV_t (or RRV_t), as separated by the bipower jump test of Barndorff-Nielsen and Shephard (2004a). For the bipower jump test, a significance level of α = 0.999 is used, and the critical value of LB_10 is 18.3070. BPV_t (RBV_t) denotes the realized bipower (realized range bipower) variation.
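For readers who want to reproduce a statistic like LB_10, the textbook Ljung-Box formula Q = n(n+2) Σ_{k=1}^{m} ρ_k² / (n − k) can be sketched as follows. The function name `ljung_box` and the toy series are illustrative assumptions, not taken from the chapter; the quoted critical value 18.3070 coincides with the 5% critical value of a chi-square distribution with 10 degrees of freedom.

```python
def ljung_box(x, lags=10):
    """Ljung-Box Q statistic for serial correlation up to `lags`."""
    n = len(x)
    if n <= lags:
        raise ValueError("need more observations than lags")
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        raise ValueError("series is constant")
    q = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


# Toy series with strong (negative) first-order autocorrelation.
series = [1.0, -1.0] * 10
print(ljung_box(series, lags=10) > 18.3070)
```

A value of Q above the critical value rejects the null hypothesis of no serial correlation up to the chosen lag.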

Table 1 shows that the realized range-based variation is a more efficient volatility estimator than the realized return-based variation, because the standard deviation of RRV_t is smaller than that of RV_t. Note, however, that the means of the realized range-based variations are smaller than those of the realized variations; i.e., using the realized range-based variations to estimate the latent volatility induces a downward bias. This is because the price path is not observed continuously: the observed minimum overestimates the true minimum, and the observed maximum underestimates the true maximum. According to the LB statistics, RV_t (RRV_t) exhibits a high degree of serial correlation under all the transformations, which indicates that these volatility measures are highly persistent.

3.2 In-sample empirical results

In Table 2, we examine the in-sample fit of the HAR regressions of the realized variance (HAR-RV) and the realized range-based variance (HAR-RRV), using the S&P 500 cash index data set. Panels B and C report the standard deviation and log transformations of the variances, respectively. We focus on five prediction horizons (one day and one, two, three, and four weeks), corresponding to RV_{t,t+H} and RRV_{t,t+H} for H = 1, 5, 10, 15, and 20, respectively. For both the HAR-RV and the HAR-RRV regressions, comparing across prediction horizons, the MSE is lowest at the two- or three-week horizon, indicating that predictions at these horizons are more precise than at the others. Furthermore, the relative decreasing ratios of the MSE of the HAR-RRV regressions are almost always larger than those of the HAR-RV regressions. That is to say, predictions of the latent volatility proxy based on the realized range-based volatility are more precise than those based on the realized volatility.

In Table 3, we examine the in-sample fit of the MIDAS regressions of the realized variance (MIDAS-RV) and the realized range-based variance (MIDAS-RRV), using the S&P 500 cash index data set. The results are similar to those above.

For both the MIDAS-RV and the MIDAS-RRV regressions, the predictions are more accurate at the two- and three-week horizons. The relative decreasing ratios of the MSE of the MIDAS-RRV regressions are almost always larger than those of the MIDAS-RV regressions; i.e., predictions of the latent volatility based on the realized range-based volatility are more precise than those based on the realized volatility.

3.3 Conditional HAR models

If there is a jump at time t, the jump would imply a different dynamic of the continuous part of the price process. Table 4 reports the F-statistics and p-values of the corresponding Chow test. The realized range-based regressions are significant at the one-day and one-week horizons; i.e., the price process is not invariant to jumps in short-run volatility forecasting, whereas the realized return-based regressions are almost invariant to jumps. The results are similar when we model the realized return-based and range-based variances in the standard deviation and the log form, as reported in Table 4.

The results in Table 5 are similar to those above. For both the conditional HAR-RV and the conditional HAR-RRV regressions, the predictions are more accurate at the two- and three-week horizons. In addition, the relative decreasing ratios of the MSE of the HAR-RRV regressions are almost always larger than those of the HAR-RV regressions; i.e., in the conditional HAR regressions, predictions of the latent volatility based on the realized range-based volatility are more precise than those based on the realized volatility.

3.4 Out-of-sample empirical results

Tables 6 and 7 report the out-of-sample results for the HAR and MIDAS regressions of the realized volatility and the realized range-based volatility. We split the data into two parts: an in-sample part used to estimate the two models and an out-of-sample part used for forecasting. In this study, we use a rolling window analysis to produce the out-of-sample forecasts. The rolling window's width is 2400 observations, and the window is rolled through the sample one observation at a time; at the one-day horizon, there are 135 rolling estimates for each parameter. Similarly, there are 131 (one week), 126 (two weeks), 121 (three weeks), and 116 (four weeks) rolling estimates for each parameter. The first in-sample period covers January 1, 1995 to September 16, 2004, a total of 2400 days. For both the HAR and the MIDAS regressions, the MSEs of the RRV regressions are mostly smaller than those of the RV regressions, and the results agree with those above.
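The rolling-estimate counts quoted above are consistent with a simple rule: with T observations, a window of width W, and a horizon of H days, the number of windows whose H-day target still lies inside the sample is T − W − (H − 1). The sketch below checks this against the reported figures; the rule is inferred from those counts rather than stated in the text, and the function name `n_rolling_forecasts` is hypothetical.

```python
def n_rolling_forecasts(n_obs, window, horizon):
    """Number of rolling windows whose H-day-ahead target lies in the sample.

    Inferred rule: n_obs - window - (horizon - 1), floored at zero.
    """
    return max(n_obs - window - (horizon - 1), 0)


# Sample size and window width as reported in the text.
for h in (1, 5, 10, 15, 20):
    print(h, n_rolling_forecasts(2535, 2400, h))
```

With 2535 days and a 2400-day window, this gives 135, 131, 126, 121, and 116 estimates for H = 1, 5, 10, 15, and 20.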

4. Conclusions

In this chapter, we employ the mixed data sampling (MIDAS) regression and the heterogeneous autoregressive (HAR) regression, which are able to reproduce the memory persistence observed in the data and are easy to estimate; the realized volatility and the realized range-based volatility measures are both highly persistent. Using the MIDAS and HAR models with the realized volatility and the realized range-based volatility as dependent variables, we look for the most accurate way of forecasting volatility. The empirical results show that the realized range-based variance is more efficient: in the in-sample forecasting, the relative decreasing ratios of MSE of the HAR-RRV (MIDAS-RRV) regressions are almost always larger than those of the HAR-RV (MIDAS-RV) regressions, and in the out-of-sample forecasting, the MSEs of the HAR-RRV and MIDAS-RRV regressions are small. The regressors consisting of the continuous sample path and jump variability measures (CJ) in the HAR and MIDAS regressions predict the future realized range volatilities and dominate in almost all mean square error (MSE) terms. Furthermore, the realized range-based regressions are significantly affected by jumps in short-run volatility forecasting, whereas the realized return-based regressions are almost invariant to jumps. Hence, in our empirical results, when the HAR and MIDAS regressions are used to predict latent volatility under the different variations, the realized range-based variance is a good volatility proxy.

Table 2.1 Descriptive statistics of S&P 500, 1995/01/01 ~ 2005/03/31

Panel A
RV_t^{1/2}    0.8336      0.4295      2.0464      13.1757     0.1665      5.6917      10051.2788
C_t^{1/2}     0.7980      0.4084      1.6319      7.8198      0.1665      3.4973      10702.3322
J_t           0.0931      0.2589      5.3464      62.9823     0.0000      4.9840      28.8587
BPV_t^{1/2}   0.7654      0.4008      1.7462      8.6065      0.1678      3.5736      10980.4914
lnRV_t        -0.5945     0.9596      -0.0090     3.0608      -3.5861     3.4780      12603.5280

Panel B
RRV_t         0.7589      1.0791      14.6190     422.6160    0.0365      35.2861     3476.9151
C_t           0.6740      0.8816      8.7098      167.0704    0.0365      22.5290     5505.4818
J_t^{1/2}     0.1759      0.2323      2.6193      23.8600     0.0000      3.5717      300.0182
RBV_t^{1/2}   0.7186      0.3623      1.9040      12.2177     0.1933      4.7465      12177.7777
lnRRV_t       -0.7210     0.9379      0.0094      2.8980      -3.3103     3.5635      14373.9498
lnC_t         -0.8478     0.9473      0.0517      2.7497      -3.3103     3.1148      14842.8362
ln(J_t+1)     0.0686      0.1348      5.7459      69.4079     0.0000      2.6216      335.6110
lnRBV_t       -0.8842     0.9429      0.0535      2.7622      -3.2871     3.1148      14924.9842

Note: The S&P 500 cash index data cover the period from January 1, 1995 to March 31, 2005, consisting of 2535 days with 78 intra-day 5-minute observations. Panel A reports the descriptive statistics of the realized return-based variations. Panel B reports the descriptive statistics of the realized range-based variations. LB_10 reports the Ljung-Box test statistic for up to tenth-order serial correlation. RV_t denotes the realized variance and RRV_t the realized range-based variance; C_t is the continuous part and J_t is the jump part of RV_t (or RRV_t), as separated by the bipower jump test of Barndorff-Nielsen and Shephard (2004a). The bipower jump test uses a significance level of α = 0.999, and the critical value of LB_10 is 18.3070. BPV_t (RBV_t) denotes the realized bipower (realized range bipower) variation. In Panel A, the first part describes RV_t, the next describes the square root transformation, and the last describes the log transformations of the variables. Panel B replaces RV_t with RRV_t.

Table 2.2 In-sample results, S&P 500, 1995/01/01 ~ 2005/03/31: HAR

Panel A   HAR-RV                                          HAR-RRV
Horizon   RV          BPV         C           CJ          RRV         RBV         C           CJ
MSE
1 day     0.9258      0.8394      0.8429      0.8389      0.7902      0.7545      0.7478      0.7140
          (0.2358)    (0.3071)    (0.3042)    (0.3075)    (0.2677)    (0.3008)    (0.3070)    (0.3383)
1 week    0.4256      0.4178      0.4091      0.4047      0.3198      0.3167      0.3151      0.3077
          (0.6487)    (0.6551)    (0.6623)    (0.6659)    (0.7036)    (0.7065)    (0.7080)    (0.7149)
2 weeks   0.2930      0.2684      0.2736      0.2681      0.2033      0.1897      0.1895      0.1861
          (0.7581)    (0.7784)    (0.7741)    (0.7787)    (0.8116)    (0.8242)    (0.8244)    (0.8275)
3 weeks   0.3002      0.2831      0.2931      0.2899      0.2032      0.1982      0.1979      0.1945
          (0.7522)    (0.7663)    (0.7580)    (0.7607)    (0.8117)    (0.8163)    (0.8166)    (0.8198)
4 weeks   0.3155      0.3028      0.3014      0.2855      0.2114      0.2043      0.2022      0.1997
          (0.7396)    (0.7500)    (0.7512)    (0.7643)    (0.8041)    (0.8107)    (0.8126)    (0.8149)

Panel B   HAR-RV^{1/2}                                    HAR-RRV^{1/2}
Horizon   RV^{1/2}    BPV^{1/2}   C^{1/2}     (CJ)^{1/2}  RRV^{1/2}   RBV^{1/2}   C^{1/2}     (CJ)^{1/2}
MSE
1 day     0.8964      0.8432      0.8503      0.8495      0.7560      0.7338      0.7295      0.7192
          (0.2600)    (0.3039)    (0.2981)    (0.2987)    (0.2994)    (0.3200)    (0.3240)    (0.3335)
1 week    0.4059      0.4026      0.3980      0.3960      0.3041      0.3036      0.3031      0.3004
          (0.6649)    (0.6677)    (0.6715)    (0.6731)    (0.7182)    (0.7187)    (0.7191)    (0.7216)
2 weeks   0.2878      0.2655      0.2678      0.2639      0.2022      0.1903      0.1900      0.1885
          (0.7624)    (0.7808)    (0.7789)    (0.7822)    (0.8126)    (0.8236)    (0.8239)    (0.8253)
3 weeks   0.2944      0.2810      0.2896      0.2869      0.1998      0.1972      0.1960      0.1945
          (0.7570)    (0.7680)    (0.7609)    (0.7632)    (0.8148)    (0.8173)    (0.8184)    (0.8198)
4 weeks   0.3121      0.2990      0.2975      0.2841      0.2091      0.2023      0.2006      0.1952
          (0.7424)    (0.7532)    (0.7544)    (0.7655)    (0.8062)    (0.8125)    (0.8141)    (0.8191)

Panel C   HAR-ln RV                                       HAR-ln RRV
Horizon   ln RV       ln BPV      ln C        ln(CJ)      ln RRV      ln RBV      ln C        ln(CJ)
MSE
1 day     0.9112      0.8797      0.8898      0.8887      0.7507      0.7364      0.7351      0.7302
          (0.2478)    (0.2738)    (0.2655)    (0.2664)    (0.3043)    (0.3176)    (0.3188)    (0.3233)
1 week    0.4068      0.3981      0.3998      0.3952      0.3023      0.2998      0.3000      0.2958
          (0.6642)    (0.6714)    (0.6700)    (0.6738)    (0.7199)    (0.7222)    (0.7220)    (0.7259)
2 weeks   0.3040      0.2893      0.2875      0.2818      0.2129      0.2035      0.2029      0.2040
          (0.7491)    (0.7612)    (0.7627)    (0.7674)    (0.8027)    (0.8114)    (0.8120)    (0.8110)
3 weeks   0.3026      0.2928      0.2998      0.2949      0.2034      0.2020      0.2005      0.1983
          (0.7502)    (0.7583)    (0.7525)    (0.7566)    (0.8115)    (0.8128)    (0.8142)    (0.8162)
4 weeks   0.3222      0.3118      0.3080      0.2866      0.2168      0.2111      0.2091      0.2084
          (0.7340)    (0.7426)    (0.7457)    (0.7634)    (0.7991)    (0.8044)    (0.8062)    (0.8069)

Note: The table reports the MSE of equations (17) – (24) for the one-day and one- through four-week in-sample predictions of the HAR regressions of RV and RRV for the S&P 500 cash index from 1995/01/01 to 2005/03/31. The columns correspond to different regressors. RV denotes the realized variance, RRV denotes the realized range-based variance, BPV_t (RBV_t) denotes the realized bipower (realized range bipower) variation, and C denotes the continuous part of RV (RRV) as determined by the bipower test. (CJ) denotes that the continuous part and the square root of the jump part are used as separate regressors. Panel B is the model in standard deviation form. Panel C is the model in log form. In the bipower test separating RV (RRV) into C and J, the significance level α = 0.999 was used. On the left side, the dependent variable is the RV for all horizons; on the right side, the dependent variable is the RRV. The relative decreasing ratios of MSE are in parentheses.

Table 2.3 In-sample results, S&P 500, 1995/01/01 ~ 2005/03/31: MIDAS

Panel A   MIDAS-RV                                        MIDAS-RRV
Horizon   RV          BPV         C           CJ          RRV         RBV         C           CJ
MSE
1 day     0.9306      0.8463      0.8491      0.8449      0.7945      0.7593      0.7530      0.7197
          (0.2318)    (0.3014)    (0.2991)    (0.3025)    (0.2637)    (0.2964)    (0.3022)    (0.3331)
1 week    0.4248      0.4104      0.4026      0.4020      0.3207      0.3157      0.3141      0.3120
          (0.6493)    (0.6612)    (0.6677)    (0.6682)    (0.7028)    (0.7074)    (0.7089)    (0.7109)
2 weeks   0.2832      0.2611      0.2635      0.2583      0.1995      0.1883      0.1874      0.1849
          (0.7662)    (0.7845)    (0.7825)    (0.7868)    (0.8151)    (0.8255)    (0.8263)    (0.8287)
3 weeks   0.2905      0.2752      0.2837      0.2836      0.1994      0.1958      0.1950      0.1947
          (0.7602)    (0.7728)    (0.7658)    (0.7659)    (0.8152)    (0.8186)    (0.8193)    (0.8196)
4 weeks   0.3075      0.2932      0.2929      0.2925      0.2053      0.1976      0.1958      0.1956
          (0.7462)    (0.7580)    (0.7582)    (0.7585)    (0.8097)    (0.8169)    (0.8186)    (0.8187)

Panel B   MIDAS-RV^{1/2}                                  MIDAS-RRV^{1/2}
Horizon   RV^{1/2}    BPV^{1/2}   C^{1/2}     (CJ)^{1/2}  RRV^{1/2}   RBV^{1/2}   C^{1/2}     (CJ)^{1/2}
MSE
1 day     0.9007      0.8485      0.8545      0.8536      0.7599      0.7382      0.7342      0.7258
          (0.2565)    (0.2996)    (0.2946)    (0.2954)    (0.2958)    (0.3159)    (0.3196)    (0.3274)
1 week    0.4035      0.3970      0.3933      0.3931      0.3030      0.3015      0.3012      0.3009
          (0.6669)    (0.6723)    (0.6753)    (0.6755)    (0.7192)    (0.7206)    (0.7209)    (0.7212)
2 weeks   0.2806      0.2600      0.2604      0.2555      0.1977      0.1884      0.1877      0.1869
          (0.7684)    (0.7854)    (0.7850)    (0.7891)    (0.8168)    (0.8254)    (0.8261)    (0.8268)
3 weeks   0.2902      0.2773      0.2854      0.2855      0.1977      0.1954      0.1938      0.1902
          (0.7604)    (0.7711)    (0.7644)    (0.7643)    (0.8168)    (0.8189)    (0.8204)    (0.8237)
4 weeks   0.3057      0.2883      0.2890      0.2771      0.2044      0.1976      0.1965      0.1949
          (0.7476)    (0.7620)    (0.7614)    (0.7713)    (0.8106)    (0.8169)    (0.8179)    (0.8194)

Panel C   MIDAS-ln RV                                     MIDAS-ln RRV
Horizon   ln RV       ln BPV      ln C        ln(CJ)      ln RRV      ln RBV      ln C        ln(CJ)
MSE
1 day     0.9130      0.8814      0.8907      0.8893      0.7539      0.7399      0.7387      0.7351
          (0.2463)    (0.2724)    (0.2647)    (0.2659)    (0.3014)    (0.3143)    (0.3154)    (0.3188)
1 week    0.4039      0.3935      0.3958      0.3898      0.3010      0.2982      0.2984      0.2958
          (0.6666)    (0.6752)    (0.6733)    (0.6782)    (0.7211)    (0.7237)    (0.7235)    (0.7259)
2 weeks   0.3048      0.2900      0.2887      0.2835      0.2118      0.2034      0.2029      0.1986
          (0.7484)    (0.7606)    (0.7617)    (0.7660)    (0.8037)    (0.8115)    (0.8120)    (0.8160)
3 weeks   0.3016      0.2932      0.3014      0.3023      0.2041      0.2035      0.2025      0.1997
          (0.7510)    (0.7580)    (0.7512)    (0.7505)    (0.8109)    (0.8114)    (0.8123)    (0.8149)
4 weeks   0.3288      0.3189      0.3152      0.2804      0.2245      0.2179      0.2164      0.1953
          (0.7286)    (0.7368)    (0.7398)    (0.7685)    (0.7920)    (0.7981)    (0.7995)    (0.8190)

Note: The table reports the MSE of equations (17) – (24) for the one-day and one- through four-week in-sample predictions of the MIDAS regressions of RV and RRV for the S&P 500 cash index from 1995/01/01 to 2005/03/31. The columns correspond to different regressors. The relative decreasing ratios of MSE are in parentheses.

See Table 2 for further details.

Table 2.4 In-sample results, S&P 500, 1995/01/01 ~ 2005/03/31: Chow test for conditional HAR regressions

Panel A   HAR-RV                              HAR-RRV
1 week    1.8256      2.3656      2.1403      14.3689     15.5310     15.2141
          (0.1226)    (0.0520)    (0.0747)    (0.0000)    (0.0000)    (0.0000)
2 weeks   2.7435      0.5548      0.5413      3.6889      1.8788      1.6905
          (0.0292)    (0.6957)    (0.7055)    (0.0062)    (0.1148)    (0.1528)
3 weeks   0.7533      0.2970      0.3210      1.0382      0.6336      0.4908
          (0.5572)    (0.8796)    (0.8636)    (0.3893)    (0.6393)    (0.7425)
4 weeks   0.6198      0.1298      0.1385      1.7564      0.9276      0.8743
          (0.6493)    (0.9713)    (0.9677)    (0.1423)    (0.4505)    (0.4817)

Panel B
1 week    1.6470      0.9922      0.9285      7.2790      6.3360      6.1991
          (0.1612)    (0.4113)    (0.4470)    (0.0000)    (0.0001)    (0.0001)
2 weeks   2.4738      0.4077      0.3580      4.1786      1.7955      1.3640
          (0.0451)    (0.8031)    (0.8384)    (0.0027)    (0.1303)    (0.2470)
3 weeks   0.9541      0.2664      0.2282      0.6741      0.0904      0.0475
          (0.4345)    (0.8992)    (0.9223)    (0.6109)    (0.9854)    (0.9957)
4 weeks   0.7671      0.1276      0.1057      1.9653      0.9361      0.8101
          (0.5487)    (0.9722)    (0.9803)    (0.1043)    (0.4457)    (0.5211)

Panel C   HAR-ln RV                           HAR-ln RRV
Horizon   ln RV       ln BPV      ln C        ln RRV      ln RBV      ln C
F-stat and p-value
1 day     16.4395     1.1144      0.9940      15.3766     2.8791      1.7581
          (0.0000)    (0.3479)    (0.4095)    (0.0000)    (0.0215)    (0.1345)
1 week    1.5409      0.0340      0.0445      4.5171      2.3127      2.0388
          (0.1891)    (0.9978)    (0.9963)    (0.0014)    (0.0567)    (0.0878)
2 weeks   2.0553      0.4506      0.4645      3.6827      1.4817      0.9904
          (0.0873)    (0.7719)    (0.7618)    (0.0062)    (0.2083)    (0.4134)
3 weeks   1.3271      0.2565      0.1589      1.5545      0.6619      0.5296
          (0.2622)    (0.9053)    (0.9587)    (0.1891)    (0.6194)    (0.7142)
4 weeks   0.6590      0.0862      0.0473      1.6982      0.8170      0.6134
          (0.6217)    (0.9866)    (0.9957)    (0.1551)    (0.5168)    (0.6538)

Note: The table reports the F-statistics and p-values of the Chow test of the hypothesis that the jump dummies are zero, for the one-day and one- through four-week in-sample predictions of the conditional HAR regressions for RV and RRV of the S&P 500 cash index from 1995/01/01 to 2005/03/31. The p-values are in parentheses.

See Table 2 for further details.

Table 2.5 In-sample results, S&P 500, 1995/01/01 ~ 2005/03/31: Using conditional HAR regressions

Panel A   HAR-RV                              HAR-RRV
Horizon   RV          BPV         C           RRV         RBV         C
MSE
1 day     0.8540      0.8371      0.8415      0.7902      0.7545      0.7478
          (0.2950)    (0.3090)    (0.3053)    (0.2677)    (0.3008)    (0.3070)
1 week    0.4194      0.4100      0.4021      0.3198      0.3167      0.3151
          (0.6538)    (0.6615)    (0.6681)    (0.7036)    (0.7065)    (0.7080)
2 weeks   0.2803      0.2660      0.2712      0.2033      0.1897      0.1895
          (0.7686)    (0.7804)    (0.7761)    (0.8116)    (0.8242)    (0.8244)
3 weeks   0.2946      0.2810      0.2908      0.2032      0.1982      0.1979
          (0.7568)    (0.7680)    (0.7599)    (0.8117)    (0.8163)    (0.8166)
4 weeks   0.3089      0.3015      0.2999      0.2114      0.2043      0.2022
          (0.7450)    (0.7511)    (0.7524)    (0.8041)    (0.8107)    (0.8126)

Panel B   HAR-RV^{1/2}                        HAR-RRV^{1/2}
Horizon   RV^{1/2}    BPV^{1/2}   C^{1/2}     RRV^{1/2}   RBV^{1/2}   C^{1/2}
MSE
1 day     0.8964      0.8432      0.8503      0.7560      0.7338      0.7295
          (0.2600)    (0.3039)    (0.2981)    (0.2994)    (0.3200)    (0.3240)
1 week    0.4059      0.4026      0.3980      0.3041      0.3036      0.3031
          (0.6649)    (0.6677)    (0.6715)    (0.7182)    (0.7187)    (0.7191)
2 weeks   0.2878      0.2655      0.2678      0.2022      0.1903      0.1900
          (0.7624)    (0.7808)    (0.7789)    (0.8126)    (0.8236)    (0.8239)
3 weeks   0.2944      0.2810      0.2896      0.1998      0.1972      0.1960
          (0.7570)    (0.7680)    (0.7609)    (0.8148)    (0.8173)    (0.8184)
4 weeks   0.3121      0.2990      0.2975      0.2091      0.2023      0.2006
          (0.7424)    (0.7532)    (0.7544)    (0.8062)    (0.8125)    (0.8141)

Panel C   HAR-ln RV                           HAR-ln RRV
Horizon   ln RV       ln BPV      ln C        ln RRV      ln RBV      ln C
MSE
1 day     0.9112      0.8797      0.8898      0.7507      0.7364      0.7351
          (0.2478)    (0.2738)    (0.2655)    (0.3043)    (0.3176)    (0.3188)
1 week    0.4068      0.3981      0.3998      0.3023      0.2998      0.3000
          (0.6642)    (0.6714)    (0.6700)    (0.7199)    (0.7222)    (0.7220)
2 weeks   0.3040      0.2893      0.2875      0.2129      0.2035      0.2029
          (0.7491)    (0.7612)    (0.7627)    (0.8027)    (0.8114)    (0.8120)
3 weeks   0.3026      0.2928      0.2998      0.2034      0.2020      0.2005
          (0.7502)    (0.7583)    (0.7525)    (0.8115)    (0.8128)    (0.8142)
4 weeks   0.3222      0.3118      0.3080      0.2168      0.2111      0.2091
          (0.7340)    (0.7426)    (0.7457)    (0.7991)    (0.8044)    (0.8062)

Note: The table reports the MSE of equation (29) for the one-day and one- through four-week in-sample predictions of the HAR regressions of RV and RRV for the S&P 500 cash index from 1995/01/01 to 2005/03/31. The relative decreasing ratios of MSE are in parentheses.

See Table 2 for further details.

Table 2.6 Out-of-sample forecasts of S&P 500, 1995/01/01 ~ 2005/03/31: HAR

Panel A   HAR-RV                                          HAR-RRV
Horizon   RV          BPV         C           CJ          RRV         RBV         C           CJ
MSE
1 day     0.0295      0.0247      0.0221      0.0217      0.0187      0.0169      0.0167      0.0187
1 week    0.0246      0.0203      0.0157      0.0162      0.0155      0.0149      0.0147      0.0171
2 weeks   0.0295      0.0246      0.0203      0.0194      0.0176      0.0169      0.0164      0.0183
3 weeks   0.0470      0.0374      0.0313      0.0293      0.0275      0.0253      0.0248      0.0304
4 weeks   0.0502      0.0396      0.0358      0.0321      0.0287      0.0270      0.0263      0.0274

Panel B   HAR-RV^{1/2}                                    HAR-RRV^{1/2}
Horizon   RV^{1/2}    BPV^{1/2}   C^{1/2}     (CJ)^{1/2}  RRV^{1/2}   RBV^{1/2}   C^{1/2}     (CJ)^{1/2}
MSE
1 day     0.0214      0.0198      0.0193      0.0193      0.0128      0.0122      0.0122      0.0126
1 week    0.0103      0.0088      0.0079      0.0083      0.0078      0.0078      0.0078      0.0080
2 weeks   0.0104      0.0088      0.0078      0.0079      0.0081      0.0084      0.0084      0.0085
3 weeks   0.0168      0.0124      0.0111      0.0113      0.0120      0.0113      0.0113      0.0123
4 weeks   0.0167      0.0144      0.0134      0.0132      0.0118      0.0122      0.0122      0.0131

Panel C   HAR-ln RV                                       HAR-ln RRV
Horizon   ln RV       ln BPV      ln C        ln(CJ)      ln RRV      ln RBV      ln C        ln(CJ)
MSE
1 day     0.0210      0.0199      0.0195      0.0195      0.0128      0.0122      0.0122      0.0122
1 week    0.0083      0.0078      0.0072      0.0074      0.0068      0.0070      0.0071      0.0072
2 weeks   0.0072      0.0071      0.0065      0.0064      0.0065      0.0071      0.0073      0.0076
3 weeks   0.0089      0.0081      0.0074      0.0072      0.0077      0.0081      0.0081      0.0088
4 weeks   0.0097      0.0097      0.0090      0.0090      0.0081      0.0090      0.0092      0.0100

Note: The table reports the MSE of the out-of-sample forecasts of the S&P 500 cash index from September 16, 2004 to March 31, 2005, for the one-day and one- through four-week predictions of the HAR regressions of RV and RRV. Data from January 1, 1995 to March 31, 2005 were used to estimate the parameters of the models.

See Table 2 for further details.

Table 2.7 Out-of-sample forecasts of S&P 500, 1995/01/01 ~ 2005/03/31: MIDAS

Panel A   MIDAS-RV                                        MIDAS-RRV
Horizon   RV          BPV         C           CJ          RRV         RBV         C           CJ
MSE
1 day     0.0256      0.0273      0.0325      0.0219      0.0191      0.0217      0.0214      0.0186
1 week    0.0093      0.0064      0.0083      0.0169      0.0091      0.0091      0.0088      0.0168
2 weeks   0.0097      0.0051      0.0056      0.0187      0.0078      0.0071      0.0070      0.0193
3 weeks   0.0254      0.0062      0.0044      0.0298      0.0097      0.0040      0.0040      0.0297
4 weeks   0.0146      0.0066      0.0049      0.0331      0.0048      0.0039      0.0039      0.0270

Panel B   MIDAS-RV^{1/2}                                  MIDAS-RRV^{1/2}
Horizon   RV^{1/2}    BPV^{1/2}   C^{1/2}     (CJ)^{1/2}  RRV^{1/2}   RBV^{1/2}   C^{1/2}     (CJ)^{1/2}
MSE
1 day     0.0215      0.0198      0.0193      0.0193      0.0129      0.0122      0.0123      0.0127
1 week    0.0106      0.0091      0.0082      0.0082      0.0079      0.0081      0.0081      0.0085
2 weeks   0.0113      0.0096      0.0087      0.0086      0.0086      0.0090      0.0090      0.0095
3 weeks   0.0193      0.0137      0.0125      0.0124      0.0136      0.0124      0.0124      0.0131
4 weeks   0.0183      0.0162      0.0155      0.0150      0.0129      0.0132      0.0132      0.0153

Panel C   MIDAS-ln RV                                     MIDAS-ln RRV
Horizon   ln RV       ln BPV      ln C        ln(CJ)      ln RRV      ln RBV      ln C        ln(CJ)
MSE
1 day     0.0209      0.0199      0.0194      0.0195      0.0128      0.0122      0.0123      0.0122
1 week    0.0085      0.0080      0.0074      0.0075      0.0070      0.0073      0.0074      0.0076
2 weeks   0.0078      0.0076      0.0071      0.0069      0.0071      0.0078      0.0080      0.0084
3 weeks   0.0096      0.0088      0.0082      0.0077      0.0084      0.0089      0.0090      0.0097
4 weeks   0.0106      0.0103      0.0099      0.0087      0.0089      0.0096      0.0097      0.0106

Note: The table reports the MSE of the out-of-sample forecasts of the S&P 500 cash index from September 16, 2004 to March 31, 2005, for the one-day and one- through four-week predictions of the MIDAS regressions of RV and RRV. Data from January 1, 1995 to March 31, 2005 were used to estimate the parameters of the models.

See Table 2 for further details.

Chapter 3 The Information Content of Implied Volatility in the Presence of the Continuous Components and the Jump Components of Realized Range Volatility

1 Introduction

Most previous studies have documented the information content of the implied volatility. They typically focus on whether the implied volatility carries information beyond that of the historical volatility, with the realized volatility (RV) usually serving as the historical volatility. Giot and Laurent (2007) considered the information content of the implied volatility relative to the continuous and jump components of the realized volatility, a decomposition suggested by Barndorff-Nielsen and Shephard (2004), using encompassing regressions. Because the realized range-based estimator of the integrated variance has been shown to be more efficient, we use the realized range-based volatility to measure the historical volatility. We employ the heterogeneous autoregressive (HAR) regressions of Corsi (2004) and the mixed data sampling (MIDAS) regressions of Ghysels et al. (2006) as encompassing regressions to examine the information content of the continuous and jump components of the realized range-based volatility (RRV), and the additional information content of the implied volatility as an additional regressor. In addition, because this study focuses on the S&P 500 index, we use the new VIX, the Chicago Board Options Exchange (CBOE) volatility index, as the measure of the implied volatility. The new VIX is based on S&P 500 index options and adopts the model-free volatility expectation.

The results show that the implied volatility and almost all continuous components are statistically significant, while the jump components are mostly not significant. The implied volatility has high information content, and the continuous components of the past realized range-based volatility also carry relevant information content in the presence of the implied volatility. Besides, the jump components do not contribute valuable information about the future.

In addition, except for the h = 1 horizon, both the implied volatility and the out-of-sample volatility have information content, but the implied volatility has more explanatory power than the out-of-sample volatility for the future realized range volatility.

The remainder of this chapter is organized as follows. In Section 2, we discuss the volatility measures and the volatility prediction regressions, and describe the models we use. In Section 3, we present the data and the empirical results. Section 4 concludes.

2 The Methodology

2.1 Construction of volatility measures

Let the logarithmic price of a financial asset at time t be denoted by p(t), and let it follow a continuous-time jump diffusion process.

sequence of partitions is defined by

When the data are sampled at a higher frequency, M times in a day, we denote the intraday ranges as:

The realized range-based variance over day t is defined as

As noted by Christensen and Podolskij (2006a, 2006b),

i.e., RRV_t^{m*} is inconsistent. Hence, they modified the intraday high-low statistic to make it consistent with the quadratic variation. The realized range-based bipower variation with parameter (r, s) ∈ R_+^2 is defined as⁶:

method, Christensen and Podolskij (2006b) found the jump detection statistic,

5 There is no explicit formula for λ_{r,m}, but it can be computed to any degree of accuracy by simulation.

6 I maintain some of the notation used by Christensen and Podolskij (2006b) throughout the chapter.

where ν_m = λ_{2,m}^2 (Λ_m^R + Λ_m^B − 2Λ_m^{RB}). In addition, they adopted the modified ratio-statistic to improve the size properties in finite samples. The modified ratio-statistic is

Huang and Tauchen (2005) found that the statistic in equation (3) also had sensible power against other empirically calibrated stochastic volatility jump diffusion models.

Using equation (3), Andersen, Bollerslev, and Diebold (2007) identified the jump variation as, and the continuous component variation was estimated as the residual,

We will use α = 0.999 throughout the article. From the definitions in equations (4) and (5), we ensure that the continuous variation and jump variation sum to the total realized variation, i.e., RRV_t = C_t + J_t.
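The separation into a jump part and a residual continuous part can be illustrated with a short Python sketch. It is written under explicit assumptions that go beyond the surviving text: the jump statistic `z` is treated as asymptotically standard normal with a one-sided critical value at level α, the jump part is taken as the positive excess of RRV_t over RBV_t on days where the test rejects, and the name `split_continuous_jump` is hypothetical.

```python
from statistics import NormalDist


def split_continuous_jump(rrv, rbv, z, alpha=0.999):
    """Split one day's RRV into (continuous, jump) parts.

    Assumptions (not from the source): `z` is compared with the
    one-sided standard normal critical value at level `alpha`, and the
    jump is truncated at zero so that it is never negative.
    """
    critical = NormalDist().inv_cdf(alpha)
    if z > critical:
        jump = max(rrv - rbv, 0.0)
    else:
        jump = 0.0
    # The continuous part is the residual, so the two parts sum to rrv.
    continuous = rrv - jump
    return continuous, jump


# Toy values: a day on which the jump test rejects, and one on which it does not.
print(split_continuous_jump(1.0, 0.7, z=5.0))
print(split_continuous_jump(1.0, 0.7, z=1.0))
```

Defining the continuous part as the residual is what guarantees that the two components add up to the total realized variation on every day.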

In this study, we exploit the MIDAS regression model, introduced by Ghysels et al. (2002, 2006), and the HAR regression model, suggested by Corsi (2004), to predict volatility from t to t+H, where H is the prediction horizon in days.

The multi-period realized variances were constructed by Andersen et al. (2007). Similarly, the multi-period realized range-based variances are defined as the normalized sum of the one-period realized range variances,

\[ RRV_{t,t+H} = H^{-1} \left( RRV_{t,t+1} + RRV_{t+1,t+2} + \cdots + RRV_{t+H-1,t+H} \right), \]

where H = 1, 5, 10, 15, and 20 is the prediction horizon in days, corresponding in the empirical analysis to one-day, weekly, bi-weekly, tri-weekly, and monthly horizons.
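As a small worked illustration of the normalized sum, the following Python sketch averages the H one-day values that follow day t. The indexing convention (element j of the list holds the one-day value ending on day j) and the function name `multi_period_rrv` are assumptions made for the example.

```python
def multi_period_rrv(daily_rrv, t, horizon):
    """Average of the `horizon` one-day values that follow day index t.

    Assumed convention: daily_rrv[j] is the one-day realized range-based
    variance for the day ending at index j, so the H-day measure from t
    to t+H uses indices t+1, ..., t+H.
    """
    window = daily_rrv[t + 1 : t + 1 + horizon]
    if len(window) < horizon:
        raise ValueError("not enough observations after t")
    return sum(window) / horizon


# Toy daily values, not taken from the chapter's data.
daily = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(multi_period_rrv(daily, t=0, horizon=5))  # mean of 2, 3, 4, 5, 6
```

Dividing by H keeps the multi-period measure on a per-day scale, which is what allows the MSE values at different horizons to be compared.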

The HAR-RV models were introduced by Corsi (2004), and they can capture the long memory property of the realized variance. Similarly, the HAR-RRV model is written as follows:

\[ RRV_{t,t+H} = \alpha_0 + \alpha_D X_{t-1,t} + \alpha_W X_{t-5,t} + \alpha_M X_{t-20,t} + \varepsilon_{t,t+1}, \]

where RRV_{t,t+H} represents the future RRV, using the HAR regressions.
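To make the structure of the HAR regression concrete, the sketch below assembles its regressors and target from a daily series. It assumes, for illustration only, that the regressor X is the same daily series as the dependent variable and that X_{t-5,t} and X_{t-20,t} are simple averages over the last 5 and 20 days; the function name `har_design` is hypothetical. The coefficients α_0, α_D, α_W, and α_M could then be estimated by ordinary least squares with any standard routine.

```python
def har_design(x, horizon, lags=(1, 5, 20)):
    """Build (rows, targets) for a HAR-type regression from a daily series.

    rows[i]    = [1.0, daily value, 5-day average, 20-day average] at day t
    targets[i] = average of the `horizon` daily values following day t
    """
    max_lag = max(lags)
    rows, targets = [], []
    for t in range(max_lag - 1, len(x) - horizon):
        # Backward-looking averages ending at day t (inclusive).
        regressors = [sum(x[t - lag + 1 : t + 1]) / lag for lag in lags]
        # Forward-looking average over days t+1, ..., t+horizon.
        target = sum(x[t + 1 : t + 1 + horizon]) / horizon
        rows.append([1.0] + regressors)
        targets.append(target)
    return rows, targets


# Toy series 1, 2, ..., 25; only one day has both 20 lags and 5 leads.
rows, targets = har_design(list(range(1, 26)), horizon=5)
print(rows, targets)
```

The three averages act as daily, weekly, and monthly volatility components, which is how a model with only four parameters can mimic long memory.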

Andersen et al. (2007) defined the HAR-RV-CJ model, which exploits the separation of RV into the continuous part C_t and the jump part J_t. This separation was suggested by Barndorff-Nielsen et al. (2004). Following Andersen et al.'s model, the HAR-RRV-CJ is shown below.

As noted by Andersen et al. (2001), the probability density of the error term in the log form is close to the normal density, so we also consider the HAR-ln RRV-ln CJ model in this chapter. The models are as follows,

The MIDAS and HAR regression models differ in their lagged regressors and the weights placed on them. The MIDAS regression was introduced by Ghysels et al. (2002, 2005). MIDAS regressions are parsimoniously parameterized regressions of data observed at different frequencies. Ghysels et al. (2006) used the MIDAS regressions to predict volatility, and we follow them. The MIDAS-RRV models can be written as

\[ RRV_{t,t+H} = \mu_H + \phi_H \sum_{k=0}^{k^{max}} b_H(k, \theta_1, \theta_2) X_{t-k-1,t-k} + \varepsilon_{t+H}, \]

where RRV_{t,t+H} represents the future RRV, using the MIDAS regressions.
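The weight function b_H(k, θ_1, θ_2) is not spelled out in the surviving text. A common choice in the MIDAS literature is a normalized Beta-type lag polynomial, and the sketch below assumes that form purely for illustration; the grid mapping u = (k + 1)/(k_max + 2), which keeps u strictly inside (0, 1), and the names `beta_lag_weights` and `midas_fit_value` are likewise hypothetical.

```python
def beta_lag_weights(k_max, theta1, theta2):
    """Normalized Beta-type weights for lags k = 0, ..., k_max.

    Assumed form: w_k proportional to u**(theta1 - 1) * (1 - u)**(theta2 - 1)
    with u = (k + 1) / (k_max + 2); the weights are scaled to sum to one.
    """
    raw = []
    for k in range(k_max + 1):
        u = (k + 1) / (k_max + 2)
        raw.append(u ** (theta1 - 1.0) * (1.0 - u) ** (theta2 - 1.0))
    total = sum(raw)
    return [w / total for w in raw]


def midas_fit_value(mu, phi, weights, lagged_x):
    """mu + phi * sum_k b_k * x_k, where lagged_x[0] is the most recent day."""
    return mu + phi * sum(b * x for b, x in zip(weights, lagged_x))


# theta1 = 1 and theta2 > 1 give weights that decline with the lag.
weights = beta_lag_weights(k_max=9, theta1=1.0, theta2=5.0)
print([round(w, 4) for w in weights])
```

With only two shape parameters, such a weight function can cover a long lag window, whereas the HAR model fixes the weights through its daily, weekly, and monthly averages.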
