Temporal Aggregation of a Strong PGARCH(1,1) Process


Meng-Feng Yen

Department of Finance, Chaoyang University of Technology Tel: 04-23323000 ext 4601

E-Mail: yenmf@mail.cyut.edu.tw

Abstract

Bollerslev’s (1986) standard GARCH(1,1) model has been widely used for volatility modelling and forecasting over the past two decades, and many extensions have been proposed to capture the stylised features commonly observed in financial asset data. One notable success is Bollerslev and Ghysels’ (1996) periodic GARCH model, which accounts for periodic variation in the volatility of the underlying process. However, Drost and Nijman (1993) find that the conventional GARCH formulation holds for only one sampling interval, namely the one arbitrarily chosen for the data in hand. The formulation does not carry over to any other sampling interval because it assumes i.i.d. standardised innovations. One consequence is that the ML method cannot be used to estimate the GARCH model when the model is specified not for the original data set but for its temporally aggregated or dis-aggregated counterpart. Drost and Nijman (1993) introduce the so-called weak GARCH formulation to tackle this problem and find that this form of GARCH model applies to all sampling intervals of any given data set. However, the ML method does not apply to weak GARCH models, since the formulation assumes no probability distribution for the underlying standardised innovations. They therefore propose a set of formulae that map the parameters of a weak GARCH(1,1) process sampled at one time interval to those of the same process sampled at any other time interval.

However, there are hitherto no analytical results for a weak PGARCH process. The main purpose of this paper is to investigate the relationship amongst the parameters of a PGARCH process before and after temporal aggregation. Our simulation results tend to suggest that a two-stage PGARCH process aggregates into a weak GARCH process. Some analytical results about the aggregated process are also introduced.

Key words: strong- and weak-GARCH, temporal aggregation, Monte Carlo simulation

The Authors gratefully acknowledge the helpful comments of two anonymous referees.


1. Introduction

The standard GARCH(1,1) model introduced by Bollerslev (1986) has been highly successful in the literature on volatility modelling and forecasting, and a voluminous body of work applies it empirically. Many of its extensions model stylised features of financial time series. A prominent one is the intraday volatility pattern or, more generally, periodicity in volatility. Failure to accommodate this feature in the standard GARCH(1,1) model leads to model mis-specification.

Bollerslev and Ghysels (1996, hereafter BG), amongst others, introduce their periodic GARCH model (hereafter PGARCH) to tackle this problem. However, Drost and Nijman (1993, hereafter DN) find that the formulation of the standard GARCH(1,1) model proposed by Bollerslev (1986) is applicable to only one sampling frequency of any data. To be specific, if we assume Bollerslev’s (1986) standard GARCH(1,1) model for a set of hourly data, the aggregated daily or weekly data will no longer follow this model. As a result, we cannot use the (quasi-)maximum likelihood method to estimate the standard GARCH(1,1) model for the aggregated daily or weekly data. DN suggest that the reason why Bollerslev’s GARCH formulation does not apply to all time spans is the assumption of i.i.d. standardised errors. They therefore suggest a weak form of the standard GARCH model and a set of formulae mapping the parameters of the model between any two sampling frequencies.

1.1 Review of the Drost and Nijman Aggregation Theory

Based on the continuous-time diffusion limit of the GARCH(1,1) process developed by Nelson (1990), DN (1993) propose a theoretical framework for the temporal aggregation of weak GARCH processes. Assuming that there is an underlying GARCH(1,1) diffusion, they derive a set of formulae for parameter mapping between any two discrete GARCH(1,1) models sampled at different frequencies. For example, if one has a series of daily data and desires to know what a weekly GARCH(1,1) model looks like, a natural approach to achieving this is to simply aggregate the daily data into weekly data and estimate them directly. Alternatively, one may first estimate the daily data to get the daily GARCH(1,1) estimates. Substitution of the daily estimates into the DN aggregation formulae gives us the weekly GARCH(1,1) parameters.

Before introducing the DN temporal aggregation theory, it is necessary to understand the three types of GARCH(p,q) processes, one of which underlies the theory. They are defined as follows. Given that

$$B(L)\,h_t = \phi + \{A(L) - 1\}\,\varepsilon_t^2, \tag{1a}$$

where

$$A(L) = 1 + \sum_{i=1}^{q} \alpha_i L^i, \qquad B(L) = 1 - \sum_{i=1}^{p} \beta_i L^i,$$

and

$$\sum_{i=1}^{p} \beta_i + \sum_{i=1}^{q} \alpha_i < 1, \tag{1b}$$

$\varepsilon_t$ is said to be

1. a strong GARCH(p,q) process if $z_t = \varepsilon_t / h_t^{1/2}$ is i.i.d. with zero mean and unit variance;

2. a semi-strong GARCH(p,q) process if the conditional expectations of $\varepsilon_t$ and $\varepsilon_t^2$ given the lags of $\varepsilon_t$ are equal to, respectively, 0 and $h_t$; and

3. a weak GARCH(p,q) process if the best linear predictors of $\varepsilon_t$ and $\varepsilon_t^2$ in terms of 1, $\varepsilon_{t-1}$, $\varepsilon_{t-2}$, …, $\varepsilon_{t-1}^2$, $\varepsilon_{t-2}^2$, … are equal, respectively, to 0 and $h_t$.

Note that both strong and semi-strong GARCH processes also satisfy the definition of a weak GARCH process, whereas a strong GARCH process is also a semi-strong GARCH process.

Of the three definitions above, only the weak GARCH process is closed under temporal aggregation. In other words, a weak GARCH process remains a weak GARCH process under temporal aggregation. By contrast, (semi-) strong GARCH processes will no longer be (semi-) strong GARCH processes, but rather weak GARCH processes, under temporal aggregation. An implication of this is that the conventional treatment of volatility modelling by (semi-) strong GARCH models is not appropriate, as these models will be valid at only one frequency.

Unfortunately, weak GARCH models deal with the best linear projection of the underlying process rather than with its conditional expectation, which is the object usually studied in the GARCH literature. By definition, however, the best linear projection of any variable upon a given information set is equal in size to its conditional expectation upon the same information set. The weak GARCH class of models is thus still useful for modelling and forecasting conditional heteroscedasticity.

Within the framework of weak GARCH processes, DN derive a set of formulae for mapping the parameters of a GARCH(1,1) model sampled at one temporal frequency onto the parameters of a GARCH(1,1) model sampled at another temporal frequency. To illustrate, let

$$h_t = \phi^H + \alpha_1^H \varepsilon_{t-1}^2 + \beta_1^H h_{t-1}$$

denote the best linear projection equation and $k^H$ the unconditional kurtosis of the high-frequency weak GARCH(1,1) process $\varepsilon_t$, and let

$$h_\lambda = \phi_{(m)}^L + \alpha_{1(m)}^L \varepsilon_{\lambda-1}^2 + \beta_{1(m)}^L h_{\lambda-1}$$

denote the best linear projection equation and $k_{(m)}^L$ the unconditional kurtosis of the aggregated low-frequency weak GARCH(1,1) process $\varepsilon_\lambda$, m being the number of high-frequency intervals that each low-frequency interval contains. Then,

$$\phi_{(m)}^L = f_1(m, \phi^H, \alpha_1^H, \beta_1^H), \tag{2a}$$

$$\beta_{1(m)}^L = f_2(m, \alpha_1^H, \beta_1^H, k^H), \tag{2b}$$

$$\alpha_{1(m)}^L = f_3(m, \alpha_1^H, \beta_1^H, \beta_{1(m)}^L), \tag{2c}$$

$$k_{(m)}^L = f_4(m, \alpha_1^H, \beta_1^H, k^H). \tag{2d}$$

Refer to DN (1993) for the detailed formulae.
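As a partial illustration of the mapping, the Python sketch below implements only the two components that are simple to state for flow variables such as returns, as we recall them from DN (1993): the aggregated intercept, m·φ·(1 − (α+β)^m)/(1 − (α+β)), and the aggregated persistence, (α+β)^m. Splitting the persistence into its alpha and beta parts requires the kurtosis-dependent formulae (2b) and (2c), which are not reproduced here. The function name and the example values are our own assumptions, and the expressions should be checked against DN (1993) before use.

```python
def dn_aggregate_basic(m, phi_h, alpha_h, beta_h):
    """Aggregate the intercept and persistence of a weak GARCH(1,1) over m periods.

    Returns (phi_l, persistence_l, variance_l) for flow variables such as returns.
    The split of persistence_l into alpha and beta is NOT computed here.
    """
    persistence_h = alpha_h + beta_h
    if not 0.0 <= persistence_h < 1.0:
        raise ValueError("need 0 <= alpha + beta < 1")
    # Aggregated persistence: alpha_(m) + beta_(m) = (alpha + beta)^m.
    persistence_l = persistence_h ** m
    # Aggregated intercept: m * phi * (1 - (alpha + beta)^m) / (1 - (alpha + beta)).
    phi_l = m * phi_h * (1.0 - persistence_l) / (1.0 - persistence_h)
    # Unconditional variance implied by the aggregated parameters.
    variance_l = phi_l / (1.0 - persistence_l)
    return phi_l, persistence_l, variance_l


if __name__ == "__main__":
    # Illustrative values only: phi = 0.05, alpha = 0.15, beta = 0.7, m = 2.
    phi_l, persistence_l, variance_l = dn_aggregate_basic(2, 0.05, 0.15, 0.7)
    print(phi_l, persistence_l, variance_l)
```

Under these expressions the implied unconditional variance of the aggregated process is m times that of the high-frequency process, which is what one expects when m uncorrelated innovations are summed.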

It is worth noting that $k^H$, the unconditional kurtosis of the high-frequency weak GARCH(1,1) process $\varepsilon_t$, appears explicitly in (2b) and (2d), and implicitly in (2c). The value of $k^H$ is therefore required for the set of formulae to work, a requirement that does not arise in the temporal aggregation of ARMA models.¹ To ensure non-negativity of $k^H$, moreover, the sum of $\alpha_1^H$ and $\beta_1^H$ must fall inside the unit circle and $1 - (\alpha_1^H + \beta_1^H)^2$ must be larger than $(k_\xi^H - 1)(\alpha_1^H)^2$, where $k_\xi^H$ denotes the kurtosis of the standardised innovations of the high-frequency strong GARCH(1,1) process. Note also that the aggregation frequency m is allowed to be smaller than 1, which constitutes the dis-aggregation case: the implied parameters then describe a higher-frequency GARCH(1,1) model. Given a weak GARCH(1,1) model sampled at any frequency, therefore, one can derive the weak GARCH(1,1) model at any other frequency without having to estimate it.
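The moment condition above can be checked numerically. The following Python sketch tests the inequality and, when it holds, evaluates the unconditional kurtosis of a strong GARCH(1,1) process using the standard expression k = k_ξ·[1 − (α+β)²] / [1 − (α+β)² − (k_ξ − 1)α²]. That expression is not stated in the text and is brought in here as an assumption; the function names are likewise hypothetical.

```python
def fourth_moment_exists(alpha, beta, k_xi):
    """Check 1 - (alpha + beta)^2 > (k_xi - 1) * alpha^2."""
    return 1.0 - (alpha + beta) ** 2 > (k_xi - 1.0) * alpha ** 2


def strong_garch11_kurtosis(alpha, beta, k_xi):
    """Unconditional kurtosis of a strong GARCH(1,1) process (assumed formula)."""
    if not fourth_moment_exists(alpha, beta, k_xi):
        raise ValueError("fourth moment does not exist for these parameters")
    persistence_sq = (alpha + beta) ** 2
    denominator = 1.0 - persistence_sq - (k_xi - 1.0) * alpha ** 2
    return k_xi * (1.0 - persistence_sq) / denominator


if __name__ == "__main__":
    # Illustrative values: alpha = 0.15, beta = 0.7, Gaussian innovations (k_xi = 3).
    print(fourth_moment_exists(0.15, 0.7, 3.0))
    print(strong_garch11_kurtosis(0.15, 0.7, 3.0))
```

With Gaussian innovations the expression reduces to the familiar 3[1 − (α+β)²] / [1 − (α+β)² − 2α²].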

1.2 Intended contribution of the paper

Having reviewed DN’s temporal aggregation theory for the standard GARCH model, one of our main concerns in this paper is to find out whether there is any relationship between the parameters of a PGARCH(1,1) process before and after temporal aggregation. In particular, we examine, via Monte Carlo simulations, what happens if the standard GARCH(1,1) formulation is replaced by BG’s PGARCH(1,1) model. Intuitively, if we aggregate a set of data generated by a PGARCH(1,1) model into lower-frequency intervals, the aggregation interval being equal to the lower-frequency interval in length, the volatility of the aggregated data should be free of periodicity. In other words, it is very likely that the aggregated data are characterised by the standard GARCH(1,1) formulation. The main thrust of this paper is therefore to uncover the volatility mechanism of a PGARCH process under temporal aggregation. We adopt BG’s PGARCH(1,1) model with a simple two-stage periodic variation in either the intercept or the alpha parameter. The remainder of the paper is organised as follows. Section 2 explains the simulation framework and Section 3 discusses the simulation results. Section 4 concludes.

1 But the temporal aggregation of ARMA models requires the existence of second moments. For the results of the temporal aggregation of ARMA type of models, see e.g. Palm and Nijman (1984) and Nijman and Palm (1990a, b).

2. Monte Carlo Simulation Framework

Similar to the simulation framework in BG,² the basic GARCH(1,1) model is characterised by Parameterisation 1 in Table 1 below. Parameterisation 1 is then varied to give Parameterisations 2 and 3 in Table 1, which introduce the shift in the intercept ($\phi^H$) or in the parameter $\alpha_1^H$ across the two stages of each periodic volatility cycle. Parameterisations 1 and 2 mark the change in the intercept ($\phi^H$) of the GARCH(1,1) model, whereas Parameterisations 1 and 3 specify the shift in $\alpha_1^H$ across the two stages of each volatility cycle.

Table 1: Parameterisations of the GARCH(1,1) model for the DGPs

                                      Parameterisations
                                      1         2         3
$\phi^H$                              0.05      0.01      0.05
$\alpha_1^H$                          0.15      0.15      0.05
$\beta_1^H$                           0.7       0.7       0.7
$\alpha_1^H + \beta_1^H$              0.85      0.85      0.75
$(\sigma^H)^2$                        0.3333    0.0667    0.2
$k_\xi^H$, $\xi_t$ ~ t5               25        25        25
$k_\xi^H$, $\xi_t$ ~ N(0,1)           3         3         3

Notes: 1. $k_\xi^H$ denotes the unconditional kurtosis of the standardised innovation $\xi_t$. 2. $(\sigma^H)^2$ denotes the unconditional variance of $\varepsilon_t$ implied by $\phi^H$, $\alpha_1^H$, and $\beta_1^H$, namely $(\sigma^H)^2 = \phi^H / (1 - \alpha_1^H - \beta_1^H)$.

2 In BG’s (1996) simulations for a periodic change in parameter $\alpha_1^H$ of their PGARCH(1,1) model, they let $\alpha_{11}^H$ = 0.4666 and $\alpha_{12}^H$ = 0.0727, whilst $\phi^H$ and $\beta_1^H$ are fixed at, respectively, 0.05 and 0.7.

The actual models used in the DGPs for high-frequency observations are therefore given as follows.

2.1 DGPs for high-frequency data

High-Frequency Observations (in-sample size: 4960, out-of-sample size: 40)

Conditional mean (zero mean)³:

$$y_t = \varepsilon_t = \xi_t \sqrt{h_t}, \quad \text{for } t = 1 \text{ to } 5100, \tag{3}$$

where $\xi_t$ ~ i.i.d. N(0,1)⁴ or t5⁵,⁶, and $h_t$ denotes the conditional variance of the innovation $\varepsilon_t$ and follows a PGARCH(1,1) model given by (4a) or (4b) below.

DGP 1: PGARCH(1,1) of two-stage periodicity in intercept

$$h_t = \phi_{s(t)}^H + 0.15\,\varepsilon_{t-1}^2 + 0.7\,h_{t-1}, \quad \text{for } t = 1 \text{ to } 5000, \tag{4a}$$

where s(t) = 1 for odd t and 2 for even t, $\phi_1^H$ = 0.05, and $\phi_2^H$ = 0.01.

DGP 2: PGARCH(1,1) of two-stage periodicity in alpha

$$h_t = 0.05 + \alpha_{1s(t)}^H\,\varepsilon_{t-1}^2 + 0.7\,h_{t-1}, \quad \text{for } t = 1 \text{ to } 5000, \tag{4b}$$

where s(t) = 1 for odd t and 2 for even t, $\alpha_{11}^H$ = 0.15, and $\alpha_{12}^H$ = 0.05.

The parameters in (4a) and (4b) are selected in accordance with the common finding that, for daily or intra-daily data, the estimated coefficient on the lagged conditional variance dominates that on the lagged squared return innovation. Moreover, these parameters satisfy the requirements for DN’s aggregation formulae to work for both stages.
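The two DGPs can be sketched in a few lines of Python using only the standard library. This is a minimal illustration and not the WinRATS code used for the reported simulations. The function names, the start-up value for the recursion, and the seeds are our own assumptions; the 200-observation burn-in follows the description in Section 2.3.

```python
import math
import random


def draw_innovation(rng, dist):
    """Draw one zero-mean, unit-variance innovation: N(0,1) or standardised t5."""
    z = rng.gauss(0.0, 1.0)
    if dist == "normal":
        return z
    if dist == "t5":
        nu = 5.0
        chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu degrees of freedom
        student_t = z / math.sqrt(chi2 / nu)
        return student_t * math.sqrt((nu - 2.0) / nu)  # rescale to unit variance
    raise ValueError("dist must be 'normal' or 't5'")


def simulate_pgarch11(n, phi, alpha, beta=0.7, dist="normal", burn_in=200, seed=0):
    """Simulate (eps, h) from a two-stage PGARCH(1,1); stage 1 applies at odd t."""
    rng = random.Random(seed)
    # Hypothetical start-up value: variance implied by the stage-averaged parameters.
    h_prev = (sum(phi) / 2.0) / (1.0 - sum(alpha) / 2.0 - beta)
    eps_prev_sq = h_prev
    eps, h = [], []
    for step in range(burn_in + n):
        t = step - burn_in + 1  # t = 1 is the first retained observation
        stage = 0 if t % 2 == 1 else 1
        h_t = phi[stage] + alpha[stage] * eps_prev_sq + beta * h_prev
        eps_t = math.sqrt(h_t) * draw_innovation(rng, dist)
        if t >= 1:
            eps.append(eps_t)
            h.append(h_t)
        h_prev, eps_prev_sq = h_t, eps_t ** 2
    return eps, h


if __name__ == "__main__":
    # DGP 1: periodic intercept, as in (4a); DGP 2: periodic alpha, as in (4b).
    eps1, h1 = simulate_pgarch11(5000, phi=(0.05, 0.01), alpha=(0.15, 0.15))
    eps2, h2 = simulate_pgarch11(5000, phi=(0.05, 0.05), alpha=(0.15, 0.05), dist="t5")
    print(sum(h1) / len(h1), sum(h2) / len(h2))
```

Passing the stage-specific parameters as pairs keeps one routine for both DGPs: DGP 1 varies only `phi`, DGP 2 only `alpha`.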

Given the high-frequency observations, we aggregate them into relatively low-frequency observations, the aggregation interval being equal to two consecutive high-frequency intervals in length to avoid any aliasing problem.

Low-Frequency Observations (in-sample size: 2480, out-of-sample size: 20)

$$y_\lambda = \sum_{i=1}^{2} y_{2(\lambda-1)+i} = \sum_{i=1}^{2} \varepsilon_{2(\lambda-1)+i} = \varepsilon_\lambda, \quad \text{for } \lambda = 1 \text{ to } 2500, \text{ and} \tag{5}$$

$$h_\lambda = \sum_{i=1}^{2} h_{2(\lambda-1)+i}, \quad \text{for } \lambda = 1 \text{ to } 2500, \tag{6}$$

where subscripts λ and t refer, respectively, to the low-frequency time scale and the high-frequency time scale.
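The aggregation in (5) and (6) amounts to summing non-overlapping pairs of consecutive high-frequency values. A minimal Python sketch follows; the function name is hypothetical, and dropping an incomplete trailing block is our own assumption.

```python
def aggregate_flow(series, m=2):
    """Sum non-overlapping blocks of m consecutive observations.

    An incomplete trailing block, if any, is dropped.
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    n_blocks = len(series) // m
    return [sum(series[m * k: m * (k + 1)]) for k in range(n_blocks)]


if __name__ == "__main__":
    high_frequency = [0.1, -0.2, 0.3, 0.4, -0.5, 0.6]
    print(aggregate_flow(high_frequency))  # three low-frequency observations
```

Applied to 5000 high-frequency observations with m = 2, the function returns the 2500 low-frequency observations referred to above. The same routine serves for both the innovations and the conditional variances.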

3 Academic research has also been committed to periodicity in the conditional mean of financial asset returns. However, significant periodic patterns in the conditional mean of any particular financial asset would imply arbitrage opportunities, which would disappear as soon as they are perceived via intensive trading on that asset. In other words, a periodic pattern in the conditional mean would be very short-lived. We thus do not include this possibility in the mean equation here.

4 A Gaussian disturbance is used to examine whether model mis-specification or fat-tailedness in the pre-filtered data is the major contributor to the size of the fourth moment of the filtered data.

5 Similar to the definitions of the three types of GARCH processes described in Section 1, assuming an i.i.d. distribution for a PGARCH(1,1) DGP will cast it into the category of strong PGARCH processes.

6 In order to take care of the fat tails of the financial asset return distribution often observed in empirical studies, we also employ a Student’s t5 as the driving disturbances.

According to DN (1993), strong GARCH processes will become weak GARCH processes under temporal aggregation in normal context. It is of interest to know whether a high-frequency GARCH process, characterised by a periodic pattern in the parameters, will also become a weak GARCH one upon temporal aggregation. We speculate that the answer is positive since the assumption of i.i.d. for the high-frequency strong GARCH(1,1) process might not hold under temporal aggregation.

Taking into account the choice of high-frequency driving disturbances between N(0,1) and t5 doubles the number of cases examined. Table 2 below summarises the settings of all four experiments.

Table 2: Specifications of the experiments

Case No     DGP No     Driving Disturbances
Case 1      DGP 1      N(0,1)
Case 2      DGP 2      N(0,1)
Case 3      DGP 1      t5
Case 4      DGP 2      t5

2.2 Model Specification and Forecasting Procedures

The standard GARCH(1,1) model assuming constant parameters is fitted to both the high-frequency and the aggregated low-frequency observations. It is specified as follows.

strong standard GARCH(1,1) for high-frequency observations:

$$y_t = \varepsilon_t = \xi_t \sigma_t, \tag{7a}$$

where $\xi_t$ ~ i.i.d. standardised $t_{\nu_1}$ or N(0,1),

$$\sigma_t^2 = \phi^H + \alpha_1^H \varepsilon_{t-1}^2 + \beta_1^H \sigma_{t-1}^2, \tag{7b}$$

and superscript H stands for high-frequency.

strong standard GARCH(1,1) for aggregated observations:

For ease of reference, let the superscript L stand for low-frequency. Substitute L for superscript H in (7a) and (7b), and rewrite them as⁷

$$y_\lambda = \varepsilon_\lambda = \xi_\lambda \sigma_\lambda, \tag{8a}$$

where $\xi_\lambda$ ~ i.i.d. standardised $t_{\nu_2}$ or N(0,1), and

$$\sigma_\lambda^2 = \phi_{(2)}^L + \alpha_{1(2)}^L \varepsilon_{\lambda-1}^2 + \beta_{1(2)}^L \sigma_{\lambda-1}^2, \tag{8b}$$

where the parenthesised 2 in the subscripts denotes the number of high-frequency periods within each aggregation interval.

There are two points worth noting: i) in the case of t5 disturbances, the two numbers of degrees of freedom v1 (for the HF GARCH(1,1)) and v2 (for the LF GARCH(1,1)) are also part of the parameters under estimation by the ML method; ii) both the HF and LF GARCH(1,1) models are strong GARCH models since they both require the assumption of i.i.d. for the return innovations.

7 We also conducted a separate set of experiments allowing for a constant term in (8a), but did not find the constant’s estimate to be significantly different from zero. Nor did any subsequent results differ from those obtained under a zero constant term.

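implied Aggregated Weak GARCH(1,1):

Thanks to the DN aggregation theory, the aggregated weak GARCH(1,1) model is given by

$$\zeta_\lambda^2 = \phi_{(2)}^{DN} + \alpha_{1(2)}^{DN} \varepsilon_{\lambda-1}^2 + \beta_{1(2)}^{DN} \zeta_{\lambda-1}^2, \tag{9}$$

where $\zeta_\lambda^2$ denotes the linear projection of the squared low-frequency innovation, $\varepsilon_\lambda^2$, on the Hilbert space spanned by {1, $\varepsilon_{\lambda-1}$, $\varepsilon_{\lambda-2}$, …, $\varepsilon_{\lambda-1}^2$, $\varepsilon_{\lambda-2}^2$, …}.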

2.3 Reported Summary Statistics

In Table 3 below, summary statistics will be reported including the ML estimates of the HF and LF GARCH(1,1) model parameters, the degrees of freedom of the driving disturbances (for the t5 cases), the parameters of the DN implied low-frequency GARCH(1,1) model, the unconditional variance implied by the model parameter estimates, and the unconditional kurtosis.

The data are generated under each DGP so that 5000 high-frequency (pre-aggregation) observations are available for each replication. To reduce the impact of initial values, the DGP is run for 200 observations before sampling begins. For each simulation, 10000 replications (N = 10000) are completed.⁸

The mis-specified strong standard GARCH(1,1) filter is estimated by the quasi-maximum likelihood⁹ method using the BFGS algorithm. The initial values are the average of each parameter across the two stages of the PGARCH(1,1) DGP (for high-frequency estimation) and the corresponding DN implied parameters (for low-frequency estimation). WinRATS version 5.03 is used to perform the simulations and calculations on 4 PCs with Intel PIII 733K (and faster) CPUs.

3. Results

This section discusses our simulation results for the four cases in Table 2 above. But to help understand the impacts of volatility periodicity on the ML estimates of the mis-specified strong GARCH(1,1) model, we start with examination of the model estimates under standard no-periodicity conditions. In particular, we want to know how well the ML method estimates the parameters in such a context. Section 3.1 below reports and discusses the averaged biases of the ML estimates of the strong standard GARCH(1,1) filter for both high-frequency and aggregated (low-frequency) observations from their true values given no patterns of volatility periodicity in the DGP.

3.1 Biases of the ML Parameter Estimates of the Well-Specified Strong Standard GARCH(1,1) Filters under No-Periodicity Standard Circumstances

8 We tried 1000, 5000, and 10,000 replications for a few of the experiments, and the results suggest that 10,000 is the minimum needed to generate consistent results across different seed numbers for the t cases. Despite efficient code and fast machines, however, the steps involved (data generation, ML estimation of the strong standard GARCH(1,1) filter for both the HF and LF observations, calculation of the aggregated weak GARCH(1,1) model, and volatility forecasting and its evaluation) for a sample size of 5000 take a non-trivial amount of processing time given this number of replications.

9 The log-likelihood function for a $t_\nu$ distribution is formulated as follows:

$$\log l_t = \ln\Gamma\!\left(\frac{v+1}{2}\right) - \ln\Gamma\!\left(\frac{v}{2}\right) - \frac{1}{2}\ln(v-2) - \frac{1}{2}\ln h_t - \frac{v+1}{2}\,\ln\!\left(1 + \frac{\varepsilon_t^2}{(v-2)\,h_t}\right),$$

where ln Γ(x) denotes the natural logarithm of Γ(x), and v the number of degrees of freedom under estimation.
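For readers who wish to experiment, the per-observation log-likelihood of footnote 9 can be written in Python as below, together with a helper that accumulates it over a GARCH(1,1) variance recursion. This is a sketch under stated assumptions rather than the WinRATS implementation: the function names and the start-up variance (the sample mean of squared innovations) are our own choices, and the formula omits the additive constant −½ ln π, which does not affect the maximiser.

```python
import math


def t_loglik_obs(eps, h, nu):
    """Log-likelihood of one observation under a standardised t with nu d.o.f.

    The additive constant -0.5 * ln(pi) is omitted, as in footnote 9.
    """
    if nu <= 2.0 or h <= 0.0:
        raise ValueError("need nu > 2 and h > 0")
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu - 2.0) - 0.5 * math.log(h)
            - 0.5 * (nu + 1.0) * math.log(1.0 + eps ** 2 / ((nu - 2.0) * h)))


def garch11_t_loglik(params, eps_series):
    """Total log-likelihood of a GARCH(1,1) filter with standardised t errors."""
    phi, alpha, beta, nu = params
    # Hypothetical start-up: initialise both lags at the sample variance.
    h = sum(e * e for e in eps_series) / len(eps_series)
    prev_sq = h
    total = 0.0
    for e in eps_series:
        h = phi + alpha * prev_sq + beta * h
        total += t_loglik_obs(e, h, nu)
        prev_sq = e * e
    return total


if __name__ == "__main__":
    sample = [0.1, -0.2, 0.3, -0.1, 0.25]
    print(garch11_t_loglik((0.05, 0.15, 0.7, 5.0), sample))
```

The negative of `garch11_t_loglik` could then be passed to a numerical optimiser such as a BFGS routine, with v estimated alongside the GARCH parameters.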


We use the strong standard GARCH(1,1) filter to estimate the in-sample 4960 high-frequency observations produced by a strong standard GARCH(1,1) DGP which is purged of periodic variation in its parameters. The same strong standard GARCH(1,1) filter is then estimated on the aggregated 2480 observations, the aggregation interval being two high-frequency observation periods in length. These estimations are replicated 10000 times for each of Parameterisations 1 to 3 under both N(0,1) and t5 driving disturbances. Taken together, there are six sets of Monte Carlo simulations that lead to the results summarised in Tables A.1 and A.2 in Appendix A, which present the true values and the ML estimates of the strong standard GARCH(1,1) filter for the high-frequency and the low-frequency observations respectively.

Before detailed discussion of results in Tables A.1 and A.2 in Appendix A, it is interesting to note that the average biases of the parameter estimates are generally smaller under t5 disturbances than under N(0,1) disturbances for both the high-frequency and aggregated low-frequency observations.

Discussions of Table A.1 in Appendix A: high-frequency estimates

It is obvious from Table A.1 that, despite statistically significant average biases, the reported high-frequency estimates are quite close to their true values under both sets of driving disturbances. The few exceptions concern the estimates of beta and of the unconditional kurtosis in the t5 cases:

(a) $\beta_1^H$ tends to be much more underestimated under Parameterisation 3 than under Parameterisation 1, irrespective of the driving disturbances. Since the only difference between Parameterisations 1 and 3 is the size of alpha, $\alpha_1^H$ being 0.15 under Parameterisation 1 and 0.05 under Parameterisation 3, this suggests that beta tends to be more underestimated as the true value of alpha decreases. Since alpha is correctly estimated, the apparent underestimation of beta under Parameterisation 3 causes the level of integratedness, $\alpha_1^H + \beta_1^H$, to be underestimated to a larger extent under Parameterisation 3 than under Parameterisation 1.

(b) The average non-parametric sample estimate of the unconditional kurtosis of the standardised high-frequency innovations, $\hat{k}_\xi^H$, tends to be significantly underestimated under t5 disturbances across all parameterisations. We doubt that estimation biases in the parameters of the standard HF GARCH(1,1) filter lead to the downward bias of $\hat{k}_\xi^H$ in the t5 cases.

To see why the biases of the parameter estimates are unlikely to be the reason, we compare them across the two disturbances in Table A.1. In general, the biases of $\hat{\phi}^H$, $\hat{\alpha}_1^H$, and $\hat{\beta}_1^H$ in the case of t5 disturbances are slightly smaller than their counterparts in the case of N(0,1) disturbances. Under both disturbances, moreover, the biases are all practically trivial, although some of them are statistically significant. This finding rules out the possibility that estimation biases of the parameters cause the under-estimation of $k_\xi^H$ under t5 disturbances. As a matter of fact, the strong standard GARCH(1,1) filter for the high-frequency observations is correctly estimated under both disturbances. It is therefore likely that the unknown statistical property of the non-parametric sample estimate of unconditional kurtosis, i.e.

$$\hat{k}_\xi^H = \frac{\frac{1}{T}\sum_{t=1}^{T} \hat{\xi}_t^4}{\left[\frac{1}{T}\sum_{t=1}^{T} \hat{\xi}_t^2\right]^2},$$

is responsible for the under-estimation of $k_\xi^H$ in the t5 cases, even though the strong standard GARCH(1,1) filter for the high-frequency observations is correctly specified and estimated. In other words, it should be part of the unknown statistical properties of $\hat{k}_\xi^H$ that the larger the fourth moment of the driving disturbances, the further its non-parametric sample estimate falls below its true value. However, verification of this argument relies on the statistical properties of this estimator, which are not available.

However, if we look at the estimate of the number of degrees of freedom for the high-frequency standardised innovations in the t5 cases, i.e. $\hat{v}_1$, it is very close to its true value, 5, across all parameterisations, despite statistically significant but practically trivial positive biases. Given that the unconditional kurtosis of a standardised t distribution with v degrees of freedom is equal to $3(v-2)/(v-4)$, v = 5 corresponds to a kurtosis of 9. Therefore, $\hat{v}_1 \approx 5$ in Table A.1 implies $\hat{k}_\xi^H \approx 9$. For example, $\hat{v}_1$ is 5.04 under all three parameterisations in Table A.1, which implies $\hat{k}_\xi^H = 3(5.04-2)/(5.04-4) = 8.77$, much closer to its true value, 9, than is the non-parametric sample estimate of $k_\xi^H$.
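The two kurtosis measures compared above are easy to reproduce. The short Python sketch below evaluates the kurtosis implied by a given number of degrees of freedom and the non-parametric sample kurtosis based on raw second and fourth moments; the function names are hypothetical.

```python
def t_kurtosis(nu):
    """Kurtosis of a t distribution with nu degrees of freedom: 3(nu - 2)/(nu - 4)."""
    if nu <= 4.0:
        raise ValueError("kurtosis exists only for nu > 4")
    return 3.0 * (nu - 2.0) / (nu - 4.0)


def sample_kurtosis(xi):
    """Non-parametric kurtosis: mean of xi^4 divided by squared mean of xi^2."""
    n_obs = len(xi)
    fourth_moment = sum(x ** 4 for x in xi) / n_obs
    second_moment = sum(x ** 2 for x in xi) / n_obs
    return fourth_moment / second_moment ** 2


if __name__ == "__main__":
    print(t_kurtosis(5.0))             # true value for t5
    print(round(t_kurtosis(5.04), 2))  # value implied by the estimate 5.04
```

Because the implied kurtosis falls steeply as v moves away from 4, even a small upward bias in the estimated degrees of freedom lowers it noticeably, as the example of 5.04 shows.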

In addition to the two points above, there are some other aspects in Table A.1 worth noting:

(i) The level of integratedness, $\alpha_1^H + \beta_1^H$, is well estimated under Parameterisations 1 and 2 under both disturbances. Under Parameterisation 3, although $\alpha_1^H + \beta_1^H$ is significantly underestimated, the downward biases under both types of disturbances are only as small as three hundredths of the true value in size. In general, therefore, the ML method is able to correctly estimate the level of integratedness of a strong GARCH(1,1) process. This result will be used as a benchmark, against which the distortion of the estimate of $\alpha_1^H + \beta_1^H$ in the context of volatility periodicity will be compared.

(ii) The asymptotic innovation variance implied by the parameter estimates, $(\hat{\sigma}^H)^2 = \hat{\phi}^H / (1 - \hat{\alpha}_1^H - \hat{\beta}_1^H)$, is found to be almost equal to its true value across all three parameterisations and the two disturbances. This result signals one of the advantages of using the ML method in GARCH model estimation. But note that this might not be the case when the true t-likelihood is replaced by the Gaussian likelihood under t5 DGPs.

(iii) Finally, the fourth moment of the driving disturbances of the DGP does not seem to affect the fact that the high-frequency parameters of the GARCH(1,1) filter are accurately estimated by the ML method. In fact, the extent to which $\hat{\phi}^H$, $\hat{\alpha}_1^H$, $\hat{\beta}_1^H$, $\hat{\alpha}_1^H + \hat{\beta}_1^H$, and $(\hat{\sigma}^H)^2$ are close to their true values is virtually the same across both driving disturbances, as reported in Table A.1. This result might be due to the use of the actual likelihood for both disturbances in the ML method. However, this might not be the case had we applied the quasi-likelihood method to the t5 cases. In other words, if we assume a Gaussian likelihood for the t5 cases, the parameter estimates might not be as accurate as those obtained with the correct t likelihood.

Discussions of Table A.2 in Appendix A: low-frequency estimates

(i) The results in Table A.2 suggest that, despite the significant average biases, most of the low-frequency parameter estimates are close to their true values. These accurate estimates justify the usefulness of the ML method in estimating the GARCH parameters. However, there are some exceptions under Parameterisation 3 for both disturbances, where $\hat{\phi}^L$ is clearly biased upwards from its true value and $\hat{\beta}_1^L$ (and thus $\hat{\alpha}_1^L + \hat{\beta}_1^L$) is biased downwards. These biases could be due to the relatively small $\alpha_1^H$ under Parameterisation 3, 0.05, compared with 0.15 under Parameterisations 1 and 2. The theoretical value of $\alpha_1^L$ is thus smaller under Parameterisation 3, i.e. 0.038 under N(0,1) and 0.057 under t5. These two values are dwarfed, respectively, by 0.126 under N(0,1) and 0.173 under t5 for both Parameterisations 1 and 2. Since the association between a larger downward bias of the beta estimate and a smaller alpha in the GARCH(1,1) DGP is also observed in Table A.1 for the high-frequency estimates, we speculate that a smaller alpha in the GARCH(1,1) DGP renders the ML estimation less reliable. One possible explanation is that it is more difficult for the ML method to distinguish the long-term variation from the short-term variation in the conditional innovation variances when the true alpha of the GARCH(1,1) DGP is smaller.

(ii) Turning to the obvious upward bias of $\hat{\phi}^L$ from its true value under Parameterisation 3 for both disturbances, this result should be discussed along with the estimate of the asymptotic low-frequency innovation variance, i.e. $(\hat{\sigma}^L)^2$. It can be seen from Table A.2 that $(\hat{\sigma}^L)^2$ is almost equal to its true value irrespective of the parameterisation and disturbance distribution. The accurate estimation of $(\sigma^L)^2$ explains the upward bias of $\hat{\phi}^L$ under Parameterisation 3, which offsets the downward bias of $\hat{\beta}_1^L$. It seems that the ML method is always able to capture the level of the asymptotic innovation variance of a strong GARCH(1,1) process despite biased parameter estimates. The correct estimation of the asymptotic innovation variance by the ML method is also reflected in the fact that the ratio of $(\hat{\sigma}^L)^2$ to $(\hat{\sigma}^H)^2$, about 2 to 1 across all parameterisations with both disturbances, corresponds exactly to the ratio of $(\sigma^L)^2$ to $(\sigma^H)^2$ and to the aggregation frequency m = 2.

(iii) It is also interesting to note that the level of integratedness of the LF GARCH(1,1) model, $\hat{\alpha}_1^L + \hat{\beta}_1^L$, is clearly much more underestimated than that of the HF GARCH(1,1) model. This could be caused by model mis-specification. In particular, the aggregated low-frequency observations are no longer a strong GARCH(1,1) process, but rather a weak GARCH(1,1) process. It is thus not appropriate to fit a strong GARCH model to the low-frequency observations. However, there are hitherto no approaches to estimating a weak GARCH model. One can only use the ML method to give the ‘best sample-based’ parameterisation of the linear projection of a weak GARCH process. Under such model mis-specification for the low-frequency observations, it is not surprising to see larger biases in the parameters of the LF strong GARCH(1,1) model than in those of the HF GARCH(1,1) model.

(iv) Finally, it is worth noting that the value of $\hat{v}_2$ appears to be inversely related to the size of the true $\alpha_1^H$. In particular, $\hat{v}_2$ is 5.254 under both Parameterisations 1 and 2, where $\alpha_1^H$ is set equal to 0.15. Under Parameterisation 3, where $\alpha_1^H$ = 0.05, the value of $\hat{v}_2$ goes up to 6.475. Despite the lack of a theoretical value for $v_2$, the $\hat{v}_2$ given by the strong standard GARCH(1,1) filter might be close to the unknown true $v_2$, since the same filter successfully estimates its high-frequency counterpart $v_1$. Given the formula above for the unconditional kurtosis of a standardised t distribution with v degrees of freedom, the smaller the true alpha of a high-frequency GARCH(1,1) process, other things equal, the smaller will be the fourth moment of the aggregated process standardised by the aggregated GARCH(1,1) model.

Having analysed the estimation results of the standard GARCH(1,1) model in the no-periodicity benchmark condition, the next section will discuss these same issues but in the context of model mis-specification. In other words, we will explore the effects of periodicity in the parameters of the GARCH(1,1) DGP on the estimation of both HF and LF strong standard GARCH(1,1) models.

3.2 Effects of Volatility Periodicity on Model Estimation of the Mis-Specified GARCH(1,1) Filters

To see how the volatility periodicity affects the ML parameter estimates of the mis-specified strong standard GARCH(1,1) models, we devise two types of volatility periodicities through DGPs 1 and 2. The details of both DGPs and the GARCH models used for the conditional variance estimation have been explained in Section 2. Discussions of the simulation results reported in Table A.3 in Appendix A will be categorised according to DGPs 1 and 2.

Since these two DGPs are based upon a two-stage PGARCH(1,1) model, it might be informative to examine the true unconditional variance of each stage of a two-stage PGARCH(1,1) model in (4a) and (4b). In particular, let $h_t = \phi_{s(t)}^H + \alpha_{1s(t)}^H \varepsilon_{t-1}^2 + \beta_{1s(t)}^H h_{t-1}$ denote a high-frequency two-stage PGARCH(1,1) model, where s(t) = 1 for odd t and 2 for even t. We have shown in Appendix B that the high-frequency unconditional innovation variance is given by

$$h_{odd}^H = \frac{\phi_1^H + \phi_2^H(\alpha_{11}^H + \beta_{11}^H)}{1 - (\alpha_{11}^H + \beta_{11}^H)(\alpha_{12}^H + \beta_{12}^H)} \quad \text{for odd } t, \text{ and} \tag{10a}$$

$$h_{even}^H = \frac{\phi_2^H + \phi_1^H(\alpha_{12}^H + \beta_{12}^H)}{1 - (\alpha_{11}^H + \beta_{11}^H)(\alpha_{12}^H + \beta_{12}^H)} \quad \text{for even } t. \tag{10b}$$
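As a rough numerical check of (10a) and (10b), the sketch below evaluates both stage variances and confirms that they satisfy the one-step recursions in expectation, $h_{odd}^H = \phi_1^H + (\alpha_{11}^H+\beta_{11}^H)h_{even}^H$ and $h_{even}^H = \phi_2^H + (\alpha_{12}^H+\beta_{12}^H)h_{odd}^H$. The function name is hypothetical, and the inputs are the DGP 1 values used in this section.

```python
def stage_variances(phi1, phi2, a11, b11, a12, b12):
    """Unconditional variances (h_odd, h_even) from equations (10a) and (10b)."""
    p1 = a11 + b11  # level of integratedness in stage 1 (odd t)
    p2 = a12 + b12  # level of integratedness in stage 2 (even t)
    if not 0.0 <= p1 * p2 < 1.0:
        raise ValueError("the process is not covariance stationary")
    denom = 1.0 - p1 * p2
    h_odd = (phi1 + phi2 * p1) / denom
    h_even = (phi2 + phi1 * p2) / denom
    return h_odd, h_even


# DGP 1: periodic intercept, constant alpha and beta.
h_odd, h_even = stage_variances(0.05, 0.01, 0.15, 0.7, 0.15, 0.7)

# Each stage variance should reproduce itself through the other stage.
assert abs(h_odd - (0.05 + 0.85 * h_even)) < 1e-12
assert abs(h_even - (0.01 + 0.85 * h_odd)) < 1e-12

print(h_odd, h_even, h_odd + h_even)
```

The stage 1 variance exceeds the stage 2 variance here because stage 1 carries the larger intercept, while the two variances sum to the aggregated value used later in the section.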

Suppose we aggregate the high-frequency PGARCH(1,1) process given by $h_t = \phi_{s(t)}^H + \alpha_{1s(t)}^H \varepsilon_{t-1}^2 + \beta_{1s(t)}^H h_{t-1}$ into a low-frequency process, the aggregation interval being two high-frequency observation periods in length. Under the assumption of i.i.d. high-frequency driving disturbances, the unconditional variance for the aggregated innovations, $\varepsilon^{\lambda}$, is simply the sum of $h_{odd}^H$ and $h_{even}^H$. That is,

$$h^{\lambda} = h_{odd}^H + h_{even}^H = \frac{\phi_1^H(1 + \alpha_{12}^H + \beta_{12}^H) + \phi_2^H(1 + \alpha_{11}^H + \beta_{11}^H)}{1 - (\alpha_{11}^H + \beta_{11}^H)(\alpha_{12}^H + \beta_{12}^H)}. \tag{11}$$

In passing, note that the aggregated low-frequency process, $\varepsilon^{\lambda}$, is covariance stationary as long as $0 \le (\alpha_{11}^H + \beta_{11}^H)(\alpha_{12}^H + \beta_{12}^H) < 1$. We refer to BG (1996) for the proof. An implication of this condition is that either $\alpha_{11}^H + \beta_{11}^H$ (the level of integratedness for stage one) or $\alpha_{12}^H + \beta_{12}^H$ (the level of integratedness for stage two) can be larger than 1 as long as their product falls between 0 and 1.
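The stationarity condition can be illustrated with a minimal sketch. The parameter values below are hypothetical and are not taken from the paper's DGPs; they are chosen only so that one stage has a level of integratedness above 1.

```python
def is_covariance_stationary(a11, b11, a12, b12):
    """Check 0 <= (a11 + b11) * (a12 + b12) < 1 for a two-stage PGARCH(1,1)."""
    product = (a11 + b11) * (a12 + b12)
    return 0.0 <= product < 1.0


# Hypothetical case: stage one alone is explosive (0.25 + 0.80 = 1.05),
# but stage two (0.15 + 0.70 = 0.85) pulls the product below 1.
explosive_stage_one = is_covariance_stationary(0.25, 0.80, 0.15, 0.70)

# Hypothetical case: both stages exceed 1, so the product does too.
both_explosive = is_covariance_stationary(0.25, 0.80, 0.30, 0.75)

print(explosive_stage_one, both_explosive)
```

The first case passes the check and the second fails, which matches the remark that only the product of the two stage-level sums matters.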

Despite the lack of analytical results10 for the parameters of a PGARCH(1,1) process under temporal aggregation, the formula for hλ in (11) tends to suggest that:

(a) the aggregated process is a weak GARCH(1,1) process, (12a)

(b) following (a), the aggregated intercept $\phi_{(2)}^L$ is equal to $\phi_1^H(1 + \alpha_{12}^H + \beta_{12}^H) + \phi_2^H(1 + \alpha_{11}^H + \beta_{11}^H)$, (12b)

and

(c) following (a), the level of integratedness $\alpha_{1(2)}^L + \beta_{1(2)}^L$ of the aggregated low-frequency weak GARCH(1,1) process is equal to $(\alpha_{11}^H + \beta_{11}^H)(\alpha_{12}^H + \beta_{12}^H)$. (12c)

10 The DN aggregation theory does not apply to a periodic-GARCH (or PGARCH) process, albeit it is possible to extend the theory to a PGARCH version.

Argument (12a) above might be a straightforward result of the fact that we set the length of the aggregation interval equal to the length of each periodic cycle of the high-frequency PGARCH(1,1) DGPs. That is, the aggregated observations do not show any periodicity in the parameters of the GARCH specification governing their conditional innovation variances. We will examine the validity of arguments (12b) and (12c) above via our simulation results for DGPs 1 and 2 as reported in Table A.3 below.
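The unconditional variance in (11) can also be checked by a small simulation. The sketch below is not the Monte Carlo design of the paper: it assumes N(0,1) disturbances, aggregates by summing each pair of consecutive high-frequency innovations, and uses a hypothetical function name, seed, sample size and burn-in. It only looks at the sample variance of the aggregated series and says nothing about (12b) or (12c) separately.

```python
import random


def simulate_aggregated_variance(phi, alpha, beta,
                                 n_pairs=100_000, burn_in=1_000, seed=12345):
    """Sample variance of two-period sums from a two-stage PGARCH(1,1).

    phi, alpha and beta are (stage 1, stage 2) pairs; stage 1 applies to odd t.
    """
    rng = random.Random(seed)
    p1 = alpha[0] + beta[0]
    p2 = alpha[1] + beta[1]
    # Start from the unconditional even-t variance, equation (10b).
    h_prev = (phi[1] + phi[0] * p2) / (1.0 - p1 * p2)
    eps_prev_sq = h_prev
    total = 0.0
    for k in range(burn_in + n_pairs):
        pair_sum = 0.0
        for s in (0, 1):  # odd t (stage 1), then even t (stage 2)
            h = phi[s] + alpha[s] * eps_prev_sq + beta[s] * h_prev
            eps = (h ** 0.5) * rng.gauss(0.0, 1.0)
            pair_sum += eps
            h_prev, eps_prev_sq = h, eps * eps
        if k >= burn_in:
            total += pair_sum * pair_sum  # innovations have zero mean
    return total / n_pairs


# DGP 1 values; formula (11) gives 0.4 for the aggregated variance.
sample_var = simulate_aggregated_variance((0.05, 0.01), (0.15, 0.15), (0.7, 0.7))
print(sample_var)
```

With this many pairs the sample variance should settle near the value implied by (11), up to simulation noise.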

Discussions of Table A.3 in Appendix A:

Results of the cases for DGPs 1 and 2 in Table A.3 suggest that the Monte Carlo standard deviations of the high-frequency parameter estimates and of the implied low-frequency parameters are clearly smaller than those of the low-frequency parameter estimates. For example, in case 1 the standard deviation is 0.005 for $\hat{\phi}^H$ and 0.016 for $\phi_{(2)}^{DN}$, whereas it is 0.035 for $\hat{\phi}_{(2)}^L$. This finding might be explained by the number of observations used for the model estimation. In particular, 4960 observations are used for the in-sample estimation of the HF standard GARCH(1,1) filter, contrasting with the 2480 observations used for the in-sample estimation of the LF standard GARCH(1,1) filter. Intuitively, the more data are used for ML estimation in each replication, the smaller will be the variation of the model parameter estimates across replications. Since the parameters of the aggregated low-frequency weak GARCH(1,1) model are implied by the parameter estimates of the HF strong GARCH(1,1) model, which are based on the 4960 observations, it is not surprising to see a smaller standard deviation for $\phi_{(2)}^{DN}$ than for $\hat{\phi}_{(2)}^L$. Similar results are found in cases 2, 3, and 4, and for the estimates of alpha, beta, the number of degrees of freedom, and the unconditional innovation variance implied by the parameter estimates.

a. Impacts on the Average ML Estimated Intercepts of the Strong Standard GARCH(1,1) Model for both HF and LF Observations

(i) $\hat{\phi}^H$: In cases 1 and 3, where DGP 1 in (4a) is employed to generate a two-stage periodicity in the intercept $\phi^H$, the value of $\hat{\phi}^H$ of the mis-specified HF GARCH(1,1) model is found to be 0.03, a value equal to the average of the true $\phi^H$ of the two stages, 0.05 and 0.01. This result indicates that the ML method tends to take equal account of the high intercept (and thus the high unconditional innovation variance) in stage 1 and the low intercept (and thus the low unconditional innovation variance) in stage 2 when maximising the likelihood. Since this is observed in both case 1 (N(0,1) disturbances) and case 3 (t5 disturbances), it suggests that the size of the fourth moment of the driving disturbances does not seem to play a role in the ML estimation of the intercept of the mis-specified HF GARCH(1,1) model.

Turning to cases 2 and 4, where DGP 2 in (4b) is used to generate two-stage periodicity in the parameter $\alpha_1^H$, the estimated $\hat{\phi}^H$ is found to be 0.052 in both cases. This estimate is very close to the true value shared by the two stages, i.e. $\phi_1^H = \phi_2^H = \phi^H = 0.05$. It appears to suggest that, provided that the intercept is not mis-specified in the standard GARCH(1,1) model, the ML method provides an accurate intercept estimate, in that mis-specification of the alpha parameter does not bias the intercept estimate.

Also, since the PGARCH(1,1) DGPs in (4a) and (4b) consist of Parameterisations 1 and 2 or of Parameterisations 1 and 3 in Table 1, it might be interesting to compare the values of $\hat{\phi}^H$ in Panel A of Table A.3 to their counterparts in Table A.1 in Appendix A. For example, since DGP 1 used in cases 1 and 3 consists of Parameterisations 1 (for stage 1) and 2 (for stage 2) in Table 1 above, we expect the intercept estimate, $\hat{\phi}^H$, of the mis-specified GARCH(1,1) filter to be the average of $\hat{\phi}^H$ under Parameterisation 1, 0.051, and $\hat{\phi}^H$ under Parameterisation 2, 0.0102.

Indeed, $\hat{\phi}^H$ is 0.03 in case 1, virtually equal to the average of 0.051 and 0.0102. Similar results can be observed with cases 2, 3, and 4. In line with the argument above, this tends to indicate that the ML method gives equal weight to the true intercept of each stage of the PGARCH(1,1) DGP when estimating the mis-specified standard GARCH(1,1) filter.

(ii) $\hat{\phi}_{(2)}^L$: We have argued above that the aggregated observations might follow a standard weak GARCH(1,1) process. In this context, the standard LF strong GARCH(1,1) model might not be regarded as mis-specified, although the ML method is not tailored for estimating a weak GARCH process. However, DN have documented that the ML estimates of a weak GARCH process are not biased from their true values to a significant extent. As a result, the values of $\hat{\phi}_{(2)}^L$, $\hat{\alpha}_{1(2)}^L$, and $\hat{\beta}_{1(2)}^L$ are expected to be close to their true values.

In cases 1 and 3 using DGP 1 in (4a), where $\phi_1^H = 0.05$, $\phi_2^H = 0.01$, $\alpha_{11}^H = \alpha_{12}^H = 0.15$, and $\beta_{11}^H = \beta_{12}^H = 0.7$, the formula in (11) gives

$$h_{(2)}^L = \frac{\phi_1^H(1 + \alpha_{12}^H + \beta_{12}^H) + \phi_2^H(1 + \alpha_{11}^H + \beta_{11}^H)}{1 - (\alpha_{11}^H + \beta_{11}^H)(\alpha_{12}^H + \beta_{12}^H)} = \frac{0.05(1 + 0.15 + 0.7) + 0.01(1 + 0.15 + 0.7)}{1 - (0.15 + 0.7)(0.15 + 0.7)} = \frac{0.111}{1 - 0.7225} = 0.4.$$

The closeness of the numerator, 0.111, to the estimated $\hat{\phi}_{(2)}^L$, 0.117, in cases 1 and 3 reported in Panel B of Table A.3 supports argument (12b). That is, the true $\phi_{(2)}^L$ of the aggregated weak GARCH(1,1) process might well be equal to $\phi_1^H(1 + \alpha_{12}^H + \beta_{12}^H) + \phi_2^H(1 + \alpha_{11}^H + \beta_{11}^H)$.

To verify the argument in (12b) in the context of cases 2 and 4, we use DGP 2 in (4b), where $\phi_1^H = \phi_2^H = 0.05$, $\alpha_{11}^H = 0.15$, $\alpha_{12}^H = 0.05$, and $\beta_{11}^H = \beta_{12}^H = 0.7$. Formula (11) therefore gives

$$h^{\lambda} = \frac{0.05(1 + 0.05 + 0.7) + 0.05(1 + 0.15 + 0.7)}{1 - (0.15 + 0.7)(0.05 + 0.7)} = \frac{0.18}{1 - 0.6375} = 0.49655.$$

Referring to Panel B of Table A.3, the numerator 0.18 above is also close to the estimated $\hat{\phi}_{(2)}^L$, 0.195 (case 2) and 0.188 (case 4). As a result, the argument in (12b) is more than likely to be true.
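The arithmetic for both DGPs can be reproduced with a short sketch. It simply evaluates the conjectured intercept in (12b), the conjectured level of integratedness in (12c), and their ratio as in (11); the function name is illustrative, and the sketch checks the numbers only, not the conjectures themselves.

```python
def aggregated_moments(phi1, phi2, a11, b11, a12, b12):
    """Return (intercept, integratedness, variance) of the aggregated process."""
    p1 = a11 + b11
    p2 = a12 + b12
    intercept = phi1 * (1.0 + p2) + phi2 * (1.0 + p1)  # conjecture (12b)
    integratedness = p1 * p2                           # conjecture (12c)
    variance = intercept / (1.0 - integratedness)      # equation (11)
    return intercept, integratedness, variance


# DGP 1: periodic intercept; DGP 2: periodic alpha.
dgp1 = aggregated_moments(0.05, 0.01, 0.15, 0.7, 0.15, 0.7)
dgp2 = aggregated_moments(0.05, 0.05, 0.15, 0.7, 0.05, 0.7)

for name, values in (("DGP 1", dgp1), ("DGP 2", dgp2)):
    print(name, [round(v, 5) for v in values])
```

The rounded values agree with the figures quoted in the text: 0.111, 0.7225 and 0.4 for DGP 1, and 0.18, 0.6375 and 0.49655 for DGP 2.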

(iii) $\phi_{(2)}^{DN}$: The average aggregated intercept implied by $\hat{\phi}^H$ of the mis-specified standard HF strong GARCH(1,1) model is seen to be close to $\hat{\phi}_{(2)}^L$ in all four cases discussed above, i.e. cases 1, 2, 3, and 4: see Panel B of Table A.3 in Appendix A. Table 3 below compares the values of $\phi_{(2)}^{DN}$ and $\hat{\phi}_{(2)}^L$ to those of $\phi_1^H(1 + \alpha_{12}^H + \beta_{12}^H) + \phi_2^H(1 + \alpha_{11}^H + \beta_{11}^H)$, denoted $\phi_{(2)}^L$ hereafter, in the four cases:

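Table 3: Comparison of the values of $\hat{\phi}_{(2)}^L$ and $\phi_{(2)}^{DN}$ to $\phi_{(2)}^L$

| Case No (DGP No, Disturbance) | $\hat{\phi}_{(2)}^L$ | $\phi_{(2)}^{DN}$ | $\phi_{(2)}^L = \phi_1^H(1 + \alpha_{12}^H + \beta_{12}^H) + \phi_2^H(1 + \alpha_{11}^H + \beta_{11}^H)$ |
|---|---|---|---|
| 1 (DGP 1, N(0,1)) | 0.117 (0.035) | 0.112 (0.016) | 0.111 |
| 2 (DGP 2, N(0,1)) | 0.195 (0.082) | 0.186 (0.036) | 0.18 |
| 3 (DGP 1, t5) | 0.117 (0.027) | 0.112 (0.016) | 0.111 |
| 4 (DGP 2, t5) | 0.188 (0.059) | 0.185 (0.035) | 0.18 |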

Clearly, the values of $\phi_{(2)}^{DN}$ in all four cases above are closer to the values of $\phi_{(2)}^L$ than are the values of $\hat{\phi}_{(2)}^L$. The implication of this result will be discussed when we move to the results of the estimated level of integratedness of the HF GARCH(1,1) filter.
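The comparison in Table 3 can be made explicit by computing the absolute gap of each estimate from the conjectured intercept. The sketch below uses only the point values from Table 3 and ignores the figures in parentheses; the variable names are illustrative.

```python
# case number: (LF ML estimate, DN-implied intercept, conjectured intercept)
table3 = {
    1: (0.117, 0.112, 0.111),
    2: (0.195, 0.186, 0.18),
    3: (0.117, 0.112, 0.111),
    4: (0.188, 0.185, 0.18),
}

gaps = {}
for case, (ml_lf, dn_implied, conjectured) in table3.items():
    gap_ml = abs(ml_lf - conjectured)
    gap_dn = abs(dn_implied - conjectured)
    gaps[case] = (gap_ml, gap_dn)
    print(f"case {case}: |ML - conj| = {gap_ml:.3f}, |DN - conj| = {gap_dn:.3f}")

# True when the DN-implied value is the closer one in every case.
dn_closer_everywhere = all(gap_dn < gap_ml for gap_ml, gap_dn in gaps.values())
print(dn_closer_everywhere)
```

In every case the DN-implied intercept lies nearer to the conjectured value than the low-frequency ML estimate does, although all gaps are small relative to the magnitudes involved.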

b. Impacts on the Average ML Estimated Dynamics of the Strong Standard GARCH(1,1) Model for both HF and LF Observations

(i) $\hat{\alpha}_1^H$ (reported in Panel A) and $\hat{\beta}_1^H$ (reported in Panel B): By an argument similar to that in the discussion of $\hat{\phi}^H$, we expect the value of $\hat{\alpha}_1^H$ (or $\hat{\beta}_1^H$) to be equal to the average of the two values of $\hat{\alpha}_1^H$ (or $\hat{\beta}_1^H$) under Parameterisations 1 and 2 (for cases 1 and 3) or 1 and 3 (for cases 2 and 4) in Table A.1 in Appendix A. In particular, the value of $\hat{\alpha}_1^H$ is 0.149 in case 1, which is almost the average of the values of $\hat{\alpha}_1^H$, 0.15 under Parameterisation 1 and 0.15 under Parameterisation 2. This is also true in cases 2, 3, and 4.

Similar results are observed with $\hat{\beta}_1^H$. For instance, $\hat{\beta}_1^H$ is 0.692 in case 4 (under DGP 2, made up of Parameterisations 1 and 3, with t5 disturbances) as reported in Panel A of Table A.3, which is very close to the average of 0.698 under Parameterisation 1 and 0.681 under Parameterisation 3 in Table A.1. The values of $\hat{\beta}_1^H$ in cases 1, 2, and 3 are also equally weighted mixtures of its values under Parameterisations 1 and 2 or 1 and 3, depending on which case is discussed.

The values of $\hat{\alpha}_1^H$ and $\hat{\beta}_1^H$ can also be discussed from another perspective by comparing them to their true values. In particular, in cases 1 and 3 using DGP 1, where $\phi_1^H = 0.05$, $\phi_2^H = 0.01$, $\alpha_{11}^H = \alpha_{12}^H = 0.15$, and $\beta_{11}^H = \beta_{12}^H = 0.7$, the value of $\hat{\alpha}_1^H$ is 0.149 (in case 1) and 0.15 (in case 3), which is almost equal or equal to the true $\alpha_1^H$ ($= \alpha_{11}^H = \alpha_{12}^H = 0.15$). Moreover, the value of $\hat{\beta}_1^H$ is 0.698 (in case 1) and 0.699 (in case 3), both being almost equal to the true $\beta_1^H$ ($= \beta_{11}^H = \beta_{12}^H = 0.7$). Taken together, periodicity in the intercept of a PGARCH(1,1) process does not seem to bias the dynamics estimates of the mis-specified strong standard GARCH(1,1) model away from their true values. This result might well be due to the nature of the model mis-specification: the strong standard GARCH(1,1) filter used to estimate a PGARCH(1,1) process with a two-stage periodicity in the intercept fails to correctly specify only the intercept of the PGARCH(1,1) process, not its dynamics. It is known that the intercept decides only the level of the unconditional variance of a GARCH process, whereas the dynamics parameters characterise how the conditional variance of a GARCH process changes through time. As a result, it is not surprising to observe that the dynamics estimates of the standard GARCH(1,1) filter are virtually the same as their true values even though the filter does not capture the variation in the intercept of a PGARCH(1,1) DGP.

Turning to cases 2 and 4, where DGP 2 ($\phi_1^H = \phi_2^H = 0.05$, $\alpha_{11}^H = 0.15$, $\alpha_{12}^H = 0.05$, and $\beta_{11}^H = \beta_{12}^H = 0.7$) is used, the value of $\hat{\alpha}_1^H$ is 0.1 (in case 2) and 0.098 (in case 4), which is equal to the
