
An interesting feature of asset prices is that "bad" news seems to have a more pronounced effect on volatility than "good" news. For many stocks, there is a strong negative correlation between the current return and future volatility. The tendency for volatility to decline when returns rise and to rise when returns fall is often called the (asymmetric) leverage effect. Empirical research has brought forth a considerable number of stylized facts about high-frequency financial time series; Bollerslev, Engle and Nelson (1994) give a complete account of these facts. The purpose of this section is to describe some of these characteristic features and the model we propose. Many studies show that returns on financial assets display erratic behavior: large outlying observations occur with rather high frequency, large negative returns occur more often than large positive ones, these large returns tend to occur in clusters, and periods of high volatility are often preceded by large negative returns. Because of these stylized facts, it seems necessary to consider nonlinear models in order to describe the observed patterns in such financial time series adequately. It should also be remarked that the maintained hypothesis for high-frequency financial time series is that logarithmic prices of financial assets display random-walk-type behavior (Campbell, Lo and MacKinlay, 1997).

Nowadays, models from the GARCH class are the most popular volatility models among practitioners. GARCH models enjoy such popularity because they are capable of describing not only volatility clustering but also certain other characteristics of financial time series, such as their pronounced excess kurtosis, or fat-tailedness. The standard GARCH model still cannot capture other empirically relevant properties of volatility. Black (1976) observed that negative shocks, or bad news, tend to affect volatility quite differently than positive shocks of equal size. In the standard GARCH model, however, the effect of a shock on volatility depends only on its size; the sign of the shock is irrelevant. Another limitation of the standard GARCH model is that it does not imply that expected returns and volatility are directly related, as is the case in the CAPM, which postulates a direct relationship between the required return on an asset and its risk.

Among several choices of volatility models, we consider the Threshold GARCH (TGARCH) model, introduced by Zakoian (1994) and Glosten et al. (1993), as our tool for analyzing the time series data here because of its ability to capture the effects of fat-tailed distributions, the so-called stylized facts, and asymmetric shocks: Kroner and Ng (1998) and Bekaert and Wu (2000) document that volatility tends to be higher after a negative return shock than after a positive shock of the same magnitude. The TGARCH(1, 1) model assumes the form

$$y_t = x_t B + \varepsilon_t \qquad (1)$$

$$\varepsilon_t = z_t \sigma_t \qquad (2)$$

$$\sigma_t^2 = \omega + \beta \sigma_{t-1}^2 + \alpha \varepsilon_{t-1}^2 + \gamma \varepsilon_{t-1}^2 d_{t-1} \qquad (3)$$

Equation (1) is the mean equation from the regression with coefficient vector B (we use the daily log return × 100 as our return metric in the model), and equation (3) is the variance equation, where $\{z_t\}$ is a sequence of independent and identically distributed random variables with zero mean and unit variance, and $d_{t-1}$ is a dummy variable with $d_{t-1} = 1$ if $\varepsilon_{t-1}$ is negative and $d_{t-1} = 0$ otherwise. This allows good and bad news to have different effects on volatility. In a sense, $\varepsilon_{t-1} = 0$ is a threshold such that shocks above the threshold have different effects than shocks below it. The intuition behind the TGARCH model is that positive values of $\varepsilon_{t-1}$ are associated with a zero value of $d_{t-1}$. Therefore, if $\varepsilon_{t-1} \geq 0$, which implies good news, the effect of an $\varepsilon_{t-1}$ shock on $\sigma_t^2$ is $\alpha$. When $\varepsilon_{t-1} < 0$, which implies bad news and $d_{t-1} = 1$, the effect of an $\varepsilon_{t-1}$ shock on $\sigma_t^2$ is $(\alpha + \gamma)$. If $\gamma > 0$, negative shocks have larger effects on volatility than positive shocks. The persistence of shocks to volatility is given by $(\alpha + \beta + \gamma/2)$.

In addition, if the coefficient $\gamma$ is statistically different from zero, one can conclude that the data contain a threshold effect.
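To make the asymmetry in equation (3) concrete, the following minimal sketch (with illustrative, not estimated, parameter values) steps the variance recursion forward for a positive and a negative shock of equal size:

```python
# Illustrative TGARCH(1,1) parameters (hypothetical values, not our estimates)
omega, alpha, beta, gamma = 0.05, 0.06, 0.88, 0.08

def next_variance(sigma2_prev, eps_prev):
    """One step of equation (3):
    sigma_t^2 = omega + beta*sigma_{t-1}^2 + (alpha + gamma*1[eps<0])*eps_{t-1}^2."""
    d = 1.0 if eps_prev < 0 else 0.0
    return omega + beta * sigma2_prev + (alpha + gamma * d) * eps_prev ** 2

sigma2 = 1.0                        # current conditional variance
print(next_variance(sigma2, +2.0))  # good news: impact coefficient is alpha       -> 1.17
print(next_variance(sigma2, -2.0))  # bad news: impact coefficient is alpha+gamma  -> 1.49
print("persistence:", alpha + beta + gamma / 2)  # (alpha + beta + gamma/2) = 0.98
```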

Also, Nelson (1991) introduced a GARCH model that allows for asymmetric effects of positive and negative stock returns, called the exponential GARCH (EGARCH) model. No restrictions on the parameters need to be imposed to ensure that $\sigma_t^2$ is non-negative, and the model assumes the leverage effect is exponential rather than quadratic.
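For reference, one common parameterization of the EGARCH(1,1) variance equation (our addition; the original does not spell it out) is

$$\ln \sigma_t^2 = \omega + \beta \ln \sigma_{t-1}^2 + \alpha\left(|z_{t-1}| - \mathrm{E}|z_{t-1}|\right) + \gamma z_{t-1},$$

where the logarithmic form keeps $\sigma_t^2$ positive for any parameter values, and a negative $\gamma$ captures the leverage effect.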

Moreover, several academic studies point out that the Generalized Error Distribution (GED) better describes the fat-tailed returns of stocks; thus, in our analysis we assume the innovations follow a fat-tailed distribution given by the GED:

$$\varepsilon_t \mid \Omega_{t-1} \sim \mathrm{GED}(0, \sigma_t^2, \nu) \qquad (4)$$

so that the standardized innovation $z_t = \varepsilon_t/\sigma_t$ has the probability density function

$$f(z) = \frac{\nu \exp\left(-\tfrac{1}{2}\left|z/\lambda\right|^{\nu}\right)}{\lambda\, 2^{1+1/\nu}\, \Gamma(1/\nu)}, \qquad 0 < \nu \leq \infty,\ -\infty < z < +\infty \qquad (5)$$

where $\nu$ is the tail-thickness parameter, $\Gamma(\cdot)$ is the gamma function, and

$$\lambda = \left[2^{-2/\nu}\, \Gamma(1/\nu) / \Gamma(3/\nu)\right]^{1/2}. \qquad (6)$$

When $\nu < 2$, $\{z_t\}$ is fat-tailed; when $\nu = 2$, $\{z_t\}$ is normally distributed; and when $\nu > 2$, $\{z_t\}$ is thin-tailed. For a detailed example, see Tsay (2005, p. 108).
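As a sanity check on equations (5) and (6), the following sketch (function names are ours) implements the GED density with scipy and verifies numerically that it integrates to one and has unit variance for a fat-tailed choice of $\nu$:

```python
import numpy as np
from scipy.special import gamma as G
from scipy.integrate import quad

def ged_lambda(nu):
    """Equation (6): lambda = [2^(-2/nu) * Gamma(1/nu) / Gamma(3/nu)]^0.5."""
    return np.sqrt(2.0 ** (-2.0 / nu) * G(1.0 / nu) / G(3.0 / nu))

def ged_pdf(z, nu):
    """Equation (5): GED density with zero mean, unit variance, tail parameter nu."""
    lam = ged_lambda(nu)
    return (nu * np.exp(-0.5 * np.abs(z / lam) ** nu)
            / (lam * 2.0 ** (1.0 + 1.0 / nu) * G(1.0 / nu)))

nu = 1.3  # nu < 2: fatter tails than the normal distribution
total, _ = quad(lambda z: ged_pdf(z, nu), -np.inf, np.inf)
var, _ = quad(lambda z: z * z * ged_pdf(z, nu), -np.inf, np.inf)
print(total, var)  # both should be close to 1
```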

To capture the excess kurtosis displayed by financial time series, we fit the TGARCH(1, 1) model parameters by maximum-likelihood estimation (MLE), as in Peters (2001), assuming a conditional GED distribution for the stock-return innovations. Given the density (5) and $z_t = \varepsilon_t/\sigma_t$, the log-likelihood is

$$\ell(\theta) = \sum_{t=1}^{T}\left[\ln\!\left(\frac{\nu}{\lambda}\right) - \frac{1}{2}\left|\frac{\varepsilon_t}{\lambda \sigma_t}\right|^{\nu} - \left(1+\frac{1}{\nu}\right)\ln 2 - \ln\Gamma\!\left(\frac{1}{\nu}\right) - \frac{1}{2}\ln\sigma_t^2\right]. \qquad (7)$$

This log-likelihood function is maximized with respect to the unknown parameters to yield the best estimates of the parameters.
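A minimal sketch of this estimation step using the third-party Python package arch, whose GJR-GARCH specification (o=1) matches equation (3) and which supports GED innovations; the price series below is synthetic placeholder data:

```python
import numpy as np
import pandas as pd
from arch import arch_model

# Synthetic placeholder price series; in practice, daily closing prices
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(0.01 * rng.standard_normal(1000))))

returns = 100 * np.log(prices).diff().dropna()  # daily log return x 100, as in the text

# vol='GARCH' with o=1 adds the gamma*eps^2*1[eps<0] asymmetry term of equation (3);
# dist='ged' selects Generalized Error Distribution innovations
model = arch_model(returns, mean='Constant', vol='GARCH', p=1, o=1, q=1, dist='ged')
result = model.fit(disp='off')  # numerical maximization of the log-likelihood, cf. (7)
print(result.summary())

# Estimated omega, alpha[1], gamma[1], beta[1], and tail parameter nu
print(result.params)
```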

2.2 Clustering Models and Cluster Analysis

We apply the concept of a distance measure from cluster analysis to financial time series with similar volatility dynamics. An important step in most clustering is the selection of a distance measure, which determines how the similarity of two elements is calculated and thus influences the shape of the clusters, since some elements may be close to one another under one distance and farther away under another. For example, in a 2-dimensional space, the distance between the point (x = 1, y = 0) and the origin (x = 0, y = 0) is always 1 according to the usual norms, but the distance between the point (x = 1, y = 1) and the origin is 2, √2, or 1 under the 1-norm, 2-norm, and infinity-norm respectively. Accordingly, it is very important to specify which distance measure we use. We use a Mahalanobis-like distance, the Euclidean distance, and a mixture of the two as our distance metrics. A Mahalanobis-like distance function, sometimes called a "quadratic distance", can be defined as:

$$d_{TGARCH}(x, y) = \left[(T_x - T_y)'\, \Omega^{-1}\, (T_x - T_y)\right]^{1/2} \qquad (8)$$

where $T_x = (\alpha_x\ \beta_x\ \gamma_x\ \nu_x)'$ and $T_y = (\alpha_y\ \beta_y\ \gamma_y\ \nu_y)'$ are the vectors of estimated ARCH, GARCH, leverage-effect, and tail-thickness parameters, and $\Omega = V_x + V_y$ is a weighting matrix formed from the parameter covariance matrices $V_x$ and $V_y$ of the two stocks. This metric takes into account the correlation between the estimates and the information about the stochastic dynamic structure of the time-series volatilities. It is also very useful for time series of unequal length.
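Equation (8) translates directly into numpy; in the sketch below, the parameter vectors and covariance matrices are illustrative placeholders rather than estimates from our sample:

```python
import numpy as np

def mahalanobis_like(Tx, Ty, Vx, Vy):
    """Equation (8): quadratic distance between two estimated TGARCH parameter
    vectors, weighted by Omega = Vx + Vy."""
    diff = Tx - Ty
    omega_inv = np.linalg.inv(Vx + Vy)
    return float(np.sqrt(diff @ omega_inv @ diff))

# Illustrative (alpha, beta, gamma, nu) estimates for two stocks
Tx = np.array([0.06, 0.88, 0.08, 1.3])
Ty = np.array([0.04, 0.92, 0.05, 1.6])
Vx = np.diag([1e-4, 4e-4, 2e-4, 1e-2])  # placeholder parameter covariance matrices
Vy = np.diag([2e-4, 3e-4, 1e-4, 2e-2])
print(mahalanobis_like(Tx, Ty, Vx, Vy))
```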

Also, we use the Euclidean distance as another metric for comparison. Euclidean distance is the "ordinary" distance between two points that one would measure with a ruler, and is given by the Pythagorean formula. By using this formula as distance, Euclidean space (or even any inner product space) becomes a metric space. The associated norm is called the “Euclidean norm”. It has the following equation:

$$d(x, y) = \left[\sum_{i=1}^{4} (T_{x,i} - T_{y,i})^2\right]^{1/2} \qquad (9.1)$$

or, using matrix notation,

$$d(x, y) = \left[(T_x - T_y)'(T_x - T_y)\right]^{1/2} \qquad (9.2)$$

The third metric combines the Mahalanobis-like and Euclidean distances, weighting each by the inverse of the sample standard deviation of the corresponding pairwise distances. This gives a smaller weight to the component whose estimates carry higher uncertainty and a larger weight to the less uncertain one; we therefore expect this metric to be more powerful than the previous two. It takes the form

$$d_{Combined}(x, y) = w_1 \left[(T_x - T_y)'\,\Omega^{-1}\,(T_x - T_y)\right]^{1/2} + w_2 \left[(T_x - T_y)'(T_x - T_y)\right]^{1/2} \qquad (10)$$

where $w_i$, $i = 1, 2$, are the weighting parameters. We expect this to be an improved version of the distance metrics considered above.
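One way to realize equation (10) in code, reusing mahalanobis_like from the sketch above: compute both component distances for every pair of stocks, then set each weight to the inverse of the sample standard deviation of the corresponding pairwise distances, as described. This is a sketch under our reading of the weighting scheme:

```python
import numpy as np
from itertools import combinations

def euclidean(Tx, Ty):
    """Equation (9.2)."""
    return float(np.sqrt((Tx - Ty) @ (Tx - Ty)))

def combined_distances(T, V):
    """Equation (10) for a whole sample: T is a list of parameter vectors,
    V the matching list of parameter covariance matrices. Returns a dict
    mapping each pair (i, j) to d_Combined(i, j)."""
    pairs = list(combinations(range(len(T)), 2))
    d_mah = np.array([mahalanobis_like(T[i], T[j], V[i], V[j]) for i, j in pairs])
    d_euc = np.array([euclidean(T[i], T[j]) for i, j in pairs])
    # Weights: inverse sample standard deviation of each set of pairwise
    # distances, so the noisier component metric receives the smaller weight
    w1, w2 = 1.0 / d_mah.std(ddof=1), 1.0 / d_euc.std(ddof=1)
    return {p: w1 * m + w2 * e for p, m, e in zip(pairs, d_mah, d_euc)}
```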

Cluster analysis of time series attempts to identify clusters of data points in a multivariate data set. It can also be regarded as data segmentation, since it groups or segments a collection of objects into subsets, or clusters. We use the most commonly applied approach, hierarchical clustering. In hierarchical clustering the data are not partitioned into particular clusters in a single step. Instead, a series of partitions takes place, which may run from a single cluster containing all objects to n clusters each containing a single object. Hierarchical clustering is subdivided into agglomerative methods, in which one starts at the leaves and successively merges clusters together, and divisive methods, in which one starts at the root and recursively splits the clusters. Agglomerative techniques are more commonly used, and in this study cluster analysis refers to this method.

Hierarchical clustering may be represented by a two-dimensional diagram known as a dendrogram, which illustrates the fusions or divisions made at each successive stage of the analysis. The dendrogram lists all of the samples and indicates the level of similarity at which any two clusters were joined. One axis measures the similarity or distance at which clusters join, and different programs use different measures on this axis. Clusters may join pairwise, or individual samples may be sequentially added to an existing cluster. Such sequential joining of individual samples is known as 'chaining'.
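A sketch of the agglomerative step and its dendrogram using scipy, assuming a precomputed matrix of pairwise stock distances (the values and labels below are placeholders):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

# dist: symmetric matrix of pairwise stock distances from one of the metrics above
dist = np.array([[0.0, 0.3, 1.2, 1.1],
                 [0.3, 0.0, 1.0, 0.9],
                 [1.2, 1.0, 0.0, 0.2],
                 [1.1, 0.9, 0.2, 0.0]])

# Agglomerative clustering: condense the matrix, then merge clusters bottom-up
Z = linkage(squareform(dist), method='average')

# Dendrogram: the distance axis shows the level at which clusters are joined
dendrogram(Z, labels=['stock A', 'stock B', 'stock C', 'stock D'])
plt.show()
```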

2.3 Multidimensional Scaling

Multidimensional scaling (MDS) can be considered an alternative to factor analysis. In general, the goal of the analysis is to detect meaningful underlying dimensions that allow the researcher to explain observed similarities or dissimilarities (distances) between the investigated objects. In factor analysis, the similarities between objects (e.g., variables) are expressed in the correlation matrix. With multidimensional scaling, one can analyze any kind of similarity or dissimilarity matrix in addition to correlation matrices. In general, MDS attempts to arrange objects (here, our TGARCH(1,1) model parameters) in a space with a particular number of dimensions, say two, so as to reproduce the observed distances.

As a result, we can explain the distances in terms of underlying dimensions; in our data-structure analysis, we explain the distances in terms of a two-dimensional map. This plot also helps to identify the clusters.

We begin the MDS by first applying principal component analysis (PCA) for dimension reduction. This is an important step in generating dimensions for the multidimensional scaling map. In PCA, all of the observed variance is analyzed, whereas only the shared variance is analyzed in factor analysis. We use Matlab to compute the eigenvectors and eigenvalues used in generating the data dimensions. We then plot the two-dimensional graph, or multidimensional scaling map, of the data to see how far the stocks are from each other when we translate the stock-return volatilities into distances.
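The eigenvalue computation described here corresponds to classical (Torgerson) multidimensional scaling, which Matlab implements as cmdscale; a compact numpy sketch, reusing the placeholder distance matrix from the clustering sketch, is:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical MDS: double-center the squared distance matrix, then use the
    top-k eigenvectors (scaled by sqrt of eigenvalues) as map coordinates."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]   # indices of the top-k eigenvalues
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

coords = classical_mds(dist)  # dist: the pairwise distance matrix from above
print(coords)  # 2-D multidimensional scaling map coordinates, one row per stock
```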
