Chapter 3. Data Description and Research Methodology
In this chapter we describe the data, sentiment index, and model that are the basis for our empirical analysis of the impact of sentiment on the relation between index returns and implied volatilities.
3.1. Data and variable description
3.1.1. Sample period and index variable
We obtain daily data for the S&P 500 stock index, the NASDAQ-100 index, the VIX, and the VXN from the Chicago Board Options Exchange (CBOE). The daily data for the S&P 500 and the VIX (CBOE Volatility Index) cover the twenty-one-year period from January 1990 to January 2011, a total of 5314 trading days. The data for the NASDAQ-100 index and the VXN (CBOE NASDAQ Volatility Index) run from February 2001 to January 2011, a total of 2512 trading days. The sample period for the NASDAQ-100 and the VXN is shorter because VXN data are available only from February 2001.
The S&P 500 index has been published since 1957 and is widely regarded as the best single gauge of the large-cap U.S. equities market. The index includes 500 leading large-cap companies in leading industries of the U.S. economy. Because its constituents are large publicly held companies that trade on the largest American stock exchanges, the index covers 75% of U.S. equities. The S&P 500 therefore accounts for such a significant portion of total market value that it is a good representation of the U.S. equity market.
For the NASDAQ stock market, we use the NASDAQ-100 index, which reflects companies across major industry groups, including computer hardware and software, telecommunications, retail/wholesale trade, and biotechnology. It does not, however, contain securities of financial companies, including investment companies. Like the S&P 500, the NASDAQ-100 is a large-cap index: it includes 100 of the largest domestic and international non-financial securities listed on The NASDAQ Stock Market, based on market capitalization.
The Chicago Board Options Exchange's (CBOE) VIX index is a market implied volatility determined from the bid and ask prices of S&P 500 index options, and it has been considered the world's earliest benchmark of market volatility. The VIX is often referred to as the fear index or the fear gauge because it represents one measure of the market's expectation of stock market volatility over the next 30-day period. The original VIX (now the VXO) was constructed from the implied volatilities of OEX (S&P 100) option series so that it represented the implied volatility of a hypothetical at-the-money OEX option with exactly 30 days to expiration. In this paper, however, we use the current VIX, which is based on a new calculation method.
First, the VIX still measures the market's expectation of 30-day volatility, but it is based on S&P 500 index option prices and captures the volatility skew by incorporating information from a wider range of strike prices rather than only at-the-money series.
Second, the VIX uses a new formula to calculate expected volatility directly from the prices of a weighted strip of options, whereas the VXO extracted implied volatility from an option-pricing model. Third, unlike the VXO, which uses S&P 100 index option prices, the VIX is based on options on the S&P 500 index, which is the primary U.S. stock market benchmark, and thus provides a more precise and representative measure of market implied volatility.
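For reference, the "weighted strip of options" calculation can be written as follows for a single option maturity. This is a sketch of the formula in the CBOE's published VIX methodology; the notation is introduced here for illustration and is not used elsewhere in this chapter.

```latex
\sigma^2 \;=\; \frac{2}{T} \sum_i \frac{\Delta K_i}{K_i^2}\, e^{RT}\, Q(K_i)
\;-\; \frac{1}{T}\left[\frac{F}{K_0} - 1\right]^2
```

Here T is the time to expiration, F is the forward index level implied by option prices, K_0 is the first strike below F, K_i is the strike of the i-th out-of-the-money option, \Delta K_i is half the distance between the strikes on either side of K_i, R is the risk-free rate to expiration, and Q(K_i) is the midpoint of the bid and ask prices of the option with strike K_i. The index value is obtained by interpolating the variances of two adjacent maturities to a constant 30-day horizon, taking the square root, and multiplying by 100.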
The other implied volatility index used in this study is the VXN, which is related to the NASDAQ-100 index. The VXN was introduced in 2001 and is based on NASDAQ-100 index option prices. Like the VIX, the VXN is a key measure of market expectations of near-term volatility: it measures the market's expectation of 30-day volatility implicit in the prices of near-term NASDAQ-100 options. The formula and methodology used to derive the VXN are the same as those used to calculate the VIX.
Using index implied volatilities, here the VIX and the VXN, has several advantages. As Hibbert et al. (2008) note, implied volatility is a market-determined volatility derived from index options, so it avoids the statistical estimation error involved in calculating realized volatility. Furthermore, implied volatility is forward-looking: it reflects both the market's reaction to information about market events and investors' predictions about the future.
3.1.2. Investor sentiment index
In this paper, we examine whether investor sentiment has an impact on the return-volatility relation. The sentiment dummy variable used in this study is based on the sentiment index data provided by Baker and Wurgler, which are obtained from their website.1
Since there are no definitive or uncontroversial measures of investor sentiment, Baker and Wurgler (2006, 2007) form a composite sentiment index based on the first principal component of several proxies for investor sentiment. After considering a number of sentiment proxies suggested in previous work, they use six of them to form their index.
The six proxies are the closed-end fund discount, NYSE share turnover, the number of IPOs, the average first-day return on IPOs, the equity share in new issues, and the dividend premium. To remove idiosyncratic, non-sentiment-related components, they use principal components analysis to isolate the common component. In addition, to remove business cycle variation, they regress each of the six proxies on three business cycle proxies and use the residuals as cleaner proxies for investor sentiment. The index is then built in two stages. They first estimate the first principal component of the six proxies and their lags to obtain a first-stage index. Next, they compute the correlation between the first-stage index and each proxy's current and lagged values and, for each proxy, keep the current or lagged value, whichever is more highly correlated with the first-stage index. Finally, they estimate the first principal component of the six selected variables to form their sentiment index.
1 The investor sentiment index data are from http://people.stern.nyu.edu/jwurgler/
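The two-stage construction described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Baker and Wurgler's code: it assumes the six proxies have already been orthogonalized to the business cycle variables, it runs on simulated data, the proxy labels are hypothetical, and comparing correlations in absolute value (so that negatively loading proxies are treated symmetrically) is our assumption.

```python
import math
import random

def standardize(x):
    """Rescale a series to zero mean and unit (population) variance."""
    m = sum(x) / len(x)
    s = math.sqrt(sum((v - m) ** 2 for v in x) / len(x))
    return [(v - m) / s for v in x]

def corr(x, y):
    """Pearson correlation of two equal-length series."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

def first_pc(columns, iters=500):
    """First principal component scores of the standardized columns.

    The leading eigenvector of the correlation matrix is found by power
    iteration; its sign is arbitrary, as with any principal component.
    """
    z = [standardize(c) for c in columns]
    k, n = len(z), len(z[0])
    cmat = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(k)]
            for i in range(k)]
    w = [1.0] * k
    for _ in range(iters):
        w = [sum(cmat[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return [sum(w[i] * z[i][t] for i in range(k)) for t in range(n)]

def two_stage_index(proxies):
    """proxies: dict name -> list of monthly values (assumed orthogonalized)."""
    current = {k: v[1:] for k, v in proxies.items()}   # value in month t
    lagged = {k: v[:-1] for k, v in proxies.items()}   # value in month t-1
    # Stage 1: first principal component of all proxies and their lags.
    stage1 = first_pc(list(current.values()) + list(lagged.values()))
    # For each proxy keep the current or lagged series, whichever is more
    # highly correlated (in absolute value, our assumption) with stage 1.
    chosen, tags = [], {}
    for name in proxies:
        use_lag = abs(corr(lagged[name], stage1)) > abs(corr(current[name], stage1))
        chosen.append(lagged[name] if use_lag else current[name])
        tags[name] = "lag" if use_lag else "current"
    # Stage 2: first principal component of the six selected series.
    return standardize(first_pc(chosen)), tags

# Simulated data: a persistent latent factor plus noise in each proxy.
random.seed(0)
n = 120
latent = [0.0]
for _ in range(n - 1):
    latent.append(0.8 * latent[-1] + random.gauss(0, 1))
names = ["cefd", "turn", "nipo", "ripo", "s", "pdnd"]  # hypothetical labels
proxies = {name: [x + random.gauss(0, 0.5) for x in latent] for name in names}
index, choices = two_stage_index(proxies)
```

Because one observation is lost when the lags are formed, `index` has `n - 1` entries; `choices` records whether the current or the lagged value of each proxy entered the second stage.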
In this study, we use a dummy variable to identify high sentiment months. The dummy equals one when the Baker and Wurgler sentiment index at the beginning of the month, that is, the index value for the previous month, is positive, in which case the month is classified as a high sentiment month. If the sentiment index at the beginning of the month is negative, the dummy equals zero. For example, the January 2009 sentiment index, which serves as the beginning value for February 2009, is negative, so we classify February 2009 as a low sentiment month, and the dummy for all observations in February 2009 is zero.
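The classification rule can be illustrated with a minimal Python sketch. The function names and the index values below are made up for illustration and are not the actual Baker and Wurgler series; the treatment of an index value of exactly zero (mapped to the low sentiment group) is also an assumption, since the rule above covers only positive and negative values.

```python
from datetime import date

def high_sentiment_dummy(index_by_month):
    """Map each month to 1 if the PREVIOUS month's index is positive, else 0.

    index_by_month: dict {(year, month): sentiment index value for that month}.
    An index value of exactly zero is mapped to 0 (assumption).
    """
    dummy = {}
    for (year, month), value in index_by_month.items():
        # The index for month t is the beginning-of-month value for month t+1.
        target = (year + 1, 1) if month == 12 else (year, month + 1)
        dummy[target] = 1 if value > 0 else 0
    return dummy

def dummy_for_day(day, dummy):
    """Look up the monthly dummy for a daily observation (None if not covered)."""
    return dummy.get((day.year, day.month))

# Made-up index values for illustration only.
example_index = {(2008, 12): -0.4, (2009, 1): -0.6, (2009, 2): 0.2}
monthly_dummy = high_sentiment_dummy(example_index)
feb_value = dummy_for_day(date(2009, 2, 17), monthly_dummy)
```

With these inputs, every trading day in February 2009 inherits the dummy implied by the negative January 2009 value, which matches the example in the text.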