3. Research Method

3.3 Text Analytic Approach on News

National Chengchi University


This study uses linear regression to test the association between VIXTWN and the sentiment variables extracted from news. Text mining techniques are used to extract useful information from financial news articles.

News content is unstructured data. In this study, we investigate the extent to which news volume, news mood and news tone affect market volatility.

Carretta et al. (2011) suggested using a sentiment dictionary to count the positive and negative words that appear in news articles, and used this dictionary approach to analyze investors’ sentiment. To extract useful information from local industrial news and global equity market news, we adopt the dictionary approach to score news tone and the other sentiment variables.

The test region of our study is the Taiwanese market. Taiwanese investors usually read news articles written in Chinese, and their sentiment is mainly influenced by Chinese-language news published by local media. Therefore, we collect news published by the Taiwanese press. We construct a Chinese sentiment dictionary to count positive and negative words and to score the tone of these news articles.

The sentiment dictionary used in this study is based on the dictionary built by the Innovative and Mobile Financial Service Technologies, Modeling and Applications project, which is supported by the Ministry of Science and Technology. The words in that dictionary are mostly adjectives describing investors’ psychological state (e.g. pessimistic, optimistic) and other sentiment-related words. Because this study considers the effect of financial warnings on VIXTWN, words related to financial warning and risk are also important. We therefore referred to the sentiment dictionaries proposed by Lin (2013) and Chang (2009), which include such words, and added those words to our sentiment dictionary. Every word in the dictionary is classified as either positive or negative; this classification is essential for the subsequent analysis. The sentiment dictionary contains thousands of words; excerpts of the positive and negative word lists are shown in Appendix 5 and Appendix 6 respectively.

After constructing the sentiment dictionary, we apply the text analytic approach to all the related news collected in the previous step. For each news article, we count the occurrences of positive words and negative words and the total number of words. According to Li et al. (2014), optimistic (pessimistic) mood is calculated by dividing the number of positive (negative) words by the total number of words in the news article. Ferguson et al. (2012) also used the percentage of positive (negative) content as a variable, and its formula is the same as that of optimistic (pessimistic) mood. The formulas for optimistic mood and pessimistic mood are as follows:

$$\text{Optimistic mood} = \frac{\text{number of positive words}}{\text{total number of words}}$$

$$\text{Pessimistic mood} = \frac{\text{number of negative words}}{\text{total number of words}}$$
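The two ratios can be illustrated with a minimal Python sketch. It assumes each article has already been segmented into a list of words (Chinese word segmentation is not shown), and the small word sets `POSITIVE` and `NEGATIVE` are hypothetical stand-ins for the full sentiment dictionary; the function name `mood_scores` is likewise only illustrative.

```python
from collections import Counter

# Hypothetical miniature dictionaries; the study's dictionary holds thousands of words.
POSITIVE = {"樂觀", "成長", "獲利"}
NEGATIVE = {"悲觀", "虧損", "風險"}


def mood_scores(tokens):
    """Return (optimistic mood, pessimistic mood) for one tokenized article."""
    total = len(tokens)
    if total == 0:
        # Assumption: an empty article contributes no mood at all.
        return 0.0, 0.0
    counts = Counter(tokens)
    # Counter returns 0 for words that never occur in the article.
    n_pos = sum(counts[word] for word in POSITIVE)
    n_neg = sum(counts[word] for word in NEGATIVE)
    return n_pos / total, n_neg / total


# Hypothetical article of 8 words: 3 positive words, 1 negative word.
article = ["市場", "樂觀", "成長", "風險", "上升", "獲利", "公司", "預期"]
optimistic, pessimistic = mood_scores(article)
```

For this hypothetical article, optimistic mood is 3/8 and pessimistic mood is 1/8.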

Apart from optimistic and pessimistic mood, we use news tone to measure investors’ sentiment. We calculate news tone using the tone formula suggested by Carretta et al. (2011). Zhang and Skiena (2011) identified media polarity as a sentiment variable, and its formula is identical to the tone formula. Tone is calculated by dividing the difference between the numbers of positive and negative words by their sum:

$$\text{Tone} = \frac{\text{number of positive words} - \text{number of negative words}}{\text{number of positive words} + \text{number of negative words}}$$

The value of tone lies between -1 and 1. News with more positive than negative words receives a positive value, and news with more negative words receives a negative value. Tone is close to 0 for news with a neutral tone, and a tone of 1 (-1) means that the news piece contains only positive (negative) sentiment words.
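As a minimal sketch of the tone calculation, the function below takes the two word counts of one article. The name `tone` and the handling of an article with no sentiment words are assumptions; the text does not say how such articles are treated.

```python
def tone(n_pos, n_neg):
    """Tone = (positive - negative) / (positive + negative)."""
    matched = n_pos + n_neg
    if matched == 0:
        # Assumption: with no sentiment words the ratio is undefined, so the
        # article is treated as neutral instead of dividing by zero.
        return 0.0
    return (n_pos - n_neg) / matched


mostly_positive = tone(3, 1)  # 3 positive words, 1 negative word
only_negative = tone(0, 4)    # negative words only
balanced = tone(2, 2)         # equal counts
```

With these hypothetical counts, the three results are 0.5, -1.0 and 0.0, which matches the stated range of -1 to 1.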

We measure industry-specific and market-specific sentiment variables. We believe that news about different industries affects investors’ sentiment to different extents, because the market size of each industry differs. We therefore calculate the overall sentiment variables as weighted averages, multiplying each industrial sentiment variable by the industry’s relative weighting. The weighting is the same as that used by the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX), which is calculated by dividing the market value of the industry by the total market value of all listed companies. We believe that the impact of industry-related news is correlated with the industry’s relative market value. As the market value of companies changes over time, we use the TAIEX weighting obtained on 15 May 2018. The detailed weighting of each industry is shown in Appendix 7. As the influence of movements in different equity markets on Taiwanese investors’ sentiment is hard to quantify, the equity market variables are calculated by averaging the variables of the different equity markets. The sentiment variables used in our analysis are the overall industrial sentiment and the overall equity market sentiment.
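The aggregation step can be sketched as follows. The industry names, weights, market names and sentiment values are all hypothetical and are not taken from Appendix 7. Dividing by the sum of the weights is an added assumption; it leaves the result unchanged when the weights already sum to 1.

```python
def weighted_sentiment(industry_values, industry_weights):
    """Market-value-weighted average of one industry-level sentiment variable."""
    total_weight = sum(industry_weights[name] for name in industry_values)
    weighted_sum = sum(
        value * industry_weights[name] for name, value in industry_values.items()
    )
    # Assumption: renormalize in case the supplied weights do not sum to 1.
    return weighted_sum / total_weight


def equity_market_sentiment(market_values):
    """Equal-weight average of one sentiment variable across equity markets."""
    return sum(market_values.values()) / len(market_values)


# Hypothetical weights (market value of industry / market value of all listed companies).
weights = {"Semiconductor": 0.5, "Finance": 0.3, "Other": 0.2}
# Hypothetical tone per industry on one day.
industry_tone = {"Semiconductor": 0.4, "Finance": -0.2, "Other": 0.1}
overall_industrial_tone = weighted_sentiment(industry_tone, weights)

# Hypothetical tone per equity market on the same day.
market_tone = {"Market A": 0.3, "Market B": -0.1, "Market C": 0.1}
overall_market_tone = equity_market_sentiment(market_tone)
```

Here the overall industrial tone is 0.5 × 0.4 + 0.3 × (-0.2) + 0.2 × 0.1 = 0.16, while the equity market tone is the plain mean of the three market values, 0.1.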

Sections 3.2 and 3.3 describe many different types of data. The data used in this study are summarized in Table 3-3, which lists each data source and its features.


Table 3-3 Summary of Data Sources

| Data | Period | Sources | Features | Descriptions |
| --- | --- | --- | --- | --- |
| List of Major Industries | N/A | Standard Industrial Classification of R.O.C. | Name of major industries in Taiwan. | It classified all listed companies into 28 categories. |
| Financial News | 2007 – 2017 | CMoney | Issue date, news title and body. | It consists of 1,170,498 pieces of financial news. |
| Sentiment Dictionary | N/A | Seng (2017), Lin (2013), Chang (2009) | Includes sentiment, financial warning and risk related words. | The study integrated the dictionaries created in these studies. |
| Macroeconomic Data | 2007 – 2017 | TEJ | VIXTWN, TAIEX, Unemployment Rate, Inflation Rate, Consumer Confidence Index and Index of Industrial Production. | These are variables that affect the financial market. |
| Industry Weighting | 15 May 2018 | TWSE | The weighting that TWSE uses to calculate TAIEX. | It is the market value of industry over the market value of all listed companies. |

This study applies a multiple regression model to test the hypotheses proposed above and to investigate the interaction between market volatility and the other variables. The variables used in this study are listed and described below. We use VIXTWN to estimate investors’ sentiment and to test all the hypotheses proposed previously.

Baker (2012) suggested that both global and local sentiment were contrarian predictors of returns within markets, so both the global and the local component are important when investigating investors’ sentiment. Therefore, we establish four equations to test the effects of fluctuations in global and local investors’ sentiment. We use the content of local industrial news and of news on major equity markets to calculate the investors’ sentiment variables. Equation (1) tests hypotheses 1a, 1b and 1c: it examines the link between market volatility and the news volume, optimistic mood and pessimistic mood of local industrial news. Equation (2) tests hypotheses 2a, 2b and 2c: it considers the news volume, optimistic mood and pessimistic mood of global stock market news together with market volatility. Equation (3) tests hypotheses 3a, 3b and 3c: it examines the association between market volatility and the news volume and news tone of local industrial news.

Equation (4) tests hypotheses 4a, 4b and 4c: it analyzes the relationship between market volatility and the news volume and news tone of global stock market news. We also test the delayed impact of news on the financial market by using the sentiment variables of the previous day and of two days before.
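The delayed-impact test requires aligning each day's VIXTWN with sentiment values from earlier days. The sketch below shows one way to build such lagged rows. It assumes both series are ordered by consecutive trading day and have equal length; the helper names `lag` and `lagged_rows` and all numbers are hypothetical. Whether the lags enter one regression together or separate regressions is not specified here, so the rows simply expose all of them.

```python
def lag(series, k):
    """Shift a daily series by k days; the first k days have no earlier value."""
    if k == 0:
        return list(series)
    return [None] * k + list(series[:-k])


def lagged_rows(vix, sentiment, max_lag=2):
    """Pair VIXTWN on day t with sentiment on days t, t-1, ..., t-max_lag."""
    lags = [lag(sentiment, k) for k in range(max_lag + 1)]
    rows = []
    # Start at max_lag so that every lagged value exists.
    for t in range(max_lag, len(vix)):
        rows.append((vix[t], *[lags[k][t] for k in range(max_lag + 1)]))
    return rows


# Hypothetical daily values for four consecutive trading days.
vixtwn = [15.0, 16.5, 14.2, 17.1]
tone_series = [0.1, -0.3, 0.2, -0.1]
rows = lagged_rows(vixtwn, tone_series)
```

Each row holds VIXTWN on day t followed by tone on days t, t-1 and t-2. The first two days are dropped because they lack a full set of lagged values.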

The models of the study are shown below:

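$$\mathit{VIXTWN} = \beta_0 + \beta_1 \mathit{INQW}_t + \beta_2 \mathit{IP}_t + \beta_3 \mathit{IN}_t + \beta_4 \mathit{TAIEX} + \beta_5 \mathit{CCI} + \beta_6 \mathit{UE}$$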
