RESEARCH METHOD

3. SENTIMENT ANALYSIS MODEL

3.3 RESEARCH METHOD

Because the length of the pre-IPO period varies across firms, this study standardizes the media coverage measure to a per-month basis so that the quantity of news can be compared across IPO firms. This study therefore also counts the number of news items prior to the IPO issue date. Table 3-2 shows the characteristics of the 5,876 news items collected during the pre-IPO period. This study divides the news into three types:

1) positive news, 2) neutral news, and 3) negative news. News considered to have a positive (negative) impact on IPO firms is classified as positive (negative) news, and news considered to have a trivial impact is classified as neutral news. Table 3-2 shows that the majority of news has a positive impact on the IPO firms: positive news accounts for 91.08% of the sample. Moreover, positive news accounts for 93.28% of the news in the month before the IPO issue date. It is not surprising that all three industries have a higher percentage of positive news than of neutral or negative news. As expected, firms in the electronics industry account for 56.11%² of the total news sample.


3.3.1 APPROACHES TO MEASURE THE SENTIMENT POLARITY IN NEWS ITEMS

This study classifies each news item into one of three categories: “positive”, “negative”, or “neutral.” The classification uses a dictionary-based approach to convert qualitative news into quantitative measures. In past studies, using a dictionary to compile sentiment words has been an obvious approach because most dictionaries (e.g., WordNet; Miller et al., 1990) list synonyms and antonyms for each word. A simple technique in this approach is therefore to bootstrap from a few seed sentiment words using the synonym and antonym structure of a dictionary. Specifically, the method works as follows. A small set of sentiment words (seeds) with known positive or negative orientations is first collected manually, which is very easy. The algorithm then grows this set by searching WordNet or another online dictionary for their synonyms and antonyms. This study is motivated by these studies. Three graduate students read the financial news from the KMW news database, and this study extracts the sentiment words most often used by the journalists. From these words, this study builds a dictionary of positive and negative sentiment words. However, this study does not extend the sentiment dictionary by using the synonyms and antonyms that dictionaries list for each word. The way of building the sentiment lexicon follows Lin and Chen (2013). In the process of manually extracting sentiment words from the news items, this study finds that positive (negative) words outnumber negative (positive) words in positive (negative) news items.

² 3297/5876 = 56.11%

Therefore, to avoid subjective judgment bias, a computer program scans the text of each news item and counts the sentiment words that appear in the sentiment dictionary.
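The counting step can be illustrated with a minimal sketch. The word lists, the sample sentence, and the function name count_sentiment_words are hypothetical and chosen only for illustration; the study's actual dictionary and program are not reproduced here, and plain substring matching is an assumption.

```python
def count_sentiment_words(text, positive_words, negative_words):
    """Count dictionary hits in one news item.

    Assumption: a "hit" is a non-overlapping, case-insensitive substring
    match; the study does not state its exact matching rule.
    """
    lowered = text.lower()
    pos = sum(lowered.count(w.lower()) for w in positive_words)
    neg = sum(lowered.count(w.lower()) for w in negative_words)
    return pos, neg


# Hypothetical toy dictionary and news text, for illustration only.
POSITIVE = ["growth", "profit"]
NEGATIVE = ["loss", "decline"]
sample = ("Revenue growth lifted profit, although the prior-year loss "
          "weighed on margins; growth is expected to continue.")

print(count_sentiment_words(sample, POSITIVE, NEGATIVE))  # (3, 1)
```

Substring matching is the simplest choice, but it would also count a dictionary word that appears inside a longer word, so a tokenizer suited to the language of the news may be preferable in practice.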

This study also uses the sentiment dictionary to classify the news items into positive, neutral, or negative news. Figure 3-2 shows the approach to news sentiment classification. When the number of positive words in a news item is greater than the number of negative words, the item is classified as positive news. When the number of positive words equals the number of negative words and the number of negative words is less than six, the item is classified as neutral news. When the number of negative words is greater than the number of positive words and the number of negative words is more than six, the item is classified as negative news. The more negative words a news item contains, the worse the signal it sends to investors. This study sets the threshold of six negative words after comparing the results with manual classification, in order to classify the news more accurately.
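The three rules above can be written as a short sketch. The function name classify_news and the constant NEGATIVE_THRESHOLD are hypothetical. The sketch follows the rules literally as worded in the text, so word-count combinations that none of the three rules covers (for example, more negative than positive words but no more than six negative words) return None rather than a guessed label.

```python
from typing import Optional

NEGATIVE_THRESHOLD = 6  # threshold of six negative words stated in the text


def classify_news(pos: int, neg: int) -> Optional[str]:
    """Classify one news item from its positive/negative word counts."""
    if pos > neg:
        return "positive"
    if pos == neg and neg < NEGATIVE_THRESHOLD:
        return "neutral"
    if neg > pos and neg > NEGATIVE_THRESHOLD:
        return "negative"
    # Not covered by the three rules as written; left unlabeled here.
    return None


print(classify_news(5, 2))  # positive
print(classify_news(3, 3))  # neutral
print(classify_news(2, 8))  # negative
print(classify_news(1, 4))  # None
```

Returning None keeps the uncovered cases visible instead of silently assigning them to a category that the text does not specify.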


FIGURE 3-2 THE APPROACH OF NEWS SENTIMENT CLASSIFICATION

3.3.2 LYDIA SENTIMENT ANALYSIS

Details of how Lydia sentiment analysis works are given in Bautin, Vijayarenu, and Skiena (2008) and Godbole, Srinivasaiah, and Skiena (2007). The Lydia sentiment data consist of time series of favorable (positive) and unfavorable (negative) words co-referenced with occurrences of each named entity (here, companies). Let p and n denote the numbers of raw positive and negative references to an entity that occurs a total of N times in the corpus (including neutral references).

The following derived measures can then be computed:

• polarity = (p − n)/(p + n)

• subjectivity = (n + p)/N

Polarity measures the net balance of positive over negative references as a share of all sentiment references, while subjectivity measures the proportion of an entity's occurrences that carry sentiment. These derived measures can provide information that the raw reference counts cannot.
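As a worked illustration of the two measures, the sketch below computes them from raw counts. The function name lydia_measures and the example counts are hypothetical, and returning None for polarity when an entity has no sentiment references is an assumption of this sketch, not something the cited Lydia papers are known to specify.

```python
def lydia_measures(p, n, N):
    """Return (polarity, subjectivity) from raw reference counts.

    p, n: positive and negative references to an entity
    N:    total occurrences of the entity, including neutral references
    """
    if N <= 0:
        raise ValueError("N must be positive")
    subjectivity = (n + p) / N
    # Assumption: polarity is undefined (None) when p + n == 0.
    polarity = (p - n) / (p + n) if (p + n) > 0 else None
    return polarity, subjectivity


# Hypothetical counts: 30 positive and 10 negative references
# among 100 total occurrences.
print(lydia_measures(30, 10, 100))  # (0.5, 0.4)
```

With these counts, polarity is (30 − 10)/(30 + 10) = 0.5 and subjectivity is (10 + 30)/100 = 0.4.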


construct another standard time window, one month before the issue day. As shown in Table 3-3, this study therefore forms different models to test the news data over the two periods. This study also tests its hypotheses under different industries.

At the industry level, the importance of the electronics sector, which has long been the main driver of the Taiwanese economy, is also a key characteristic of IPO firms. This study argues that in Taiwan the electronics industry has an overwhelming advantage in the cluster effect of media coverage: if one part of the electronics supply chain booms, the rest of the supply chain benefits from the trend. Hence, this study also examines whether the positive sentiment of media coverage is more positively related to underpricing for companies in the electronics industry than for those in non-electronics industries.

TABLE 3-3 FRAMEWORK OF THE MODEL

Sample | The Pre-IPO Period | One Month Prior to the Issue Day
Full Sample | H1-Model (1), H1-Model (3) | H1-Model (2), H1-Model (4), H2-Model (5)
Electronics Industry | H1-Model (1) | H1-Model (2), H2-Model (5)
Non-Electronics Industry | H1-Model (1) | H1-Model (2), H2-Model (5)
