Chapter 5 Conclusion & Future Work

In this study, we built a k-day trend prediction model for TXF (Taiwan Stock Index Futures). The highest precision, 93.4%, was obtained with the decision tree algorithm when predicting the up or down trend over the following 100 days. When a moving window was used to simulate real-world prediction, SVM reached a precision of 79.97% when predicting the close-price trend of the next 20 days from the past 60 trading days.
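To make the two quantities in this summary concrete, the following minimal sketch shows one way a k-day trend label and the precision score could be computed. This chapter does not restate the exact labeling rule, so the rule used here (a day is "up" when the close k trading days later is strictly higher) is an assumption, and the function names and prices are hypothetical.

```python
def k_day_trend_labels(closes, k):
    """Label day t as 1 (up) if the close k trading days later is higher, else 0 (down).

    Assumed rule for illustration; the last k days have no label yet.
    """
    return [1 if closes[t + k] > closes[t] else 0 for t in range(len(closes) - k)]


def precision(y_true, y_pred, positive=1):
    """Precision = true positives / (true positives + false positives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical closing prices, not data from this study.
closes = [100, 102, 101, 105, 107, 104, 108]
labels = k_day_trend_labels(closes, k=2)
```

With k = 2, only the first five days of the toy series receive a label, because the outcome of the last two days is not yet known.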

In the 10-fold cross-validation experiment, precision increased as k increased, which indicates that the longer the prediction horizon, the better the performance. However, because this research was conducted in a local environment, the largest value of k was limited to 100 to keep the computation time manageable.

Future studies could deploy the experiments in a distributed environment, where data generation and model building can run in parallel, to find the value of k that gives the best prediction performance.

In the moving window experiment, k was set to 1, 5, 10, 15, and 20 because we were only interested in the short-term trend; among these values, k = 20 gave the highest accuracy. Because the results improve as the prediction horizon grows, future studies could predict over longer horizons. For predicting the trend within the next 20 days, we recommend SVM as the classification algorithm because it performed better in this setting.
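The moving window procedure can be sketched as follows. This is an illustration under stated assumptions rather than the exact experimental code: the placeholder classifier simply predicts the majority label, the window slides one day at a time, and the training window is cut off k days before the prediction day so that only labels already known at prediction time are used. All names are hypothetical.

```python
from collections import Counter


def majority_classifier(train_X, train_y):
    """Placeholder model: always predicts the most common training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common


def moving_window_accuracy(X, y, window, k, fit=majority_classifier):
    """Train on `window` consecutive days, predict one later day, slide by one day."""
    hits = total = 0
    # The label of day s is only known on day s + k, so a prediction made
    # on day t may train on days up to and including t - k (assumption).
    for t in range(window + k - 1, len(X)):
        end = t - k + 1          # exclusive end of the training window
        start = end - window     # inclusive start of the training window
        model = fit(X[start:end], y[start:end])
        hits += int(model(X[t]) == y[t])
        total += 1
    return hits / total if total else 0.0


# Toy data: X stands in for the daily feature vectors, y for the trend labels.
X = list(range(10))
y = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
acc = moving_window_accuracy(X, y, window=3, k=1)
```

To compare algorithms, the `fit` argument would be replaced by a function that trains an SVM or a decision tree on the window and returns a prediction function.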

We obtained high accuracy and precision in this study by studying the relationship between global indices and the trend of Taiwan Stock Index Futures. With economic globalization, the influence of global indices is widespread. If we used these features to predict other markets such as China, whose stock market is heavily controlled by the government, could we still obtain such high prediction performance? That is an interesting question that merits research. As far as we know, stock prediction is only suitable for open markets that are driven naturally by investors, but this remains to be examined in future work.

Beyond finding the relationship between the Taiwan and global stock markets, this study also compared the performance of SVM and Decision Tree under different experimental settings:

1. Decision Tree performs better with 10-fold cross-validation because the training data is larger than the sliding window, whose size is 30 or 60 in experiments 3 and 4, while SVM outperforms Decision Tree with the smaller training data of the moving window. This suggests that SVM works well with high-dimensional data and few samples.

2. Long-term prediction performed better with both algorithms. This implies that the global market influences the Taiwan stock market in the long run, and that this influence is significant.

3. Decision Tree tends to be more stable than SVM as the number of prediction days, i.e. the value of k in this study, becomes larger. This might be because Decision Tree always tries to find the best attribute value as the decision node to partition the data into subsets or classes. In addition, our experiments identified the global indices that influence TXF's close price most. They include S&P500 Stocks Above 200-Day Average, Seoul Composite, Gold Index, Crude Oil Brent NYMEX, Shenzhen Composite Index, Shenzhen B Stock Index, Platinum Index, U.S. Dollar Index, and Shanghai B Stock Index. For more information, please refer to Appendix A.
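The attribute selection mentioned in item 3 can be illustrated with a small sketch. This chapter does not state which split criterion was used, so plain information gain on a single numeric feature is assumed here for illustration; the function names and data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, information_gain) of the best binary split on one feature."""
    base = entropy(labels)
    best = (None, 0.0)
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values,
    # so both sides of every candidate split are non-empty.
    for lo, hi in zip(distinct, distinct[1:]):
        thr = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= thr]
        right = [y for v, y in zip(values, labels) if v > thr]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if gain > best[1]:
            best = (thr, gain)
    return best


# Toy feature (e.g. one index value per day) with up/down labels.
threshold, gain = best_threshold([1, 2, 3, 4], [0, 0, 1, 1])
```

A tree builder would repeat this search over every feature and place the feature with the highest gain at the decision node, which is why the indices that appear in the trees can be read as the most informative ones.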


Many studies have shown SVM to be the best algorithm for stock prediction. In this research, we also chose Decision Tree because we wanted to take advantage of its explanatory power. In the end, we showed that SVM is not always the best classification algorithm: as the experimental results show, Decision Tree sometimes performs better than SVM. It depends on how the models are built.

Decision trees come in many variants. In the future, other tree-based methods, for example random forest, are worth trying in the prediction process.

The signal generated in this study is the up or down trend over the next k days. Future studies could combine this result with regression algorithms, which can predict the price range associated with the up or down signal. Combining the trend and the price range could lead to a better investment strategy, considering that trading in the stock market incurs extra fees.
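As a rough sketch of this idea, a trend signal and a predicted price move could be combined with a fee check as shown below. The decision rule, the parameter names, and the fee rate are all hypothetical and only illustrate why the price range matters once fees are considered.

```python
def trade_decision(trend, predicted_move, price, fee_rate):
    """Return 'long', 'short', or 'hold' for one signal.

    trend: +1 for a predicted up move, -1 for a predicted down move.
    predicted_move: absolute price change expected over the next k days
        (this would come from a regression model).
    fee_rate: assumed round-trip transaction cost as a fraction of the price.
    """
    cost = price * fee_rate
    if predicted_move <= cost:
        return "hold"  # the expected move does not cover the fees
    return "long" if trend > 0 else "short"


# Hypothetical numbers: an expected 50-point move against an 18-point cost.
decision = trade_decision(trend=1, predicted_move=50, price=9000, fee_rate=0.002)
```

Under this rule, a correct trend signal is still skipped when the expected move is too small to pay for the transaction.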

Appendix A

Although we used 479 global indices, we examined which indices were actually used by the decision tree for k ∈ {5i | i ∈ {4, 5, ⋯, 20}}, that is, k = 20, 25, ⋯, 100. We found that only 253 of them were used across all 17 tests. The number of tests in which each index was used is shown in the following table.

Frequency  Number of indices  Indices (examples)
16         1                  S&P500 Stocks Above 200-Day Average
15         1                  Seoul Composite
13         2                  Gold Index, Crude Oil Brent NYMEX
11         1                  Shenzhen Composite Index
10         2                  AMEX New Lows
9          1                  5-YEAR TREASURY NOTE
8          4                  Shenzhen B Stock Index
7          6                  Platinum Index, AEX General-Netherlands
6          12                 Powershrs Db G10 Fd Iopv, IBEX35, U.S. Dollar Index, Shanghai B Stock Index
5          10                 Copper Index, US Oil Iopv
4          22                 AMEX Japan Index, Kuala Lumpur Composite, Mexico IPC, MSCI China Index, Shanghai Composite Index
3          47                 Silver Index
2          56                 AMEX Hong Kong 30 Index
1          88                 Nikkei 225, FTSE 100 Index, Shanghai A Stock Index, BSE30-India, Nasdaq Composite
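The counting behind this table can be sketched as follows. The input format (one list of index names per decision tree test) and the function names are assumptions made for illustration. The k values follow the set given above, and the counts in the second column of the table add up to the 253 indices reported.

```python
from collections import Counter

# k takes the 17 values 20, 25, ..., 100, i.e. k = 5 * i for i = 4..20.
K_VALUES = [5 * i for i in range(4, 21)]


def usage_frequency(features_per_test):
    """Count in how many tests each index appears in the decision tree."""
    freq = Counter()
    for features in features_per_test:
        freq.update(set(features))  # count an index at most once per test
    return freq


def indices_per_frequency(freq):
    """Map each frequency to the number of indices used that many times."""
    return Counter(freq.values())


# Toy input with three tests; the index names are placeholders.
toy = [["A", "B"], ["A"], ["A", "C", "C"]]
summary = indices_per_frequency(usage_frequency(toy))
```

In the toy input, index A is used in all three tests while B and C are each used in one, so the summary holds one index with frequency 3 and two with frequency 1.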

From: Predicting TXF Trends Using Support Vector Machines and Decision Trees, NCCU Academic Hub (政大學術集成), pp. 38-43
