Abstract
In this study, a new dynamic fuzzy model is proposed in combination with a support vector machine (SVM) to forecast stock market dynamism. In this integrated model, the fuzzy model integrates various influence factors as input variables, and a genetic algorithm (GA) dynamically adjusts the influential degree of each input variable.
SVM then predicts stock market dynamism in the next phase. Meanwhile, a multiperiod experiment method is designed to simulate the volatility of the stock market.
To evaluate the performance of the integrated model, we compare it with traditional forecast methods in several experiments. The experimental results show that the proposed model achieves better forecast accuracy than the other forecast models.
Keywords: Fuzzy theory, Genetic algorithm, Support vector machine
Acknowledgement
That He would grant you, according to the riches of His glory, to be strengthened with power through His Spirit into the inner man, That Christ may make His home in your hearts through faith, that you, being rooted and grounded in love, May be full of strength to apprehend with all the saints what the breadth and length and height and depth are And to know the knowledge surpassing love of Christ, that you may be filled unto all the fullness of God.
Ephesians 3:16-19
Father, I know that all my life is portioned out for me, And the changes that are sure to come, I do not fear to see;
But I ask Thee for a present mind, intent on pleasing Thee.
There are briers besetting every path that call for patient care;
There is a cross in every lot, And an earnest need for prayer;
But a lowly heart that leans on Thee is happy anywhere.
In a service which Thy will appoints, there are no bonds for me;
For my inmost heart is taught “the truth” that makes Thy children “free”. And a life of self renouncing love is a life of liberty.
Ping-Jie Chen Intelligent Management System Lab.
Chuang Hua University Hsinchu, Taiwan, R.O.C.
June, 2006
Content
Abstract
Acknowledgement
Content
List of Figures
List of Tables
Chapter One Introduction
Chapter Two Literature Review
2.1 Stock Market Modeling
2.2 Fuzzy Theory
2.3 Genetic Algorithm
2.4 Support Vector Machine
Chapter Three Methodology
3.1 Architecture
3.2 Dynamic Fuzzy Model for Influence Factors
3.3 Model Optimization with Genetic Algorithm
3.4 Prediction of Stock Market Dynamism with SVM
Chapter Four Experiment
4.1 Data Preparation
4.2 Experimental Design
4.3 Experimental Data
4.4 Technical Indicators
4.5 Experimental Result
Chapter Five Conclusion
References
Appendix A Introduction to Fuzzy Theory
Appendix B Introduction to Genetic Algorithm
List of Figures
Figure 1. The architecture of integrated forecasting model
Figure 2. Initial membership function for the influential degree of factor GDP
Figure 3. First-order Difference Daily Prices of Taiwan Stock Market (from 20 January 2004 to 31 December 2005)
Figure 4. Strategy for multiperiod stock market movement forecast
Figure 5. Strategy for two-period stock market movement forecast
Figure 6. Adjusted membership function for influence factor GDP
Figure 7. Comparison between proposed model and SVM with fuzzy model under multiperiod method
Figure 8. Comparison among each model under two-period method and integrated various influence factors
Figure 9. Comparison between each model under two-period method and single kind of influence factors
List of Tables
Table 1. Past researches in stock market forecast
Table 2. Original data of stock market
Table 3. Original data of future market
Table 4. Part of technical indicators used in this study
Table 5. Part of integrated input variables used in this study
Table 6. Part of forecast result
Table 7. Forecasting performance of different input variables with various forecast models
Chapter One Introduction
The stock market is a complicated and volatile system with many possible underlying models and influential factors. As a result, past studies often treated stock market dynamism as random movement. Research in recent years, however, indicates that it is not entirely random but highly complicated and volatile [1]. Many factors, including macroeconomic variables and stock market technical indicators, have been shown to have some forecast capability for the stock market during certain periods of time [2]. For example, technical indicators that reflect market volume or confidence have forecast value for future transaction prices [3]. In the past decade, various methods, such as linear and nonlinear mathematical models and multi-agent mechanisms [4][5], have been widely applied to simulate the underlying transaction mechanism of the stock market. Because artificial neural networks (ANN) can approximate arbitrary functions, require no statistical assumptions, and place no restrictions on input variables, they are widely applied to simulate the underlying market transaction mechanism [6][7]. Support vector machine (SVM) is a more recently developed mathematical model that performs outstandingly on problems with high-dimensional input spaces. This feature allows SVM to simulate the underlying market transaction mechanism better than other methods. To improve forecast performance, some machine learning methods are combined with the simulated transaction mechanism. For example, the genetic algorithm (GA), often applied to select model input variables, is used to reduce the input feature dimension and to select better model parameters [8], thereby increasing forecast accuracy.
A growing number of studies adopt SVM to predict stock market dynamism and report significant performance. Huang [9] adopts SVM to predict stock market dynamism with 9 macroeconomic factors. Yu [10] uses SVM and GA to predict stock market dynamism by dynamically selecting among 18 stock market technical indicators.
However, few studies integrate the input variables into a high-dimensional input space to fully exploit the advantages of SVM. Meanwhile, most experiments simply adopt the two-period method, which divides the data into an in-sample part to obtain the model parameters and an out-of-sample part to test model accuracy.
Although the numerous related studies have brought remarkable achievements, none of the models can continuously and successfully predict the dynamism of the stock market [11]. The main reasons are high volatility, the lack of coordination among various input variables, and dynamic input variable selection. The volatility of the stock market makes the factors that influence it change with time. Even after successfully forecasting stock market dynamism during a certain period, an optimized forecast model cannot guarantee the same forecast performance later. In selecting input variables, too few variables give the model insufficient information to capture the market mechanism and reduce forecast accuracy. Too many input variables, however, bring in too much noise and cause overfitting.
Moreover, because influence factors have different features, further research on integrating various input variables is required to study the relationship between each factor and stock market dynamism.
To solve the above problems, this study uses a new dynamic fuzzy model integrated with SVM to predict stock market dynamism. The fuzzy model integrates different input variables, while GA and the multiperiod experiment handle stock market volatility. SVM handles the high-dimensional input space without causing overfitting when predicting stock market dynamism [12]. The fuzzy model, which is adjustable with time, first accounts for influence factors with different features, such as macroeconomic factors and stock and futures technical indicators. GA locates the optimal fuzzy model parameters for each influence factor as they change with time. To simulate stock market volatility, the multiperiod experiment divides the data into many sections and trains the forecast model on earlier sections to predict later ones. With the newly found influential degrees of the input variables, SVM predicts stock market movement in the next phase.
The rest of this study is organized as follows: Chapter 2 reviews past studies; Chapter 3 introduces each part of the integrated forecast model in detail; Chapter 4 tests the forecast performance of the integrated architecture on the stock market in Taiwan. The last chapter offers the conclusions and contributions of the study.
Chapter Two Literature Review
This chapter reviews past studies on stock market forecasting.
2.1 Stock Market Modeling
The financial market is a complex, evolutionary, and non-linear dynamical system. The field of financial forecasting is characterized by data intensity, noise, non-stationarity, unstructured nature, a high degree of uncertainty, and hidden relationships. Many factors interact in finance, including political events, general economic conditions, and traders' expectations. Therefore, predicting financial market price movements is quite difficult.
Increasingly, according to academic investigations, movements in market prices are not random. Rather, they behave in a highly non-linear, dynamic manner. The standard random walk assumption of futures prices may merely be a veil of randomness that shrouds a noisy non-linear process.
Conventional time series are usually identified adopting linear or nonlinear global models, whose underlying process is considered stationary. Unfortunately, financial time series can hardly be construed as stationary. Although some form of non-stationarity has to be taken into account, there is nothing to prevent a single global model from being used to capture the whole process dynamics. However, the behavior of financial time series can be better investigated assuming that the underlying process exhibits the so-called piecewise stationarity, or multi-stationarity. In this case one assumes that local stationarities (i.e. different regimes) hold and a rapid regime shift typically occurs as the result of some exogenous event. In the time series community, several methods have been proposed for identifying regimes in financial time series.
Relevant approaches devised to address this issue include the following. Tong and Lim introduced the Threshold AutoRegressive (TAR) model, a piecewise linear process whose central idea is to change the parameters of a linear AR model according to the value of an observable variable, called the threshold variable. Thus, different regimes are identified by several sets of parameters whose activation depends on the current value of the threshold variable. Moreover, if this variable is a lagged value of the time series, the model is called a Self-Exciting TAR (SETAR). Friedman devised the Multivariate Adaptive Regression Splines (MARS) model, which is able to represent process dynamics in the form of product spline basis functions. Regime adaptation is automatically performed by suitably updating the number of splines and their parameters (product degree and knot locations), according to the current data. Lewis used MARS models to predict financial data, as they are able to identify the underlying dynamics better than classical AR models.
Artificial Neural Networks (ANN) have been widely used for stock market prediction. ANN can identify highly nonlinear models, have effective learning algorithms, can handle noisy data, and can use inputs of different kinds.
Although the numerous related studies have brought remarkable achievements, none of the models can continuously and successfully predict the dynamism of the stock market [11]. The main reasons are high volatility, the lack of coordination among various input variables, and dynamic input variable selection. Table 1 summarizes past research on stock market forecasting.
Table 1. Past researches in stock market movement forecast

| Researcher | Algorithm | Year | Goal | Conclusion |
| --- | --- | --- | --- | --- |
| Lo et al. | Nonparametric regression | 2000 | Evaluate the relationship between technical indicators and stock market | Technical indicators can predict the stock market movement. Stock market is not efficient. |
| Black et al. | Linear regression, nonparametric regression | 2004 | Evaluate the relationship between various stocks and economic activity | Financial and macroeconomic variables can predict stock market return. |
| Kou et al. | Fuzzy Delphi, ANN and GA | 1998 | Apply machine learning algorithm to evaluate the relationship between both quantitative and qualitative factors and stock market movement | Machine learning algorithm can use quantitative and qualitative factors to predict stock market movement |
| Armano et al. | XCS architecture, ANN | 2002 | Apply agent techniques to evaluate the relationship between technical indicators and stock market | Agent techniques can use technical indicators to predict stock market movement |
| Huang et al. | SVM, GA | 2004 | Apply SVM to forecast stock market movement | SVM is suitable for forecasting stock market movement |
| Yu et al. | SVM, GA | 2005 | Apply SVM to evaluate the relationship between both technical indicators and macroeconomic variables and market movement | Both technical and macroeconomic variables can forecast stock market movement |
2.2 Fuzzy Theory
The theory of fuzzy sets was first introduced by Lotfi Zadeh (1965), primarily in the context of his interest in the analysis of complex systems. However, some of the key ideas of the theory were envisioned by Max Black, a philosopher, almost 30 years prior to Zadeh's seminal paper (Black, 1937). Basically, the concept of the fuzzy set is a generalization of the classical or crisp set.
The crisp set is defined in such a way as to dichotomize the individuals in some given universe of discourse into two groups: members (those that certainly belong in the set) and nonmembers (those that certainly do not). A sharp, unambiguous distinction exists between the members and nonmembers of the class or category represented by the crisp set. However, many of the collections and categories do not have this kind of characteristic. Instead, their boundaries seem vague and the transition from member to nonmember appears gradual rather than abrupt. Thus, the fuzzy set introduces vagueness by eliminating the sharp boundary dividing members of the class from nonmembers.
Fuzzy theory provides forms for representing uncertainty. Historically, probability theory has been the primary tool for representing uncertainties that are assumed to be random in mathematical models. However, not all uncertainties are random. Some forms may be deterministic and unsuitable for treatment or modeling by probability theory. Fuzzy theory is a tool for modeling these kinds of uncertainties.
2.3 Genetic Algorithm
Genetic algorithms (GA) are a family of computational models inspired by natural evolution. In a broader usage of the term, a genetic algorithm is any population-based model that uses selection and recombination operators to generate new sample points in a search space. An implementation of a genetic algorithm deals with a population of chromosomes, each representing a potential solution to a target problem, usually in the form of a binary string. Usually, these chromosomes are randomly created and undergo reproductive opportunities in such a way that better solutions are given more chances to reproduce than poorer ones. Although GA have been adopted in a multitude of different tasks, in this paper we are concerned with proposals that address the problem of financial time-series forecasting.
Noever and Baskaran [13] investigated the problem of predicting trends and prices in financial time series, conducting experiments on the S&P500 stock market. Mahfoud and Mani [14] addressed the general issue of predicting future performances of individual stocks. Their works are particularly relevant, as they compare GA and ANN applied to financial forecasting. According to their experiments repeated on several stock markets, both approaches outperform the buy-and-hold (B&H) strategy. A combined approach, obtained by averaging the GA and ANN outputs, was also tested with positive results.
GA have also been used in a variety of hybrid approaches to financial time series prediction. For example, Muhammad and King[15] devised neural fuzzy networks to forecast the foreign exchange market, whereas Kai and Wenhua[16] exploited GA to train ANN for predicting a stock price index.
2.4 Support Vector Machine
Support vector machine (SVM) is a specific type of learning algorithm characterized by capacity control of the decision function, the use of kernel functions and the sparsity of the solution [17][18]. Built on the structural risk minimization principle, which estimates a function by minimizing an upper bound of the generalization error, SVM has been shown to be very resistant to overfitting and to achieve high generalization performance. Another key property is that training SVM is equivalent to solving a linearly constrained quadratic programming problem, so the solution of SVM is always unique and globally optimal, unlike neural network training, which requires nonlinear optimization and risks getting stuck at local minima.
Some applications of SVM to financial forecasting problems have been reported recently [19–23]. In most cases, the degree of accuracy and the acceptability of certain forecasts are measured by the estimate’s deviations from the observed values.
Chapter Three Methodology
This chapter introduces each part of the integrated forecast model in detail.
3.1 Architecture
In this study, we use fuzzy theory to coordinate the macroeconomic variables, stock market technical indicators and futures indexes. GA dynamically adjusts the fuzzy model parameters of each factor to determine its influential degree. SVM is used to test candidate parameters and locate the approximately optimal parameters of the fuzzy model. The integrated architecture is shown in Figure 1.
Fig. 1. The architecture of integrated forecasting model
In the initialization stage, the integrated model randomly generates the parameters needed by the fuzzy model for each factor. GA adjusts the influential degrees of the factors, and SVM is used to test each new adjustment during the training period. Based on the accuracy in the training period, GA determines whether to continue evolving or whether the target has been reached. If evolution continues, new fuzzy model parameters are generated through selection, crossover and mutation. When the termination condition is reached, SVM uses the newly calculated influential degrees to forecast stock market dynamism.
3.2 Dynamic Fuzzy Model for Influence Factors
To increase forecast accuracy, the input variables {xk} of the forecast model should cover sufficient influence factors and precisely reflect the influential degree of each factor as it changes with time. Because the kinds of factors differ, many studies discuss the relationship between each kind of factor and stock market dynamism separately. To consider the three kinds of factors that affect the stock market (technical indicators, macroeconomic variables and futures) at the same time, and to adjust the influential degree with time, we propose a dynamically adjustable fuzzy model. The influential degree (ID) of each input variable is determined by the scaled index (SI) and the time-varying performance of the variable, μA(t), as shown in Formula 1.
$$ID_{k,t} = SI_{k,t} \cdot \mu_A(t) \tag{1}$$
Because the input variables have different value ranges, we apply a linear transformation to scale each variable to the range [-1, 1], as shown in Formula 2, to increase accuracy.
$$SI_{k,t} = 2 \cdot \frac{x_{k,t} - \min x_k}{\max x_k - \min x_k} - 1 \tag{2}$$
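As a minimal sketch of the linear scaling in Formula 2, the following Python function maps a series of raw values to [-1, 1]. The function name and the treatment of a constant series are assumptions made for illustration and are not taken from the study.

```python
def scaled_index(values):
    """Linearly map raw values of one variable to the range [-1, 1] (Formula 2)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Usage: the first five close index values listed in Table 2.
closes = [5857.21, 5804.89, 5866.75, 5890.69, 6041.56]
print(scaled_index(closes))
```

The smallest value of the series maps to -1 and the largest to 1, so variables with very different units become comparable before they are weighted.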
In the dynamically adjusted fuzzy model of this study, the membership function reflects how the influence of a variable on the stock market changes with time. Weighing computational complexity against the actual improvement in forecast accuracy, we compared the trapezoid, triangle and Gaussian membership functions and adopted the trapezoid membership function. The membership function μA(t) of time t on A is given in Formula 3.
$$\mu_A(t) = \begin{cases} 0, & t < a_1 \\ \dfrac{t - a_1}{a - a_1}, & a_1 \le t \le a \\ 1, & a \le t \le b \\ \dfrac{b_1 - t}{b_1 - b}, & b < t \le b_1 \\ 0, & t > b_1 \end{cases} \tag{3}$$
Each variable has its own independent fuzzy model. Each model is determined by the four values a1, a, b, b1, which are dynamically adjusted according to the fitness function of GA to properly express the influence of the variable.
For example, GDP (gross domestic product) is a macroeconomic indicator. As it is published once a year, the maximum value of the time axis in the initial fuzzy model is 360.
The initial a1, a, b, b1 are generated randomly and then adjusted according to the fitness value generated by GA, as shown in Figure 2.
Fig.2. Initial membership function for the influential degree of factor GDP
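The trapezoid membership function of Formula 3 and the influential degree of Formula 1 can be sketched in Python as follows. The function names are illustrative, and the parameter values in the usage lines are hypothetical: they are not the values found by GA in the study.

```python
def trapezoid(t, a1, a, b, b1):
    """Trapezoid membership value at time t, assuming a1 <= a <= b <= b1 (Formula 3)."""
    if t < a1 or t > b1:
        return 0.0
    if t < a:
        return (t - a1) / (a - a1)   # rising edge
    if t <= b:
        return 1.0                   # plateau
    return (b1 - t) / (b1 - b)       # falling edge


def influential_degree(si, t, a1, a, b, b1):
    """Formula 1: scaled index weighted by the membership value at time t."""
    return si * trapezoid(t, a1, a, b, b1)


# Hypothetical parameters for a yearly indicator on a 360-day axis.
params = (10, 40, 285, 330)
print(trapezoid(25, *params))                 # halfway up the rising edge
print(influential_degree(0.8, 100, *params))  # on the plateau the scaled index passes through
```

Because the four parameters fully determine the shape, adjusting the influence of one variable over time reduces to searching over four numbers per variable.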
3.3 Model Optimization with Genetic Algorithm
GA is an efficient general-purpose search method. By simulating biological evolution, it retains the parameters with higher fitness. Mechanisms such as crossover and mutation help the search avoid local minima and shorten the search time. Due to the nature of the stock market, factor influence changes with time and the model has to be adjusted dynamically. With GA, we can locate an approximately optimal solution and better model parameters for a certain period of time.
In the initialization stage, the chromosomes in the experiment are real-coded. Each input variable has a1, a, b, b1 and a range. The values a1, a, b, b1 determine the shape of the fuzzy model, which represents the changes of the influential degree of each variable. The range represents the publication cycle of the variable. Because problems differ, GA cannot obtain equally good search results with one fixed setup. Taking time and precision into consideration, the experiment adopts two-point crossover with a probability of 0.8 and a mutation rate of 0.02. Forecast accuracy is the criterion for evaluating the forecast model, so the evaluation (fitness) function is the accuracy of the forecast model in the training period. GA locates an approximately optimal solution. The evolution stops when the change ratio stays below 0.02 within 100 generations. The located parameters serve to establish the better model of all variables for a certain period of time.
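The GA loop described above can be sketched roughly as follows for a single variable. Only the real coding, the two-point crossover probability (0.8) and the mutation rate (0.02) come from the text. The population size, truncation selection, elitism, the fixed generation count and the toy fitness function are assumptions for illustration; in the study the fitness is the SVM accuracy in the training period.

```python
import random

CROSSOVER_RATE = 0.8   # from the text
MUTATION_RATE = 0.02   # from the text


def random_chromosome(cycle):
    """Four real-valued genes a1 <= a <= b <= b1 within the publication cycle."""
    return sorted(random.uniform(0, cycle) for _ in range(4))


def two_point_crossover(p1, p2):
    """Swap the segment between two cut points with probability CROSSOVER_RATE."""
    if random.random() >= CROSSOVER_RATE:
        return p1[:], p2[:]
    i, j = sorted(random.sample(range(len(p1) + 1), 2))
    return p1[:i] + p2[i:j] + p1[j:], p2[:i] + p1[i:j] + p2[j:]


def mutate(chrom, cycle):
    """Resample each gene with probability MUTATION_RATE, then restore the ordering."""
    genes = [random.uniform(0, cycle) if random.random() < MUTATION_RATE else g
             for g in chrom]
    return sorted(genes)


def evolve(fitness, cycle=360, pop_size=20, generations=100):
    """Return the best chromosome found; a fixed generation count replaces the
    change-ratio stopping rule for simplicity (assumption)."""
    pop = [random_chromosome(cycle) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]              # truncation selection (assumption)
        children = []
        while len(children) < pop_size:
            c1, c2 = two_point_crossover(*random.sample(parents, 2))
            children += [mutate(c1, cycle), mutate(c2, cycle)]
        pop = [scored[0]] + children[: pop_size - 1]   # keep the best so far (assumption)
    return max(pop, key=fitness)


# Toy fitness standing in for SVM training accuracy: prefer a plateau starting near day 40.
best = evolve(lambda c: -abs(c[1] - 40.0))
print(best)
```

Sorting the genes after crossover and mutation is one simple way to keep a1 ≤ a ≤ b ≤ b1 valid; the study does not say how this constraint is enforced.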
3.4 Prediction of Stock Market Dynamism with SVM
SVM uses a hypothesis space of linear functions and introduces a learning bias to learn the mapping between input and output. Because the linear learning machine operates in the feature space induced by the kernel function, SVM effectively avoids overfitting and learns well even when the application has a high-dimensional feature space [12].
In each generation, after GA determines the parameters of each fuzzy model, SVM locates the relationship between the influence variables (e.g. stock market technical indicators, macroeconomic variables and futures technical indicators) and stock market dynamism. This study employs SVM for classification. The main idea is to map the input space through the kernel into a high-dimensional feature space and classify there. SVM selects several support vectors from the training data to represent the entire data set. In this study, the problem can be expressed as follows.
Given the training data $(x_1, y_1), \ldots, (x_p, y_p)$, with $x_i \in R^n$ and $y_i \in \{1, -1\}$, p is the number of data and n is the dimension of the stock market influence variables.
When y equals 1, the stock market goes up; when y equals -1, the stock market goes down. In the linear case, an optimal hyperplane (w·x) + b = 0 completely separates the samples into two classes:
$$(w \cdot x_i) + b \ge 0, \quad y_i = +1,$$
$$(w \cdot x_i) + b \le 0, \quad y_i = -1.$$
In the linearly separable case, this is a typical quadratic programming problem. The following Lagrange function can be applied to find the solution.
$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{p} \alpha_i \left[ y_i \left( (w \cdot x_i) + b \right) - 1 \right]$$
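For readers following the derivation, the step from the Lagrange function to its dual is the standard one and is not spelled out in the study: set the partial derivatives of L with respect to w and b to zero.

$$\frac{\partial L}{\partial w} = w - \sum_{i=1}^{p} \alpha_i y_i x_i = 0 \;\Rightarrow\; w = \sum_{i=1}^{p} \alpha_i y_i x_i, \qquad \frac{\partial L}{\partial b} = -\sum_{i=1}^{p} \alpha_i y_i = 0$$

Substituting these two conditions back into L eliminates w and b and leaves an objective in the multipliers α alone, with the second condition kept as a constraint.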
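In the linear case, the original problem can be converted into its dual problem. The optimal solution is found by solving:
$$\max_{\alpha} W(\alpha) = \sum_{i=1}^{p} \alpha_i - \frac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)$$
Constraint:
$$\sum_{i=1}^{p} \alpha_i y_i = 0, \quad \alpha_i \ge 0, \quad i = 1, 2, \ldots, p.$$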
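By solving the quadratic programming problem, we obtain the classification formula used to forecast the dynamism of the next day's stock market, as shown below
$$f(x) = \operatorname{Sign}\left( \sum_{i=1}^{p} \alpha_i y_i (x_i \cdot x) + b \right), \qquad b = y_s - w \cdot x_s \tag{4}$$
where $x_s$ is any support vector and $y_s$ is its label.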
Any function that meets Mercer's condition can be a kernel function. We adopt the radial kernel as the kernel function of SVM [9].
$$K(s, t) = \exp\left( -\frac{1}{10} \|s - t\|^2 \right) \tag{5}$$
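To make Formulas 4 and 5 concrete, here is a small Python sketch of the radial kernel and of a kernelised decision function, in which the inner product of Formula 4 is replaced by K. The function names, the toy support vectors, the multipliers and the bias are hypothetical; they are not outputs of the study, and a score of exactly zero is arbitrarily mapped to +1.

```python
import math


def rbf_kernel(s, t, width=10.0):
    """K(s, t) = exp(-||s - t||^2 / width), with width = 10 as in Formula 5."""
    sq_dist = sum((si - ti) ** 2 for si, ti in zip(s, t))
    return math.exp(-sq_dist / width)


def predict(x, support_vectors, labels, alphas, b):
    """Sign of the kernel expansion: +1 forecasts a rise, -1 a fall."""
    score = sum(a * y * rbf_kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if score >= 0 else -1


# Hypothetical two-dimensional example with one support vector per class.
svs = [[1.0, 1.0], [-1.0, -1.0]]
ys = [1, -1]
alphas = [1.0, 1.0]
print(predict([0.9, 0.8], svs, ys, alphas, b=0.0))    # closer to the +1 support vector
print(predict([-1.0, -0.5], svs, ys, alphas, b=0.0))  # closer to the -1 support vector
```

In practice the multipliers and bias would come from a quadratic programming solver or an SVM library rather than being set by hand.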
Chapter Four Experiment
In this chapter, we will introduce data preparation, experimental design, and experimental results.
4.1 Data Preparation
We integrate fuzzy theory, GA and SVM to forecast stock market dynamism, targeting the stock market in Taiwan. The input variables in the study include a total of 61 variables, covering technical indicators in the stock market and futures market and the macroeconomic indicators in Taiwan [24][25]. The influence factors for both the stock and futures markets include On balance volume (OBV), Demand index (DI), Momentum (MTM), Relative strength index (RSI), Moving average convergence and divergence (MACD), Total amount per weighted stock price index (TAPI), Psychological line (PSY), Advance decline ratio (ADR), Williams (WMS), BIAS, Oscillator (OSC), Moving average (MA), K line (K), D line (D), Perform criteria (PC), Autoregressive (AR), Different (DIF), Consistency ratio (CR), Relative strength volume (RSV) and Exponential Moving Average (EMA). The formulas and explanations of the technical indicators are given in Appendix E.
Macroeconomic indicators include Annual Change in Wholesale Price Index (WPC), Annual Change in Export Price Index (EPC), Annual Change in Industrial Production Index (IPC), Annual Change in Employees on Payrolls (EMPC), Gross national product (GNP), Approved Outward Investment by Industry (AOI), Gross domestic product (GDP), Import by Key Trading Partners (IKT), Export by Key Trading Partners (EKT), Long term interest rate (LT), Consumer price index (CPI), Government consumption (GC), MFGs' New Orders (MNO), Average Monthly Working Hours (AMWH), Average Monthly Wages and Salaries (AMWS), Bank Clearings (BC), Manufacturing Sales (MS), Quantum of Domestic Traffic (QDT), Monetary Aggregate (M1B), Term structure of interest rate (TS) and Short term interest rate (ST).
The original stock and futures market data for Taiwan are retrieved from the Taiwan Stock Exchange Corporation, while the macroeconomic indicators are from the Ministry of Economic Affairs, R.O.C. The historical data cover two years, from January 2003 to December 2004, for a total of 714 data points, of which 378 go up and the rest go down. Figure 3 shows the stock price normalized to the range between 0 and 1.
Fig 3. First-order Difference Daily Prices of Taiwan Stock Market (from 20 January 2004 to 31 December 2005)
4.2 Experimental Design
As the stock market is a complicated and volatile system, we divide the data with two methods, multiperiod and two-period, and compare them to express the changes in the influential degree of each factor effectively. In the multiperiod method, the data of every fifty days serve as one set of training data to obtain the parameters of the integrated model. The forecast of the next day is made with these data, as shown in Figure 4. In the two-period method, all the data are divided into halves. One half serves for training while the other half is for verification, as shown in Figure 5.
Fig. 4. Strategy for multiperiod stock market movement forecast
Fig. 5. Strategy for two-period stock market movement forecast
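The two data-division strategies can be sketched as index generators in Python. The function names are illustrative. The text does not state whether the fifty-day window advances one day at a time, so the one-day step used here is an assumption.

```python
def multiperiod_splits(n, window=50):
    """Yield (train_indices, test_index): each window of days forecasts the next day.

    Assumption: the window slides forward by one day per forecast.
    """
    for end in range(window, n):
        yield range(end - window, end), end


def two_period_split(n):
    """First half of the data for training, second half for verification."""
    half = n // 2
    return range(0, half), range(half, n)


# Usage with the 714 data points reported in the study.
splits = list(multiperiod_splits(714))
first_train, first_test = splits[0]
print(len(splits), first_train, first_test)
print(two_period_split(714))
```

Under this reading the multiperiod method retrains the model before every forecast, which is how it tracks changes in the market, while the two-period method fits the parameters once.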
Each indicator has its own independent fuzzy model. Based on the accuracy during the past year, GA simulates the changes of the influential degree. In the experimental results, the dynamic fuzzy model hardly changes after the 500th generation.
Take GDP as an example. From 2004/1 to 2005/1, the adjusted dynamic fuzzy model is shown in Figure 6. For GDP, day a1 marks the beginning of GDP's influence, while day a marks its peak. Day a is not at the beginning of the period, and it takes 30 days to rise from μ(t) = 0 to μ(t) = 1. This means that the influence of GDP is not reflected immediately but increases gradually. Meanwhile, it takes as long as 245 days from day a to day b, which means that the influence of GDP on the market can last for a period of time. The dynamic fuzzy models located by GA differ from period to period, because the changing nature of the stock market gives the same variable different influence in different periods.
Fig. 6. Adjusted membership function for influence factor GDP
While each fuzzy model is being adjusted, SVM is used to evaluate its quality. The output of SVM is the accuracy, which serves as the value of the fitness function in GA. The dynamic fuzzy model is first used as a preprocessor to adjust the influential degrees. The influential degrees of the factors are then sent to SVM as input for forecasting.
4.3 Experimental Data
The original stock market data retrieved from the Taiwan Stock Exchange Corporation include the transaction date, open index, close index, highest index, lowest index, volume in Taiwan Dollar and sheet of volume. Part of the original data is shown below.
Table 2. Original data of stock market

| Date | Open index | Highest index | Lowest index | Close index | Volume in Taiwan Dollar (million) | Sheet of volume |
| --- | --- | --- | --- | --- | --- | --- |
| 20031226 | 5,873.40 | 5,881.33 | 5,835.99 | 5,857.21 | 45,118 | 2,543,756 |
| 20031229 | 5,870.08 | 5,874.20 | 5,804.89 | 5,804.89 | 45,395 | 2,631,362 |
| 20031230 | 5,845.17 | 5,886.79 | 5,813.34 | 5,866.75 | 80,092 | 3,558,291 |
| 20031231 | 5,872.45 | 5,900.82 | 5,868.74 | 5,890.69 | 53,118 | 2,726,222 |
| 20040102 | 5,907.15 | 6,043.30 | 5,907.15 | 6,041.56 | 112,855 | 5,181,757 |
| 20040105 | 6,080.18 | 6,137.91 | 6,061.86 | 6,125.42 | 136,431 | 6,700,556 |
| 20040106 | 6,170.02 | 6,170.02 | 6,110.69 | 6,144.01 | 127,254 | 6,218,095 |
| 20040107 | 6,174.95 | 6,215.45 | 6,130.35 | 6,141.25 | 132,431 | 6,299,045 |
| 20040108 | 6,180.36 | 6,189.60 | 6,142.24 | 6,169.17 | 105,663 | 4,991,855 |
| 20040109 | 6,241.36 | 6,257.89 | 6,207.69 | 6,226.98 | 147,887 | 6,148,583 |
| 20040112 | 6,225.23 | 6,246.62 | 6,196.41 | 6,219.71 | 91,498 | 3,755,365 |
| 20040113 | 6,239.16 | 6,255.94 | 6,195.64 | 6,210.22 | 90,301 | 3,882,134 |
| 20040114 | 6,195.09 | 6,298.72 | 6,190.35 | 6,274.97 | 113,183 | 5,035,286 |
| 20040115 | 6,297.59 | 6,306.95 | 6,253.64 | 6,264.37 | 112,592 | 5,208,642 |
| 20040116 | 6,300.76 | 6,311.46 | 6,264.04 | 6,269.71 | 104,206 | 4,851,750 |
| 20040127 | 6,371.50 | 6,399.82 | 6,334.82 | 6,384.63 | 140,611 | 6,122,825 |
| 20040128 | 6,360.71 | 6,421.45 | 6,335.54 | 6,386.25 | 152,288 | 7,057,419 |
| 20040129 | 6,348.93 | 6,379.05 | 6,303.21 | 6,312.65 | 152,651 | 6,824,281 |
| 20040130 | 6,331.71 | 6,386.13 | 6,329.17 | 6,375.38 | 126,363 | 5,904,909 |
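As an illustration of how up/down class labels could be derived from data like Table 2, the sketch below labels each day by the sign of the first-order difference of the close index. The use of the close index and the treatment of an unchanged close as "down" are assumptions; the study does not specify either detail.

```python
def direction_labels(closes):
    """Return +1 when the close rises from the previous day, otherwise -1.

    Assumption: an unchanged close is labelled -1.
    """
    return [1 if today > prev else -1 for prev, today in zip(closes, closes[1:])]


# Close index values of the first five rows of Table 2.
closes = [5857.21, 5804.89, 5866.75, 5890.69, 6041.56]
print(direction_labels(closes))  # [-1, 1, 1, 1]
```

Each label belongs to the later day of the pair, so a series of n closes yields n - 1 labels.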
The original futures market data include the transaction date, backwardation month, highest price, lowest price, open price, close price, turnover, balance price and un-write-off contracts. Part of the original data is shown below. Due to the nature of the futures market, each transaction date has four items to trade, according to the backwardation month. Considering the turnover, only the nearest two months are used as original data in this study.
Table 3. Original data of futures market
Date      Backwardation month  Highest price  Lowest price  Open price  Close price  Turnover  Balance price  Un-write-off
20031229  200402               5887           5828          5887        5840         267       5840           722
20031230  200401               5909           5847          5870        5890         31899     5890           38488
20031230  200402               5910           5849          5863        5888         332       5888           874
20031231  200401               5917           5883          5899        5903         16173     5903           38442
20031231  200402               5915           5882          5891        5905         222       5905           1009
20040102  200401               6072           5897          5900        6056         40341     6056           37498
20040102  200402               6070           5895          5921        6066         438       6066           1234
20040105  200401               6154           6060          6074        6144         32441     6144           38817
20040105  200402               6153           6063          6089        6145         921       6145           1890
20040106  200401               6178           6128          6172        6160         27231     6160           37759
20040106  200402               6188           6128          6188        6160         1195      6160           2793
20040107  200401               6252           6160          6163        6174         44226     6174           39454
20040107  200402               6249           6163          6168        6178         1367      6178           3624
20040108  200401               6225           6175          6188        6211         26881     6211           40165
20040108  200402               6222           6180          6183        6213         1008      6213           4291
20040109  200401               6299           6242          6278        6270         32733     6270           37680
20040109  200402               6295           6243          6243        6270         2067      6270           5650
20040112  200401               6280           6241          6258        6275         16269     6275           37201
20040112  200402               6280           6244          6270        6272         2217      6272           7455
20040113  200401               6289           6221          6288        6254         24588     6254           33727
20040113  200402               6310           6228          6310        6254         5778      6254           12307
20040114  200401               6345           6225          6240        6325         33201     6325           27288
20040114  200402               6348           6230          6240        6339         10055     6339           17337
20040115  200401               6334           6288          6312        6300         18597     6300           22903
20040115  200402               6347           6307          6337        6320         10926     6320           23786
20040116  200401               6344           6285          6329        6285         17175     6285           17414
20040116  200402               6369           6321          6345        6322         15136     6322           30531
20040127  200401               6387           6325          6325        6370         15780     6370           10450
20040127  200402               6398           6362          6372        6384         19537     6384           36849
20040128  200402               6398           6351          6370        6367         24206     6367           39626
20040128  200403               6400           6354          6380        6375         331       6375           501
20040129  200402               6344           6302          6332        6331         29565     6331           39663
20040129  200403               6365           6316          6365        6333         354       6333           687
20040130  200402               6387           6325          6330        6381         24340     6381           38574
20040130  200403               6388           6325          6325        6388         220       6388           699
For the purpose of this study, the original data are converted into technical indicators. Part of the technical indicators is shown below.
Table 4. Part of technical indicators used in this study
stock_day OBV_AVG DI EMA_12 EMA_26 DIF MACD_9 RS_6
20031226 18525782.83 5857.94 5845.78 5861.74 -15.96 -21.76 0.16
20031229 18094523 5822.22 5842.16 5858.82 -16.66 -20.74 0.16
20031230 18610327.42 5858.41 5844.66 5858.79 -14.13 -19.42 0.16
20031231 18540988.33 5887.74 5851.28 5860.93 -9.65 -17.46 0.16
20040102 19199986.58 6008.39 5875.45 5871.85 3.6 -13.25 0.17
20040105 20190179.33 6112.65 5911.95 5889.69 22.26 -6.15 0.17
20040106 20149974.25 6142.18 5947.37 5908.39 38.97 2.87 0.17
20040107 20143228.42 6157.08 5979.63 5926.81 52.82 12.86 0.17
20040108 20034295.92 6167.55 6008.54 5944.65 63.89 23.07 0.17
20040109 20130689.92 6229.89 6042.59 5965.77 76.82 33.82 0.17
20040112 19305360.92 6220.61 6069.98 5984.65 85.33 44.12 0.17
20040113 19294796.83 6218.01 6092.75 6001.94 90.82 53.46 0.17
20040114 19390892.83 6259.75 6118.45 6021.03 97.41 62.25 0.17
20040115 19376446.5 6272.33 6142.12 6039.65 102.47 70.29 0.17
20040116 19346705.5 6278.73 6163.14 6057.36 105.78 77.39 0.17
20040127 19452628.42 6375.98 6195.88 6080.96 114.92 84.9 0.17
20040128 19530511.25 6382.37 6224.57 6103.29 121.29 92.18 0.17
20040129 18373702.92 6326.89 6240.31 6119.85 120.46 97.83 0.17
20040130 18297088.58 6366.52 6259.73 6138.12 121.61 102.59 0.17
The macroeconomic variables retrieved from the Ministry of Economic Affairs include Annual Change in Wholesale Price Index (WPC), Annual Change in Export Price Index (EPC), Annual Change in Industrial Production Index (IPC), Annual Change in Employees on Payrolls (EMPC), Gross national production (GNP), Approved Outward Investment by Industry (AOI), Gross domestic product (GDP), Import by Key Trading Partners (IKT), Export by Key Trading Partners (EKT), Long term interest rate (LT), Consumer price index (CPI), Government consumption (GC), MFGs' New Orders (MNO), Average Monthly Working Hours (AMWH), Average Monthly Wages and Salaries (AMWS), Bank Clearings (BC), Manufacturing Sales (MS), Quantum of Domestic Traffic (QDT), Monetary Aggregate (M1B), Term structure of interest rate (TS) and Short term interest rate (ST).
Most of the macroeconomic variables are announced once a month, whereas the technical indicators are available on every transaction day. After the dynamic fuzzy model is applied, technical indicators and macroeconomic variables can be integrated as input variables for SVM to forecast the dynamism of the next transaction day. Part of the integrated input variables is shown below.
Table 5. Part of integrated input variables used in this study
stock_day ST GDP WPC OBV_AVG DI EMA_12 EMA_26
20031226 0 0 0 18525782.83 5857.94 5845.78 5861.74
20031229 0 0 0 18094523 5822.22 5842.16 5858.82
20031230 0 0 0 18610327.42 5858.41 5844.66 5858.79
20031231 0 0 0 18540988.33 5887.74 5851.28 5860.93
20040102 0 0 0 19199986.58 6008.39 5875.45 5871.85
20040105 0 0 0 20190179.33 6112.65 5911.95 5889.69
20040106 0 0 0 20149974.25 6142.18 5947.37 5908.39
20040107 1 0 1 20143228.42 6157.08 5979.63 5926.81
20040108 1 0 1 20034295.92 6167.55 6008.54 5944.65
20040109 1 0 1 20130689.92 6229.89 6042.59 5965.77
20040112 2 0 1 19305360.92 6220.61 6069.98 5984.65
20040113 2 0 2 19294796.83 6218.01 6092.75 6001.94
20040114 3 0 2 19390892.83 6259.75 6118.45 6021.03
20040115 4 0 2 19376446.5 6272.33 6142.12 6039.65
20040116 4 0 2 19346705.5 6278.73 6163.14 6057.36
20040127 4 0 2 19452628.42 6375.98 6195.88 6080.96
20040128 4 0 2 19530511.25 6382.37 6224.57 6103.29
20040129 4 1 2 18373702.92 6326.89 6240.31 6119.85
20040130 4 1 2 18297088.58 6366.52 6259.73 6138.12
Part of forecast result is shown as follows.
Table 6. Part of forecast result
No  Predict index  Predict dynamic  Actual dynamic  Actual index  Accuracy
1 5765.85 -244.19 -55.86 5,936.46 1
2 5756.8 -9.05 -77.89 5,858.57 1
3 5935.43 178.63 -88.81 5,769.76 0
4 5719.41 -216.02 50.31 5,820.07 0
5 5930.99 211.58 33.74 5,853.81 1
6 5814.42 -116.57 -32.48 5,821.33 1
7 5678.09 -136.33 -83.08 5,738.25 1
8 5883.94 205.85 61.22 5,799.47 1
9 5585.64 -298.3 -1.68 5,797.79 1
10 5558.8 -26.84 -151.91 5,645.88 1
11 5686.65 127.85 -108.23 5,537.65 0
12 5721.34 34.69 92.46 5,630.11 1
13 5789.33 67.99 44.25 5,674.36 1
14 5751.86 -37.47 94.09 5,768.45 0
15 5807.23 55.37 16.96 5,785.41 1
16 5776.08 -31.15 -27.17 5,758.24 1
17 5861.31 85.23 31.09 5,789.33 1
18 5862.94 1.63 55.53 5,844.86 1
19 6112.29 249.35 -28.12 5,816.74 0
20 5645.47 -466.82 7.10 5,823.84 0
21 5684.15 38.68 -76.94 5,746.90 0
22 5778.06 93.91 32.09 5,778.99 1
23 5778.06 0 12.65 5,791.64 0
24 5701.19 -76.87 -69.02 5,722.62 1
Accuracy: 0.67 (16/24)
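The Accuracy column of Table 6 is consistent with a simple sign comparison: a forecast counts as a hit when the predicted dynamic and the actual dynamic are both positive or both negative. The following minimal sketch reproduces the count under that reading. The function names are illustrative, and scoring a zero dynamic as a miss is an assumption inferred from row 23.

```python
def is_hit(predicted_change, actual_change):
    """Return 1 when both changes share the same non-zero sign, else 0."""
    return int(predicted_change * actual_change > 0)


def hit_ratio(pairs):
    """Fraction of (predicted, actual) pairs whose directions agree."""
    hits = [is_hit(p, a) for p, a in pairs]
    return sum(hits) / len(hits)


# (predict dynamic, actual dynamic) for rows 1-24 of Table 6
TABLE_6 = [
    (-244.19, -55.86), (-9.05, -77.89), (178.63, -88.81), (-216.02, 50.31),
    (211.58, 33.74), (-116.57, -32.48), (-136.33, -83.08), (205.85, 61.22),
    (-298.3, -1.68), (-26.84, -151.91), (127.85, -108.23), (34.69, 92.46),
    (67.99, 44.25), (-37.47, 94.09), (55.37, 16.96), (-31.15, -27.17),
    (85.23, 31.09), (1.63, 55.53), (249.35, -28.12), (-466.82, 7.10),
    (38.68, -76.94), (93.91, 32.09), (0, 12.65), (-76.87, -69.02),
]

# Accuracy column of Table 6, used only to cross-check the sign rule
TABLE_6_ACCURACY = [1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1,
                    1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1]

if __name__ == "__main__":
    print(f"hit ratio = {hit_ratio(TABLE_6):.4f}")
```

Because only the direction is scored, a forecast such as row 9 (predicted -298.3, actual -1.68) counts as a hit even though the magnitudes differ widely.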
4.4 Technical Indicators
The study uses a total of 61 input variables, comprising technical indicators of the stock market and the futures market and the macroeconomic indicators of Taiwan. Twenty technical indicators of the stock market are used as input variables. The formulas and explanations of nine popular technical indicators are given below [26]:
1. On balance volume(OBV):
The On Balance Volume (OBV) indicator, developed by Joe Granville, measures the strength of the prevailing trend and provides alerts to possible breakouts. OBV is calculated as continuous consecutive sum of volumes. If the current period's close is higher than the previous period's close, the current period's volume is added to the previous period's OBV. On the other hand, if the current period's close is lower than the previous period's close, the current period's volume is subtracted from the previous period's OBV. An unchanged close is neither added nor subtracted from the OBV value.
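The formula for the OBV indicator is:
OBV(t) = OBV(t-1) + V(t), if C(t) > C(t-1)
OBV(t) = OBV(t-1) - V(t), if C(t) < C(t-1)
OBV(t) = OBV(t-1), if C(t) = C(t-1)
where C(t) is the close and V(t) is the volume of period t.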
The OBV shows upward momentum if each new high or low is greater than the previous one. Conversely, downward momentum is evident if each new high or low is lower than the previous one. A change from upward to downward momentum signals that a trend reversal may be on the horizon. If the OBV stagnates for more than 3 periods, the market is considered to be moving sideways and the previous trend has changed from bullish or bearish to neutral.
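The running sum described for the OBV can be sketched as follows. The function name and the starting value of 0 are assumptions made for illustration; the sample closes and sheet-of-volume figures are the first four rows of Table 2.

```python
def on_balance_volume(closes, volumes, start=0.0):
    """Running OBV: add volume on an up close, subtract it on a down close."""
    if len(closes) != len(volumes):
        raise ValueError("closes and volumes must have the same length")
    if not closes:
        return []
    obv = [start]  # the first period has no previous close to compare with
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])
        else:
            obv.append(obv[-1])  # unchanged close leaves OBV unchanged
    return obv


# Close index and sheet of volume for 20031226-20031231 (Table 2)
CLOSES = [5857.21, 5804.89, 5866.75, 5890.69]
VOLUMES = [2543756, 2631362, 3558291, 2726222]

if __name__ == "__main__":
    print(on_balance_volume(CLOSES, VOLUMES))
```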
2. Momentum(MTM)
The Momentum indicator seeks to predict future trends based on recent price and volume action. It is currently one of the most widely used technical studies because it is easy to calculate and can be applied in a number of different ways.
The formula for the Momentum indicator is:
MTM(t) = C(t) - C(t-n)
where C(t) is the close of period t and n is the number of periods.
Momentum is an oscillator-type indicator used to detect overbought and oversold conditions and to perform as a gauge indicating the strength of the current trend.
Momentum calculations are either positive or negative and fluctuate around a zero-line.
When momentum is above the zero-line and rising, prices are increasing at an increasing rate. If momentum is above the zero line but is declining, prices are still increasing but at a decreasing rate.
On the other hand, when momentum is below the zero-line and falling, prices are decreasing at an increasing rate. If momentum is below the zero line but is rising, prices are still declining but at a decreasing rate.
With the momentum indicator, traders usually enter the market when the momentum crosses over the zero-line from negative territory and exit the market when the momentum crosses over the zero-line from positive territory.
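A minimal sketch of the zero-line rule follows, assuming the common difference form of momentum, MTM(t) = C(t) - C(t-n). The function names and the "enter"/"exit" labels are illustrative only.

```python
def momentum(closes, n):
    """MTM(t) = C(t) - C(t-n); None while fewer than n earlier closes exist."""
    if n < 1:
        raise ValueError("n must be at least 1")
    head = [None] * min(n, len(closes))
    tail = [closes[i] - closes[i - n] for i in range(n, len(closes))]
    return head + tail


def zero_line_signals(mtm):
    """Mark crossings of the zero-line: up from negative, down from positive."""
    signals = []
    for i in range(1, len(mtm)):
        prev, cur = mtm[i - 1], mtm[i]
        if prev is None or cur is None:
            continue
        if prev < 0 < cur:
            signals.append((i, "enter"))
        elif prev > 0 > cur:
            signals.append((i, "exit"))
    return signals


if __name__ == "__main__":
    mtm = momentum([10, 11, 9, 12, 8], n=1)
    print(mtm)
    print(zero_line_signals(mtm))
```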
3. Relative strength index(RSI):
The Relative Strength Index (RSI), developed by Welles Wilder, is a special form of the Momentum indicator and measures an instrument's internal strength compared to past prices. The calculation of the RSI takes a few steps. First, positive closing price changes (i.e. positive day changes) and negative closing price changes (i.e. negative day changes) are each added up and then divided by the number of periods less one. The results are the period's mean values of the upward and downward strength of the underlying instrument. The relative strength is then derived from the ratio of the upward mean to the downward mean.
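The formula for the RSI is:
RSI = 100 - 100 / (1 + RS)
where RS = (mean upward change) / (mean downward change) over the chosen number of periods.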
The RSI fluctuates between 0 and 100. A high RSI, with readings over 70, suggests an overbought market or a weakening rally, but does not necessarily mean a top. Conversely, a low RSI, below 30, implies an oversold market or a weakening sell-off, but does not necessarily imply a market bottom. A reading of 50 plays the role that the zero-line plays in other oscillators. Crossing this line from above or below can serve as a signal to enter the market.
Divergence can also be implied by the RSI. For example, the market makes new highs during a rally but the RSI fails to exceed its previous highs. This may indicate weakening of the rally.
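The steps described for the RSI can be sketched as below. This version uses plain sums of the gains and losses over the last n price changes; since the same divisor applies to both means, it cancels in the ratio. The function names, the value 50 returned for a completely flat window, and the use of simple rather than smoothed averages are assumptions.

```python
def rsi(closes, n):
    """RSI over the last n price changes, using simple (unsmoothed) means."""
    if n < 1 or len(closes) < n + 1:
        raise ValueError("need n >= 1 and at least n + 1 closes")
    recent = closes[-(n + 1):]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        # No down moves: RS is unbounded; a flat window is treated as neutral.
        return 100.0 if gains > 0 else 50.0
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)


def classify(value):
    """Apply the 70/30 thresholds mentioned in the text."""
    if value > 70:
        return "overbought"
    if value < 30:
        return "oversold"
    return "neutral"


if __name__ == "__main__":
    value = rsi([10, 12, 11], n=2)
    print(round(value, 2), classify(value))
```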
4. Moving average convergence and divergence(MACD)
The Moving Average Convergence Divergence (MACD), developed by Gerald Appel, is both an oscillator and a trend indicator. It is the difference between a fast Exponential Moving Average (EMA) and a slow Exponential Moving Average and the fast Moving Average is continually converging towards or diverging away from the slow Moving Average. A third Exponential Moving Average, or signal line, is then plotted to identify changes in trends and market sentiment.
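The formula for the MACD is:
MACD = EMA_fast - EMA_slow
Signal line = EMA of the MACD
In Table 4, EMA_12 and EMA_26 are the fast and slow averages, DIF is their difference and MACD_9 is the 9-period signal line.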
The MACD study can be used to identify buy and sell signals. When the MACD crosses above the signal line, it may be time for the longs to enter the market, whereas when it crosses below the signal line, it may be time for the shorts to enter the market.
The MACD study can also be used as an oscillator, an indicator that fluctuates above and below a zero-line, to signal overbought and oversold conditions. When both lines are below zero, it is considered an oversold condition, signalling a buying opportunity, whereas if both lines are above zero, it is considered an overbought condition, signalling a selling opportunity.
Divergence can also be identified with the MACD. A positive divergence occurs when the price is making new lows while the MACD fails to reach new lows. On the other hand, a negative divergence occurs when the price is making new highs without being accompanied by new highs from the MACD.
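The crossover rule can be sketched with the first five rows of Table 4. The DIF column equals EMA_12 minus EMA_26, and the MACD_9 column appears consistent with a 9-period exponential smoothing of DIF using the weight 2/(n+1); that weight is an inference from the tabulated values, and the function names and the "long"/"short" labels are illustrative.

```python
def macd_line(ema_fast, ema_slow):
    """Difference between the fast and slow EMA for each period."""
    return [f - s for f, s in zip(ema_fast, ema_slow)]


def signal_step(prev_signal, macd_value, n=9):
    """One update of the signal line, assuming the weight 2 / (n + 1)."""
    alpha = 2.0 / (n + 1)
    return prev_signal + alpha * (macd_value - prev_signal)


def crossovers(macd, signal):
    """Periods where the MACD crosses above ('long') or below ('short') the signal."""
    events = []
    for i in range(1, min(len(macd), len(signal))):
        before = macd[i - 1] - signal[i - 1]
        after = macd[i] - signal[i]
        if before <= 0 < after:
            events.append((i, "long"))
        elif before >= 0 > after:
            events.append((i, "short"))
    return events


# First five rows of Table 4 (20031226 to 20040102)
EMA_12 = [5845.78, 5842.16, 5844.66, 5851.28, 5875.45]
EMA_26 = [5861.74, 5858.82, 5858.79, 5860.93, 5871.85]
DIF = [-15.96, -16.66, -14.13, -9.65, 3.6]
MACD_9 = [-21.76, -20.74, -19.42, -17.46, -13.25]

if __name__ == "__main__":
    print([round(v, 2) for v in macd_line(EMA_12, EMA_26)])
    print(crossovers(DIF, MACD_9))
```

In this sample the DIF stays above MACD_9 throughout, so no crossover is reported.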
5. Exponential Moving Average(EMA)
The Exponential Moving Average (EMA) finds the average price of a security over a set number of periods. It gives more weight to the more recent prices, relative to older prices, in an attempt to reduce the lag associated with moving averages in general. The weighting applied depends on the length of the moving average specified. The shorter the EMA is, the more weight is applied to the most recent price. The oldest price data in the EMA are never removed, but they have only a minimal impact on the EMA.
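The formula for the EMA is:
EMA(t) = EMA(t-1) + a * (P(t) - EMA(t-1)), with a = 2 / (n + 1)
where P(t) is the price of period t and n is the length of the moving average.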
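The EMA recurrence can be checked against Table 4. The EMA_12 and EMA_26 columns appear to follow the usual update with weight 2/(n+1) applied to the DI column; this is an observation from the tabulated values rather than a statement from the data source. The function names and the choice of the first price as the seed are assumptions.

```python
def ema_step(prev_ema, price, n):
    """One EMA update with smoothing weight 2 / (n + 1)."""
    alpha = 2.0 / (n + 1)
    return prev_ema + alpha * (price - prev_ema)


def ema_series(prices, n):
    """EMA of a price list, seeded with the first price (an assumption)."""
    if not prices:
        return []
    out = [float(prices[0])]
    for price in prices[1:]:
        out.append(ema_step(out[-1], price, n))
    return out


# First five rows of Table 4 (20031226 to 20040102)
DI = [5857.94, 5822.22, 5858.41, 5887.74, 6008.39]
EMA_12 = [5845.78, 5842.16, 5844.66, 5851.28, 5875.45]
EMA_26 = [5861.74, 5858.82, 5858.79, 5860.93, 5871.85]

if __name__ == "__main__":
    for i in range(1, len(DI)):
        step12 = ema_step(EMA_12[i - 1], DI[i], 12)
        step26 = ema_step(EMA_26[i - 1], DI[i], 26)
        print(f"{step12:.2f} vs {EMA_12[i]}   {step26:.2f} vs {EMA_26[i]}")
```

A shorter length gives a larger weight (2/13 for 12 periods against 2/27 for 26 periods), which is why EMA_12 reacts faster to the January rally than EMA_26.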
6. Williams(WMS)
The Williams' %R, developed by Larry Williams, is used to measure whether or not a security is overbought or oversold. The indicator is designed to show the relationship between the period high and the current close within the specified period. The %R is plotted on an upside down scale with 0 at the top and -100 at the bottom.
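The formula for the %R is:
%R = -100 * (Hn - C) / (Hn - Ln)
where C is the current close, and Hn and Ln are the highest high and the lowest low of the last n periods.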
Traders usually consider values -20 or higher to be overbought and values -80 and under oversold. Note that the price can remain overbought, or oversold, for a long period of time even though the price continues to rise or fall.
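A minimal sketch of %R in its usual form follows, applied to the first five rows of Table 2. The function names are illustrative, and returning None when the high and low of the window coincide is an assumption, since the ratio is undefined in that case.

```python
def williams_r(highs, lows, closes, n):
    """%R of the latest bar over the last n bars, on the 0 to -100 scale."""
    if n < 1 or min(len(highs), len(lows), len(closes)) < n:
        raise ValueError("need n >= 1 and at least n bars")
    highest = max(highs[-n:])
    lowest = min(lows[-n:])
    if highest == lowest:
        return None  # flat window: the ratio is undefined
    return -100.0 * (highest - closes[-1]) / (highest - lowest)


def classify(value):
    """Apply the -20 / -80 thresholds mentioned in the text."""
    if value >= -20:
        return "overbought"
    if value <= -80:
        return "oversold"
    return "neutral"


# Highest, lowest and close index for 20031226-20040102 (Table 2)
HIGHS = [5881.33, 5874.20, 5886.79, 5900.82, 6043.30]
LOWS = [5835.99, 5804.89, 5813.34, 5868.74, 5907.15]
CLOSES = [5857.21, 5804.89, 5866.75, 5890.69, 6041.56]

if __name__ == "__main__":
    value = williams_r(HIGHS, LOWS, CLOSES, n=5)
    print(round(value, 2), classify(value))
```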
7. Moving average(MA)
A Moving Average (MA) finds the average price of a security over a set number of periods. The calculation of the MA is, as the name suggests, simple. The mean of the underlying financial instrument is calculated over a period of time. Prices during this period are added and then divided by the total number of time periods. Every bar is thus given the same weighting.
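The formula for the MA is:
MA(t) = (P(t) + P(t-1) + ... + P(t-n+1)) / n
where P(t) is the price of period t and n is the number of periods.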
Most traders use the MA as a crossover trading system. Two MAs are plotted and the shorter period MA is used as the signal line. For example, if the shorter period MA crosses over the longer period MA from below to above, then it is considered bullish and a buy opportunity. Conversely, if the shorter period MA crosses over the longer period from above to below, then it is considered bearish and a sell opportunity.
The MA is also used as a support and resistance level. If the price moves away from the MA and retraces back, more often than not, the MA will prove to be a strong support or resistance, depending on the prevailing trend. Note that only certain common MAs can be used for this purpose. 50 and 200 bars are commonly used to measure support and resistance.
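The crossover trading rule can be sketched as follows. The function names, the "buy"/"sell" labels and the sample prices are illustrative; periods without enough history for both averages are skipped.

```python
def sma(prices, n):
    """Simple moving average; None until n prices are available."""
    if n < 1:
        raise ValueError("n must be at least 1")
    out = []
    for i in range(len(prices)):
        if i + 1 < n:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - n:i + 1]) / n)
    return out


def ma_crossovers(prices, short_n, long_n):
    """'buy' when the short MA crosses above the long MA, 'sell' when below."""
    short, long_ = sma(prices, short_n), sma(prices, long_n)
    events = []
    for i in range(1, len(prices)):
        if None in (short[i - 1], long_[i - 1], short[i], long_[i]):
            continue
        before = short[i - 1] - long_[i - 1]
        after = short[i] - long_[i]
        if before <= 0 < after:
            events.append((i, "buy"))
        elif before >= 0 > after:
            events.append((i, "sell"))
    return events


if __name__ == "__main__":
    sample = [5, 4, 3, 4, 6, 8, 6, 3, 1]
    print(ma_crossovers(sample, short_n=2, long_n=3))
```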
8. Advance decline ratio(ADR)
The Advance/Decline Ratio (A/D Ratio) is the number of advancing issues divided by the number of declining issues. This makes it similar to the Advancing/Declining Issues - it shows market breadth (strength). The fact that a division is used, however, makes it independent of the number of issues in total (which has increased steadily over the years).
Often, a moving average of the A/D Ratio is used to indicate an overbought/oversold condition - high values mean a rally is 'overdone' and likely to adjust. In the same way, low readings mean an oversold market with a rally on the way.
The formula for the ADR is:
ADR = (number of issues in a market or index that have increased in price) / (number of issues that have declined in price)
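The ratio and its moving average can be sketched as below. The function names and the sample counts are hypothetical, and returning None when no issue declines is an assumption, since the ratio is undefined in that case.

```python
def advance_decline_ratio(advancing, declining):
    """Advancing issues divided by declining issues; None if nothing declined."""
    if declining == 0:
        return None
    return advancing / declining


def smoothed_adr(advances, declines, n):
    """Mean of the last n daily A/D ratios, skipping undefined days."""
    ratios = [advance_decline_ratio(a, d) for a, d in zip(advances, declines)]
    recent = [r for r in ratios[-n:] if r is not None]
    if not recent:
        return None
    return sum(recent) / len(recent)


if __name__ == "__main__":
    # Hypothetical daily counts of advancing and declining issues
    advances = [300, 200, 100]
    declines = [150, 200, 200]
    print(smoothed_adr(advances, declines, n=3))
```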
9. Demand Index(DI)
Demand Index, DI, incorporates price and volume to give a ratio of buying pressure to selling pressure. DI is charted on an open scale and fluctuates above and below a zero line. When buying pressure is greater than selling pressure, the DI is above the zero line and vice versa. DI is one of the early volume indicators, developed in the 1970s by James Sibbet.
Many experienced traders feel that weekly studies can be particularly important in identifying the predominant trend, and DI is often assessed using weekly data.
The formula for the DI is:
DI = (highest index + lowest index + 2 * close index) / 4
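As a quick check, the DI formula applied to the highest, lowest and close index values in Table 2 reproduces the DI column of Table 4 to two decimal places. The function name and the tuple layout below are illustrative.

```python
def demand_index(high, low, close):
    """DI as defined above: the close counts twice in a four-term average."""
    return (high + low + 2 * close) / 4


# (date, highest index, lowest index, close index) from Table 2
TABLE_2 = [
    ("20031226", 5881.33, 5835.99, 5857.21),
    ("20031229", 5874.20, 5804.89, 5804.89),
    ("20031230", 5886.79, 5813.34, 5866.75),
    ("20031231", 5900.82, 5868.74, 5890.69),
    ("20040102", 6043.30, 5907.15, 6041.56),
]

# DI column of Table 4 for the same dates
TABLE_4_DI = [5857.94, 5822.22, 5858.41, 5887.74, 6008.39]

if __name__ == "__main__":
    for (date, high, low, close), expected in zip(TABLE_2, TABLE_4_DI):
        print(date, f"{demand_index(high, low, close):.4f}", expected)
```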
4.5 Experimental Result
The forecast performance of the integrated model is compared with that of traditional forecast methods (buy and hold, Discriminant Analysis and Artificial Neural Network) for different forecast models and input variables in Table 7. A mark against the technical indicators of the stock market, the technical indicators of the futures market or the macroeconomic indicators means that they serve as input variables of the forecast model.
The buy-and-hold method shows the worst forecast performance, with an accuracy of merely 50%. The method is derived from the random walk hypothesis, under which all available information is already reflected in market transaction prices. In other words, future dynamism is random and unpredictable.
The ANN with fuzzy model has an average forecast accuracy of 59.25%; with GA, the average accuracy rises to 63.25%. For SVM alone, the average accuracy is 62.3% in the two-period experiment and 64.6% in the multiperiod experiment.
With any single kind of influence factor as input variables, SVM achieves an average accuracy of 63.4%, outperforming ANN (58.6%) and DA (52.6%). This is because SVM controls the learning error through generalization theory, instead of only reducing the training error as ANN does. The overfitting problem caused by high variable dimensions can thus be avoided. This feature makes SVM perform better in forecasting stock market dynamism.
For the SVM with fuzzy model in this study, the average forecast accuracy is 70.25% with the two-period method and 71% with the multiperiod method. When GA is added to adjust the fuzzy model dynamically, the forecast accuracy rises to 70.75% and 75% respectively.
The forecast model in this study shows the best forecast performance. In the experiment data, when the proposed integrated model includes all three kinds of influence factors, the forecast accuracy is higher than with one or two kinds of influence factors. This shows that more input variables help the integrated forecast model reflect the relationships behind stock market fluctuations. However, when ANN includes two or three kinds of influence factors, its average accuracy (59.25%) differs from that with a single kind of influence factor (58.6%) by merely 0.65 percentage points. This might be due to overfitting of ANN caused by too much noise.
In this study, GA dynamically adjusts the influential degree of each variable to reflect market changes. Without GA, influence of each variable μA(t) is 1. That is, prior to availability of next new values, the influential degree of each variable remains unchanged.
In the experiment data, GA improves the accuracy of the models in the multiperiod experiment by 4 percentage points (from 71% to 75%). It does not provide significant help to the models in the two-period method, where the average accuracy increases by only 0.5 percentage points (from 70.25% to 70.75%). This is because the two-period method cannot precisely simulate market fluctuations. Therefore, it is the combination of GA and the multiperiod method that brings high accuracy.
Table 7. Forecasting performance of different input variables with various forecast models
Input variables in use are marked with v: technical indicators in stock market (20), macroeconomic indicators (21), technical indicators in futures market (20). Accuracy is the hit ratio (%).
Prediction model             Input variables  Multiperiod  Two-period
Proposed model               v v v            79           73
                             v v              77           70
                             v v              74           71
                             v v              70           69
                             Average          75           70.75
SVM with fuzzy model         v v v            73           73
                             v v              72           70
                             v v              70           69
                             v v              69           69
                             Average          71           70.25
SVM                          v                67           65
                             v                69           66
                             v                58           56
                             Average          64.6         62.3
ANN with fuzzy model and GA  v v v                         63
                             v v                           63
                             v v                           64
                             v v                           63
                             Average                       63.25
ANN with fuzzy model         v v v                         61
                             v v                           62
                             v v                           58
                             v v                           56
                             Average                       59.25
ANN                          v                             61
                             v                             62
                             v                             53
                             Average                       58.6
DA                           v                             53
                             v                             54
                             v                             51
                             Average                       52.6
BH                                                         50
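The averages reported for the proposed model and for the SVM with fuzzy model, and the gain attributed to GA, can be recomputed from the hit ratios in Table 7. The variable and function names below are illustrative.

```python
from statistics import mean

# Hit ratios (%) from Table 7 as (multiperiod, two-period) per input set
PROPOSED = [(79, 73), (77, 70), (74, 71), (70, 69)]
SVM_FUZZY = [(73, 73), (72, 70), (70, 69), (69, 69)]


def column_means(rows):
    """Average each column of a list of (multiperiod, two-period) pairs."""
    multi, two = zip(*rows)
    return mean(multi), mean(two)


def ga_gain(with_ga, without_ga):
    """Percentage-point gain of the GA-adjusted model for each method."""
    multi_ga, two_ga = column_means(with_ga)
    multi_plain, two_plain = column_means(without_ga)
    return multi_ga - multi_plain, two_ga - two_plain


if __name__ == "__main__":
    print("Proposed model:", column_means(PROPOSED))
    print("SVM with fuzzy model:", column_means(SVM_FUZZY))
    print("GA gain (multiperiod, two-period):", ga_gain(PROPOSED, SVM_FUZZY))
```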
Comparisons in graph are shown in Figure 7 to Figure 9.
Fig. 7. Comparison between proposed model and SVM with fuzzy model under multiperiod method (input variable combinations: S+M+F, S+M, M+F, S+F; series: Proposed model, SVM with fuzzy model)
Fig. 8. Comparison among each model under two-period method and integrated different influence factors (input variable combinations: S+M+F, S+M, M+F, S+F; series: Proposed model, SVM with fuzzy model, ANN with fuzzy model and GA, ANN with fuzzy model)
Fig. 9. Comparison between each model under two-period method and single kind of influence factors (models: SVM, ANN, DA, BH; influence factors: S, M, F)
Chapter Five Conclusion
We propose a forecast model integrating fuzzy theory, GA and SVM to forecast movements of the stock market in Taiwan. The new dynamic fuzzy model proves effective not only in simulating market volatility but also in covering influence factors with different features. The integrated high-dimensional variables, together with the features of SVM, increase the forecast accuracy of the integrated model. A higher accuracy in forecasting stock market dynamism indicates that the forecast model better captures the internal mechanism of the market. The integrated forecast model in this study can therefore serve as a valuable reference for research on the internal mechanism of the stock market.
In this study, our main purpose is to design a forecast model that integrates various factors to deal with the dynamics of the stock market. However, owing to the complicated and dynamic nature of the stock market, merely estimating the influence degree of each factor is not enough. In the real world, the factors may interact at every moment. In the future, the interaction between factors should be studied in more detail.
References
1. Black, A.J., Mcmillan, D.G.: Non-linear Predictability of Value and Growth Stocks and Economic Activity. Journal of Business Finance & Accounting, Vol.31. Blackwell Publishing, Oxford(2004) 439-474
2. Lo, A., Mamaysky, H., Wang, J.: Foundations of Technical Analysis: Computational Algorithm, Statistical Inference, and Empirical Implementation. The Journal of Finance, Vol. 55. Blackwell Publishing, Oxford(2000) 1705-1765
3. Lo, A.: The Adaptive Markets Hypothesis: Market Efficiency from an Evolutionary Perspective. Journal of Portfolio Management, Vol. 30. Institutional Investor, NewYork(2004) 15-44
4. Kuo, R.J.: A Decision Support System for The Stock Market through Integration of Fuzzy Neural Networks and Fuzzy Delphi. Applied Artificial Intelligence, Vol. 12. Taylor & Francis Group, Philadelphia(1998) 501-520
5. Armano, G., Murru, A., Roli, F.: Stock Market Prediction By A Mixture of Genetic-Neural Experts. International Journal of Pattern Recognition and Artificial Intelligence, Vol. 16. World Scientific Publishing, Singapore(2002) 501-526
6. Matilla-Garcia, G., Arguellu, C.: A Hybrid Approach based on Neural Networks and Genetic Algorithm to the Study of Profitability in the Spanish Stock Market. Applied Economics Letters, Vol. 12. Routledge part of the Taylor & Francis Group, Philadelphia(2005) 303-308
7. Oh, K.J., Kim, K.J.: Analyzing Stock Market Tick Data Using Piecewise Nonlinear Model. Expert Systems with Application, Vol. 22. Elsevier Science, Oxford(2002) 249-255
8. Azeem, M.F., Hanmandlu, M., Ahmad, N.: Evolutive Learning Algorithm for Fuzzy Modeling. International Journal of Smart Engineering System Design, Vol. 5. Taylor & Francis Group, Philadelphia(2003) 205-224
9. Huang, W., Nakamori, Y., Wang, S.Y.: Forecasting Stock Market Movement Direction with Support Vector Machine. Computers and Operations Research, Vol. 32. Elsevier Science, Oxford(2004) 2513-2522
10. Yu, L., Wang, S., Lai, K.K.: Mining Stock Market Tendency Using GA-Based Support Vector Machines. Lecture Notes in Computer Science, Vol. 3828. Springer-Verlag, Berlin Heidelberg(2005) 336-345
11. Schwert, G.W.: Why Does Stock Market Volatility Change Over Time. The Journal of Finance, Vol. 44. Blackwell Publishing, Oxford(1989) 1115-1167
12. Cristianini, N., Taylor, J.S.: An Introduction to Support Vector Machines. Cambridge University, New York(2000)
13. Noever, D., Baskaran, S.: Genetic Algorithms Trading on S&P 500. The Magazine of Artificial Intelligence in Finance, Vol. 1. Miller Freeman, San Francisco(1994) 41-50
14. Mahfoud, S., Mani, G.: Financial Forecasting Using Genetic Algorithms. Applied Artificial Intelligence, Vol. 10. Taylor & Francis Group, Philadelphia(1996) 543-565
15. Muhammad, A., King, G.A.: Foreign Exchange Market Forecasting Using Evolutionary Fuzzy Networked. Proc. IEEE 1997 Computational Intelligence for Financial Engineering, March 1997, 213-219
16. Kai, F., Xu, W.: Trading Neural Network with Genetic Algorithms for Forecasting the Stock Price Index. Proc. IEEE 1997 Int. Conf. Intelligent Processing Systems 1, Beijing China, Oct. 1997, 401-403
17. Vapnik V.N.: Statistical Learning Theory. New York: Wiley; 1998.
18. Vapnik V.N.: An Overview of Statistical Learning Theory. IEEE Transactions of Neural Networks, Vol. 10. IEEE(1999) 988–99.
19. Cao L.J., Tay, F.: Financial Forecasting Using Support Vector Machines. Neural Computing Applications, Vol. 10. (2001) 184–92
20. Tay, F., Cao L.J.: Application of Support Vector Machines in Financial Time Series Forecasting. Vol.29. Omega(2001) 309–17
21. Tay, F., Cao L.J.: A Comparative Study of Saliency Analysis and Genetic Algorithm for Feature Selection in Support Vector Machines. Vol.5. Intelligent Data Analysis(2001) 191–209
22. Tay F., Cao L.J.: Improved Financial Time Series Forecasting by Combining Support Vector Machines with Self-Organizing Feature Map. Vol.5. Intelligent Data Analysis (2001) 339–54
23. Tay F., Cao L.J.: Modified Support Vector Machines in Financial Time Series Forecasting. Vol.48. Neurocomputing(2002) 847–61
24. Min, J.H., Najand, M.: A Further Investigation of the Lead-Lag Relationship between the Spot Market and Stock Index Future: Early Evidence From Korea. Journal of Futures Markets, Vol. 19. John Wiley & Sons, New Jersey(1999) 217-232
25. Dickinson, D.G.: Stock Market Integration and Macroeconomic Fundamentals: An Empirical Analysis. Applied Financial Economics, Vol. 10. Routledge part of the Taylor & Francis Group, Philadelphia(2000) 261-276
26. Edwards, R. D.: Technical Analysis of Stock Trend. Boston, Mass. J. Magee Inc. New York(1992)
Appendix A Introduction to Fuzzy Theory
Fuzzy Sets
Fuzzy Set Theory was formalized by Professor Lotfi Zadeh at the University of California in 1965. What Zadeh proposed is very much a paradigm shift that first gained acceptance in the Far East, and its successful application has ensured its adoption around the world.
A paradigm is a set of rules and regulations which defines boundaries and tells us what to do to be successful in solving problems within these boundaries. For example the use of transistors instead of vacuum tubes is a paradigm shift - likewise the development of Fuzzy Set Theory from conventional bivalent set theory is a paradigm shift.
Bivalent Set Theory can be somewhat limiting if we wish to describe a 'humanistic' problem mathematically. For example, Fig 1 below illustrates bivalent sets to characterize the temperature of a room.[1]
The most obvious limiting feature of bivalent sets that can be seen clearly from the diagram is that they are mutually exclusive - it is not possible to have membership of more than one set ( opinion would widely vary as to whether 50 degrees Fahrenheit is 'cold' or 'cool' hence the expert knowledge we need to define our system is mathematically at odds with the humanistic world). Clearly, it is not accurate to define a