An Enhanced HMM-Based Forecasting Model for Fuzzy Time Series
Yi-Chung Cheng1 Pei-Chih Chen2 Chih-Chuan Chen3 Hui-Chi Chuang4 Sheng-Tun Li4
1Department of International Business Management, Tainan University of Technology, Taiwan, R.O.C.
2Department of Product Design, Tainan University of Technology, Taiwan, R.O.C.
3Department of Leisure and Information Management, Taiwan Shoufu University, Taiwan, R.O.C.
4 Institute of Information Management, National Cheng Kung University, Taiwan, R.O.C.
Abstract
A fast and accurate forecasting method helps decision makers choose appropriate strategies. Zadeh defined the fuzzy set in 1965. Song and Chissom proposed the definition and the forecasting framework of fuzzy time series in 1993. In 1994, Sullivan and Woodall first proposed a forecasting method that handles one factor with a probabilistic Markov model. In 2010, Li and Cheng proposed a stochastic hidden Markov model that considers two factors. However, an event can be affected by many factors. In this paper, we present a multi-factor HMM-based forecasting model that utilizes more factors to obtain better forecasting accuracy.
Keywords: fuzzy time series, forecasting, hidden Markov model (HMM)
1. Introduction
In the big data era, an efficient forecasting model is very important for decision makers in enterprises and governments. Time series forecasting is an interesting and important research topic. Traditional time series analysis is a well-developed statistical method, but it cannot deal with vague, linguistic, or uncertain data. Zadeh (1965) proposed fuzzy set theory, which is close to human thinking and can describe vague and linguistic variables.
For time series, an observed uncertain value can be modeled as a fuzzy variable, which yields a so-called fuzzy time series (FTS) (Möller and Reuter, 2007). The term FTS has been used with different meanings (Möller and Reuter, 2008): (1) time series with uncertain single data (fuzzy data) at each point in time (Hareter, 2004a, 2004b); (2) time series with fuzzified real-valued single data at each point in time (Song and Chissom, 1993b); and (3) fuzzy time series based on a set of elementary finite time series and composed of several significant representative courses (Bocklisch and Pässler, 2000).
Based on fuzzy set theory, Song and Chissom (1993) first proposed fuzzy time series to deal with vague and uncertain data in time series, together with a five-step forecasting procedure: (1) define and partition the universe of discourse; (2) define fuzzy sets and fuzzify the historical data; (3) construct fuzzy relations; (4) forecast; and (5) defuzzify. Most fuzzy time series forecasting models are based on fuzzy logical relationships (Chen, 1996; Hwang, Chen, and Lee, 1998; Chen, 2002; Yu, 2005; Lee et al., 2006; Li and Cheng, 2007; Lee, Wang, and Chen, 2007; Cheng, Chen, and Wu, 2009; Wang and Chen, 2009; Wong, Bai, and Chu, 2010; Chen and Chen, 2011; Shah, 2012), which are easy to understand but not applicable to big data.
Sullivan and Woodall (1994) used a Markov matrix based on probability statistics to establish a one-factor, first-order fuzzy time series forecasting model. Li and Cheng (2010) proposed a stochastic hidden Markov model (HMM) that takes the frequency of relationships into consideration, but it solves only the two-factor problem. Hidden Markov models have been used extensively in areas such as speech recognition, stock forecasting (Hassan and Nath, 2005), electrical signal prediction, and image processing. However, a traditional HMM relates one hidden variable to one observable variable and cannot handle problems with more factors. In this paper, we extend the probabilistic HMM of Li and Cheng (2010) to forecasting problems with multiple factors. We use a real data set to verify the difference between forecasting with one observable variable and with multiple observable variables. Finally, we compare the forecasting accuracy with that of other models using MAE (Mean Absolute Error), PMAD (Percent Mean Absolute Deviation), MAPE (Mean Absolute Percentage Error), and RMSE (Root Mean Squared Error).
2. Preliminaries and Related Work
2.1. Fuzzy Set and Fuzzy Time Series
As technology progresses and problems grow more complex, precise values cannot describe naturally vague, linguistic, and uncertain information. Zadeh (1965) gave the following definition of a fuzzy set to describe such information:
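Definition 1. A fuzzy set $A$ of the universe of discourse $U = \{u_1, u_2, \ldots, u_n\}$ is characterized by a membership function $\mu_A : U \to [0, 1]$, which associates with each element $u_i$ of $U$ a number $\mu_A(u_i)$ in the interval $[0, 1]$ that represents the grade of membership of $u_i$ in $A$. The fuzzy set $A$ of $U$ will be denoted
$$A = \mu_A(u_1)/u_1 + \mu_A(u_2)/u_2 + \cdots + \mu_A(u_n)/u_n \tag{1}$$
where $+$ stands for union.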
Song and Chissom (1993b) then proposed the definition of fuzzy time series, a sequence of fuzzy sets ordered in time, as below:
Definition 2. Let $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$, a subset of $R$, be the universe of discourse on which fuzzy sets $f_i(t)$ $(i = 1, 2, \ldots)$ are defined, and let $F(t)$ be a collection of $f_i(t)$ $(i = 1, 2, \ldots)$. Then, $F(t)$ is called a fuzzy time series on $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$.
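A fuzzy time series is always assumed to have a relationship between time $t$ and $t-1$. The first-order fuzzy relation equation is defined as:
Definition 3. First-order fuzzy relation equation. Let $F(t)$ be a fuzzy time series. The relationship between $F(t)$ and $F(t-1)$ is denoted as below:
$$F(t) = F(t-1) \circ R(t, t-1) \tag{2}$$
where "$\circ$" is a composition operator and $R(t, t-1)$, which is composed of the relations $R_{ij}(t, t-1)$, is a first-order fuzzy relation, as shown below:
$$R(t, t-1) = \bigcup_{i,j} R_{ij}(t, t-1) \tag{3}$$
where $R_{ij}(t, t-1)$ is the fuzzy relation between the fuzzy set $f_j(t)$ in $F(t)$ and the fuzzy set $f_i(t-1)$ in $F(t-1)$.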
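To illustrate how a fuzzy relation carries the fuzzy value at time t-1 to time t, the following minimal sketch assumes the max-min composition, which is the operator commonly used with this framework. The function name `max_min_compose` and all numbers are hypothetical and serve only as an illustration.

```python
def max_min_compose(f_prev, relation):
    """Max-min composition of a fuzzy vector with a fuzzy relation matrix.

    f_prev   : membership degrees of F(t-1), one per fuzzy set
    relation : square matrix R(t, t-1) of relation degrees (assumed given)
    """
    n_cols = len(relation[0])
    return [
        max(min(f_prev[i], relation[i][j]) for i in range(len(f_prev)))
        for j in range(n_cols)
    ]


# Hypothetical membership degrees and relation matrix.
f_prev = [0.5, 1.0, 0.5]
R = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
f_next = max_min_compose(f_prev, R)
print(f_next)  # [0.5, 1.0, 0.5]
```

For each target fuzzy set, the composition takes the strongest path through the relation, where the strength of a path is its weakest link.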
Song and Chissom (1993b) first proposed a complete fuzzy time series forecasting model and divided it into five steps. Nevertheless, many details of this architecture are still worthy of further exploration. Therefore, many studies follow the framework of Song and Chissom in order to obtain better forecasting accuracy.
The five steps are: (1) defining the universe of discourse and partitioning it into several intervals, (2) defining fuzzy sets and fuzzifying the historical data, (3) constructing fuzzy relations, (4) forecasting, and (5) defuzzification. In step (3), most papers use fuzzy logical relationships to construct the fuzzy relation and attempt to improve the forecasting accuracy. Fuzzy logical relationships are easy to understand, but their computational complexity is high. Among probabilistic forecasting models for fuzzy time series, on the other hand, only Sullivan and Woodall (1994), who used a Markov matrix, and Li and Cheng (2010), who used an HMM, have addressed the forecasting problem, for one and two factors, respectively.
2.2. Hidden Markov Model
The hidden Markov model (HMM) is a statistical model used to deal with symbol or signal sequences. Three fundamental problems must be solved for an HMM:
(1) Evaluation Problem: Given the model λ=(π,A,B) and a sequence of observations Y, find P(Y|λ). The problem is how to determine the likelihood of the observed sequence Y given the model.
(2) Decoding Problem: Given λ=(π,A,B) and an observation sequence Y, find an optimal state sequence for the underlying Markov process. In other words, we want to uncover the hidden part of the hidden Markov model that best explains the observations.
(3) Learning Problem: Given an observation sequence Y and the dimensions N and M, find the model λ=(π,A,B) that maximizes the probability of Y. This can be viewed as training a model to best fit the observed data. Alternatively, we can view this as a (discrete) hill climb on the parameter space represented by π, A, and B.
There are many established algorithms for solving these problems. The first problem is the main concern of this paper.
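An HMM is composed of two kinds of states and three probability matrices. The two kinds of states are hidden states and observable states. The hidden state set is defined as $S = \{s_1, s_2, \ldots, s_n\}$ and the observable state set is $V = \{v_1, v_2, \ldots, v_m\}$. The hidden states are probabilistically related to the observable states, and $n$ and $m$ are the numbers of hidden states and observable states, respectively. The three probability matrices describe the relation between hidden and observable states and are normally represented as $\lambda = (\pi, A, B)$, where $\pi$ is the initial state vector, $A$ is the hidden state transition matrix, and $B$ is the confusion matrix between hidden and observable states. Writing $x_t$ and $y_t$ for the hidden and observable states at time $t$, the matrices $\pi$, $A$, and $B$ can be defined mathematically as follows:
$$\pi = [\pi_i], \; \pi_i = P(x_1 = s_i), \qquad A = [a_{ij}], \; a_{ij} = P(x_{t+1} = s_j \mid x_t = s_i), \qquad B = [b_{ij}], \; b_{ij} = P(y_t = v_j \mid x_t = s_i)$$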
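For a hidden state sequence $X = (x_1, \ldots, x_T)$ and an observation sequence $Y = (y_1, \ldots, y_T)$, writing $b_{x_t}(y_t)$ for the probability of observing $y_t$ in hidden state $x_t$ and $a_{x_{t-1} x_t}$ for the probability of moving from $x_{t-1}$ to $x_t$, we have
$$P(Y \mid X, \lambda) = \prod_{t=1}^{T} b_{x_t}(y_t) \tag{4}$$
$$P(X \mid \lambda) = \pi_{x_1} \prod_{t=2}^{T} a_{x_{t-1} x_t} \tag{5}$$
The probability $P(Y \mid \lambda)$ is the sum of $P(Y, X \mid \lambda)$ over all possible state sequences, so we obtain:
$$P(Y, X \mid \lambda) = P(Y \mid X, \lambda)\, P(X \mid \lambda) \tag{6}$$
$$P(Y \mid \lambda) = \sum_{X} P(Y, X \mid \lambda) \tag{7}$$
$$P(Y \mid \lambda) = \sum_{X} \pi_{x_1} b_{x_1}(y_1) \prod_{t=2}^{T} a_{x_{t-1} x_t}\, b_{x_t}(y_t) \tag{8}$$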
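The evaluation problem can be sketched in two ways: by summing the joint probability over every hidden state sequence, and by the standard forward recursion, which gives the same value at much lower cost. The sketch below is illustrative only; the function names and the numerical values of π, A, and B are assumptions and are not taken from the experiment in this paper.

```python
from itertools import product


def likelihood_brute_force(pi, A, B, obs):
    """P(Y | lambda) as a sum over all hidden state sequences."""
    n = len(pi)
    total = 0.0
    for states in product(range(n), repeat=len(obs)):
        # Joint probability of this state sequence and the observations.
        p = pi[states[0]] * B[states[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[states[t - 1]][states[t]] * B[states[t]][obs[t]]
        total += p
    return total


def likelihood_forward(pi, A, B, obs):
    """P(Y | lambda) computed with the forward (dynamic programming) recursion."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for y in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][y]
            for j in range(n)
        ]
    return sum(alpha)


# Hypothetical two-state model with three observable symbols.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.1, 0.4, 0.5], [0.7, 0.2, 0.1]]
obs = [0, 1, 0, 2]

p_slow = likelihood_brute_force(pi, A, B, obs)
p_fast = likelihood_forward(pi, A, B, obs)
print(abs(p_slow - p_fast) < 1e-12)  # True
```

The brute-force sum visits n^T state sequences, whereas the forward recursion needs on the order of n^2 T multiplications, which is why dynamic programming is preferred in practice.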
3. Model Development
The novel forecasting model also follows the forecasting framework of Song and Chissom (1993b). We are given a fuzzy time series of the hidden state, $F(t)$, and of $k$ observable variables, $G^1(t), G^2(t), \ldots, G^k(t)$, where $F(t)$ has $n$ states and $G^i(t)$ has $m_i$ states, $i = 1, 2, \ldots, k$.
(9)
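Step (1). Defining and partitioning the universe of discourse. For the hidden variable $F$ and the observable variables $G^1, G^2, \ldots, G^k$, the universes of discourse are defined as follows:
$$U^{F} = [D^{F}_{\min} - D^{F}_{1},\; D^{F}_{\max} + D^{F}_{2}], \qquad U^{G^i} = [D^{G^i}_{\min} - D^{G^i}_{1},\; D^{G^i}_{\max} + D^{G^i}_{2}], \quad i = 1, 2, \ldots, k \tag{10}$$
where $D^{F}_{\min}$, $D^{G^1}_{\min}$, $D^{G^2}_{\min}$, ..., $D^{G^k}_{\min}$ and $D^{F}_{\max}$, $D^{G^1}_{\max}$, $D^{G^2}_{\max}$, ..., $D^{G^k}_{\max}$ are the minimum and the maximum values in the training data set, and $D^{F}_{1}$, $D^{G^1}_{1}$, $D^{G^2}_{1}$, ..., $D^{G^k}_{1}$ and $D^{F}_{2}$, $D^{G^1}_{2}$, $D^{G^2}_{2}$, ..., $D^{G^k}_{2}$ are two proper positive integers for each variable. The popular equal-length method is used to define the interval lengths as follows:
$$l^{F} = \frac{(D^{F}_{\max} + D^{F}_{2}) - (D^{F}_{\min} - D^{F}_{1})}{n}, \qquad l^{G^i} = \frac{(D^{G^i}_{\max} + D^{G^i}_{2}) - (D^{G^i}_{\min} - D^{G^i}_{1})}{m_i} \tag{11}$$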
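Step (1) can be sketched as follows for a single variable. The function name `partition_universe`, the margin arguments, and the sample temperatures are hypothetical; the same routine would be applied separately to the hidden variable and to each observable variable.

```python
def partition_universe(data, margin_low, margin_high, n_intervals):
    """Split [min - margin_low, max + margin_high] into equal-length intervals."""
    lower = min(data) - margin_low
    upper = max(data) + margin_high
    length = (upper - lower) / n_intervals
    return [
        (lower + i * length, lower + (i + 1) * length)
        for i in range(n_intervals)
    ]


# Hypothetical monthly average temperatures (not the Alishan data).
temps = [6.0, 7.9, 10.4, 13.1, 14.0]
intervals = partition_universe(temps, margin_low=1, margin_high=1, n_intervals=5)
print(intervals)
# [(5.0, 7.0), (7.0, 9.0), (9.0, 11.0), (11.0, 13.0), (13.0, 15.0)]
```

The two margins widen the observed range slightly so that test data just outside the training range still fall inside the universe of discourse.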
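Step (2). Defining the fuzzy sets and fuzzifying the time series. Given a traditional crisp time series, one needs a fuzzification procedure to obtain the corresponding fuzzy time series. For the hidden states, $n$ fuzzy sets $F_1, F_2, \ldots, F_n$ can be defined on the intervals $u^{F}_1, u^{F}_2, \ldots, u^{F}_n$ of the hidden variable's universe of discourse using general membership functions, as expressed below:
$$F_j = \mu_{F_j}(u^{F}_1)/u^{F}_1 + \mu_{F_j}(u^{F}_2)/u^{F}_2 + \cdots + \mu_{F_j}(u^{F}_n)/u^{F}_n, \quad j = 1, 2, \ldots, n \tag{12}$$
where $\mu_{F_j}(u^{F}_r)$ is the membership degree of $u^{F}_r$ belonging to $F_j$.
For the $k$ observable variables, fuzzy sets $G^i_1$, $G^i_2$, ..., and $G^i_{m_i}$ can be defined on the intervals $u^{G^i}_1, \ldots, u^{G^i}_{m_i}$ of the $i$th observable variable, $i = 1, 2, \ldots, k$, as expressed below:
$$G^i_j = \mu_{G^i_j}(u^{G^i}_1)/u^{G^i}_1 + \mu_{G^i_j}(u^{G^i}_2)/u^{G^i}_2 + \cdots + \mu_{G^i_j}(u^{G^i}_{m_i})/u^{G^i}_{m_i}, \quad j = 1, 2, \ldots, m_i \tag{13}$$
where $\mu_{G^i_j}(u^{G^i}_r)$ is the membership degree of $u^{G^i}_r$ belonging to $G^i_j$.
A hidden historical datum is then fuzzified as the fuzzy set $F_j$ for which the membership degree of the interval containing the datum is maximal.
The observation data can be fuzzified in the same way as the hidden variable.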
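The fuzzification in Step (2) can be sketched as below. We assume, purely for illustration, that each fuzzy set has membership 1 on its own interval, 0.5 on the neighboring intervals, and 0 elsewhere, so that the maximal membership is attained in the interval that contains the datum. The function names and sample data are hypothetical.

```python
def membership_row(j, n_intervals):
    """Membership degrees of fuzzy set j over all intervals (assumed shape)."""
    return [
        1.0 if i == j else 0.5 if abs(i - j) == 1 else 0.0
        for i in range(n_intervals)
    ]


def fuzzify(value, lower, length, n_intervals):
    """Index of the fuzzy set with maximal membership for a crisp value.

    With the membership shape above this is the interval containing the
    value; values on or beyond the upper bound are clamped to the last set.
    """
    idx = int((value - lower) // length)
    return max(0, min(n_intervals - 1, idx))


# Hypothetical universe [5, 15] split into five intervals of length 2.
temps = [6.0, 7.9, 10.4, 13.1, 14.0]
states = [fuzzify(v, lower=5.0, length=2.0, n_intervals=5) for v in temps]
print(states)                # [0, 1, 2, 4, 4]
print(membership_row(1, 5))  # [0.5, 1.0, 0.5, 0.0, 0.0]
```

The resulting list of state indices is the fuzzy time series used to count transitions and co-occurrences in the next step.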
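Step (3). Constructing an HMM $\lambda = (\pi, A, B^1, \ldots, B^k)$. In what follows, $s_i$ denotes the $i$th hidden state and $v^l_j$ denotes the $j$th state of the $l$th observable variable.
First, the initial state vector $\pi$ is set to be a $1 \times n$ matrix, defined as:
$$\pi = [\pi_i], \quad \pi_i = \frac{c_i}{\sum_{j=1}^{n} c_j} \tag{14}$$
where $c_i$ is the number of data whose initial states are $s_i$ and $1 \le i \le n$.
Next, the state transition matrix $A$ is an $n \times n$ matrix defined as below:
$$A = [a_{ij}], \quad a_{ij} = \frac{c_{ij}}{\sum_{j'=1}^{n} c_{ij'}} \tag{15}$$
with $1 \le i \le n$ and $1 \le j \le n$, where $c_{ij}$ denotes the number of data whose hidden state is $s_i$ at time $t-1$ and $s_j$ at time $t$.
Finally, the confusion matrix $B^l$ of the $l$th observable variable is an $n \times m_l$ matrix represented as follows:
$$B^l = [b^l_{ij}], \quad b^l_{ij} = \frac{d^l_{ij}}{\sum_{j'=1}^{m_l} d^l_{ij'}} \tag{16}$$
with $1 \le i \le n$ and $1 \le j \le m_l$, $1 \le l \le k$, where $d^l_{ij}$ means the number of data whose hidden state is $s_i$ at time $t$ and whose $l$th observable state is $v^l_j$ at time $t$.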
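Therefore, the multiple-observation HMM can be characterized by the following matrices:
$$\lambda = (\pi, A, B^1, \ldots, B^k), \quad \text{where } \pi = [\pi_i], \; A = [a_{ij}], \; B^l = [b^l_{ij}], \; l = 1, 2, \ldots, k$$
Here $\pi$ is the vector of initial state probabilities, $A$ is the state transition matrix, which provides information about the relation between two contiguous hidden states, and $B^l$ is the confusion matrix, which gives the relation between an observation $v$ of the $l$th observable variable and the hidden state.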
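The frequency-based construction of Step (3) can be sketched as follows. The sketch assumes that the initial state vector is the relative frequency of each hidden state in the training data, which is one possible reading of the definition above; the function names and the toy series are hypothetical.

```python
def normalize(counts):
    """Turn a list of counts into relative frequencies (zeros if no data)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def estimate_hmm(hidden, observed, n, ms):
    """Estimate (pi, A, [B^1..B^k]) from fuzzified training series by counting.

    hidden   : list of hidden state indices over time
    observed : list of k lists of observable state indices (same length)
    n, ms    : number of hidden states and list of observable state counts
    """
    pi_counts = [0] * n
    for s in hidden:
        pi_counts[s] += 1

    # Transitions between consecutive hidden states.
    a_counts = [[0] * n for _ in range(n)]
    for prev, cur in zip(hidden, hidden[1:]):
        a_counts[prev][cur] += 1

    # One co-occurrence table per observable variable.
    b_counts = [[[0] * m for _ in range(n)] for m in ms]
    for l, series in enumerate(observed):
        for s, v in zip(hidden, series):
            b_counts[l][s][v] += 1

    pi = normalize(pi_counts)
    A = [normalize(row) for row in a_counts]
    Bs = [[normalize(row) for row in table] for table in b_counts]
    return pi, A, Bs


# Toy fuzzified series: two hidden states, one observable with two states.
hidden = [0, 0, 1, 1, 0]
humidity = [0, 1, 1, 1, 0]
pi, A, Bs = estimate_hmm(hidden, [humidity], n=2, ms=[2])
print(pi)     # [0.6, 0.4]
print(A)      # [[0.5, 0.5], [0.5, 0.5]]
print(Bs[0])  # second row is [0.0, 1.0]
```

Each additional observable variable only adds one more confusion matrix, so the model size grows linearly with the number of factors.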
Step (4). Forecasting. Many algorithms can compute the probability of an observation sequence, and the next state can be estimated by taking the maximal probability. Because this study focuses only on forecasting, the proposed method uses dynamic programming to calculate the maximum likelihood.
Based on the dynamic programming method, we construct the following equation:
According to the notation of our study, we then rewrite the model as follows:
(17)
Therefore, a particular HMM can be characterized by
and .
We thus obtain the probability of each hidden state given the observations. By following the sequence of maximal probabilities, we obtain the forecast sequence of hidden states.
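As mentioned before, the probability of $F(t)$ is determined by $F(t-1)$ and $G^1(t), G^2(t), \ldots, G^k(t)$. One has to compute the probabilities of all possible hidden states occurring at time $t$, $t \ge 2$, by considering the transition influence of the previous hidden state $F(t-1)$ and the observation states $G^1(t), \ldots, G^k(t)$. Such an influential relation can be represented by a function $f(x, y_1, \ldots, y_k)$, defined as below:
$$f(x, y_1, \ldots, y_k) = A(x,:) \mathbin{.*} B^1(:,y_1)^{T} \mathbin{.*} \cdots \mathbin{.*} B^k(:,y_k)^{T} \tag{18}$$
where $x$ is the index of the hidden state at time $t-1$, $y_l$ is the index of the state of the $l$th observable variable at time $t$, $A(x,:)$ is the $x$th row of the state transition matrix $A$, and $B^l(:,y_l)$ is the $y_l$th column of the confusion matrix $B^l$. The operator '.*' denotes element-wise multiplication, a notation conventionally used in Matlab, which multiplies two matrices by multiplying all of the corresponding elements.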
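A single forecasting step can be sketched as below: the row of the transition matrix for the previous hidden state is multiplied element-wise by the matching column of each confusion matrix, and the hidden state with the largest score is taken as the forecast. The sketch assumes that the observable factors are independent given the hidden state; the function names and matrices are hypothetical.

```python
def next_state_scores(A, Bs, prev_state, observations):
    """Element-wise product of A[prev_state, :] with one column per B^l."""
    scores = list(A[prev_state])
    for B, y in zip(Bs, observations):
        scores = [s * B[j][y] for j, s in enumerate(scores)]
    return scores


def forecast_state(A, Bs, prev_state, observations):
    """Index of the hidden state with the maximal score."""
    scores = next_state_scores(A, Bs, prev_state, observations)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical model: two hidden states, two observable variables.
A = [[0.7, 0.3], [0.4, 0.6]]
B1 = [[0.9, 0.1], [0.2, 0.8]]
B2 = [[0.5, 0.5], [0.1, 0.9]]

scores = next_state_scores(A, [B1, B2], prev_state=0, observations=(1, 1))
best = forecast_state(A, [B1, B2], prev_state=0, observations=(1, 1))
print(best)  # 1
```

In this toy case the transition alone favors staying in state 0, but both observations point to state 1, so the combined score selects state 1.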
However, the estimated sequence is a fuzzy time series, so the outcome must finally be defuzzified.
Step (5). Defuzzification. There are many defuzzification methods; we use the most popular one, namely the fuzzy mean method, which is expressed as
$$\hat{y} = \frac{\sum_{i=1}^{n} \mu_i \bar{u}_i}{\sum_{i=1}^{n} \mu_i} \tag{19}$$
where $\hat{y}$ is the defuzzified forecast, $n$ is the number of fuzzy sets, $\mu_i$ is the $i$th membership degree, and $\bar{u}_i$ is the midpoint of the interval corresponding to the $i$th linguistic value.
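The fuzzy mean can be sketched as a membership-weighted average of interval midpoints. The function name `fuzzy_mean` and the numbers are hypothetical.

```python
def fuzzy_mean(memberships, midpoints):
    """Membership-weighted average of the interval midpoints."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("at least one membership degree must be positive")
    return sum(m * c for m, c in zip(memberships, midpoints)) / total


# Hypothetical forecast: fuzzy set 3 with neighbors at degree 0.5.
memberships = [0.0, 0.5, 1.0, 0.5, 0.0]
midpoints = [6.0, 8.0, 10.0, 12.0, 14.0]
crisp = fuzzy_mean(memberships, midpoints)
print(crisp)  # 10.0
```

When the memberships are symmetric around one interval, as here, the result is simply the midpoint of that interval.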
4. Experiment and Results Analysis
This section demonstrates the application of the proposed method and compares the accuracy of its forecasts with the results obtained using one factor only. To evaluate the proposed model, we use four indices to assess its forecasting ability: MAE (Mean Absolute Error), PMAD (Percent Mean Absolute Deviation), MAPE (Mean Absolute Percentage Error), and RMSE (Root Mean Squared Error). These are introduced below:
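(1) MAE (Mean Absolute Error)
$$\mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} \left| y_t - \hat{y}_t \right| \tag{20}$$
(2) PMAD (Percent Mean Absolute Deviation)
$$\mathrm{PMAD} = \frac{\sum_{t=1}^{N} \left| y_t - \hat{y}_t \right|}{\sum_{t=1}^{N} \left| y_t \right|} \tag{21}$$
(3) MAPE (Mean Absolute Percentage Error)
$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \tag{22}$$
(4) RMSE (Root Mean Squared Error)
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( y_t - \hat{y}_t \right)^2} \tag{23}$$
where $y_t$ is the actual value, $\hat{y}_t$ is the forecast value, and $N$ is the number of forecasts.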
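The four indices can be sketched with their commonly used definitions, which we assume here; the function names and the two-point example are hypothetical and unrelated to the Alishan results.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def pmad(actual, forecast):
    """Sum of absolute errors divided by the sum of absolute actual values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(abs(a) for a in actual)


def mape(actual, forecast):
    """Mean of absolute errors relative to each actual value (as a fraction)."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


actual = [10.0, 20.0]
forecast = [12.0, 18.0]
print(mae(actual, forecast))   # 2.0
print(rmse(actual, forecast))  # 2.0
```

PMAD and MAPE differ in where the normalization happens: PMAD divides the total error by the total of the actual values, while MAPE averages the per-point relative errors, so MAPE is undefined when any actual value is zero.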
The data set consists of monthly weather data for Alishan from 2004 to 2013, containing one hidden factor, average temperature, and three observable factors: (1) average relative humidity level, (2) number of rainy days, and (3) total sunshine duration. The first seven years, from 2004 to 2010, are used for training, and the remaining three years, from 2011 to 2013, are used for testing. The forecasting results are displayed in Table 1 and Table 2:
Observable factor(s)       MAE        PMAD       MAPE       RMSE
All                        0.877112   0.092451   0.092451   1.350553
Average humidity level     1.23298    0.124322   0.124322   1.699685
Number of rainy days       0.992742   0.106911   0.106911   1.405663
Total sunshine duration    1.170567   0.117814   0.117814   1.577392
Table 1 The evaluation of Alishan weather with different factors
Model                   MAE      RMSE     PMAD
Proposed Model          1.0496   1.3937   0.0927
Chen (1996)             1.3159   1.6584   0.1145
Hsu et al. (2003)       1.2397   1.5269   0.1079
Li and Cheng (2007)     2.0443   2.7463   0.1779
Chen and Chen (2011)    1.0738   1.4074   0.0923
Table 2 The evaluation of Alishan weather
In Table 1, forecasting with all three observable factors yields lower errors on every index than forecasting with any single factor, which indicates that considering more factors gives a more precise forecast. In Table 2, the performance of the proposed model is compared with that of other forecasting models (Chen (1996), Hsu et al. (2003), Li and Cheng (2007), and Chen and Chen (2011)). The proposed model attains the lowest MAE and RMSE, and its PMAD (0.0927) is close to the best value (0.0923, Chen and Chen (2011)). These results support two points: first, forecasting with multiple observables is better than forecasting with a single observable; second, forecasting with the multi-observable HMM is competitive with or better than traditional fuzzy time series forecasting models, which construct the fuzzy relation from fuzzy logical relationships.
5. Conclusions
In fuzzy time series forecasting models, fuzzy relationships are constructed in two ways: by fuzzy logical rules or by probability. Fuzzy logical rules are easy to understand but suffer from high computational complexity. Previous probabilistic forecasting models construct the relationship with a Markov model or an HMM, and their shortcoming is that they deal with only one factor, or with one hidden variable and one observable variable. Consequently, they cannot forecast with multi-factor data and waste available information. The proposed model solves this problem and supports the claim that “predicting with more factors can improve the forecasting result”.
Three points deserve attention in future work. First, this model assumes that the observed factors are independent of one another, but some real data cannot satisfy this assumption. We can therefore consider the correlation coefficients between observed factors in order to exclude collinear factors. Second, a high-order HMM-based fuzzy time series forecasting model is sometimes necessary. Third, computational complexity is also an important issue. In the future, the high-order multi-factor HMM should be studied, collinearity should be excluded, and the computational complexity should be reduced.
References
[1] S. F. Bocklisch and M. Pässler, Fuzzy time series analysis, in: R. Hampel, M. Wagenknecht, N. Chaker (Eds.), Advances in Soft Computing - Fuzzy Control, Physica-Verlag, Heidelberg, pp. 331-345, 2000.
[2] S. Chen, Forecasting enrollments based on fuzzy time series, Fuzzy Sets and Systems, 81(3), 311–319, 1996.
[3] S.-M. Chen, Forecasting enrollments based on high-order fuzzy time series, Cybernetics and Systems, 33(1), 1–16, 2002.
[4] S.-M. Chen and C.-D. Chen, Handling forecasting problems based on high-order fuzzy logical relationships, Expert Systems with Applications, 38(4), 3857–3864, 2011.
[5] C.-H. Cheng, Y.-S. Chen, and Y.-L. Wu, Forecasting innovation diffusion of products using trend-weighted fuzzy time-series model, Expert Systems with Applications, 36(2), 1826–1832, 2009.
[6] D. Hareter, Time series analysis with non-precise data - Part I, presented at the 9th Special Conference Probability Mechanic Structure Reliability, Sandia Nat. Labs., Albuquerque, NM, 2004a.
[7] D. Hareter, Time series analysis with non-precise data - Part II, presented at the 9th Special Conference Probability Mechanic Structure Reliability, Sandia Nat. Labs., Albuquerque, NM, 2004b.
[8] M. Hassan and B. Nath, Stock market forecasting using hidden Markov model: a new approach, Design and Applications, 2005.
[9] K. Huarng and T. H.-K. Yu, The application of neural networks to forecast fuzzy time series, Physica A: Statistical Mechanics and its Applications, 363(2), 481–491, 2006.
[10] J.-R. Hwang, S.-M. Chen, and C.-H. Lee, Handling forecasting problems using fuzzy time series, Fuzzy Sets and Systems, 100(1), 217–228, 1998.
[11] L.-W. Lee, L.-H. Wang, and S.-M. Chen, Temperature prediction and TAIFEX forecasting based on fuzzy logical relationships and genetic algorithms, Expert Systems with Applications, 33(3), 539–550, 2007.
[12] L.-W. Lee, L.-H. Wang, S.-M. Chen, and Y.-H. Leu, Handling forecasting problems based on two-factors high-order fuzzy time series, IEEE Transactions on Fuzzy Systems, 14(3), 468–477, 2006.
[13] S.-T. Li and Y.-C. Cheng, Deterministic fuzzy time series model for forecasting enrollments, Computers and Mathematics with Applications, 53(12), 1904–1920, 2007.
[14] S.-T. Li and Y.-C. Cheng, A stochastic HMM-based forecasting model for fuzzy time series, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 40(5), 1255–1266, 2010.
[15] B. Möller and U. Reuter, Uncertainty Forecasting in Engineering, Berlin, Germany: Springer-Verlag, 2007.
[16] B. Möller and U. Reuter, Prediction of uncertain structural responses using fuzzy time series, Computers & Structures, 86, 1123–1139, 2008.
[17] M. Shah, Fuzzy based trend mapping and forecasting for time series data, Expert Systems with Applications, 39(7), 6351–6358, 2012.
[18] Q. Song and B. S. Chissom, Forecasting enrollments with fuzzy time series - Part I, Fuzzy Sets and Systems, 54(1), 1–9, 1993a.
[19] Q. Song and B. S. Chissom, Fuzzy time series and its models, Fuzzy Sets and Systems, 54(3), 269–277, 1993b.
[20] Q. Song and B. S. Chissom, Forecasting enrollments with fuzzy time series - Part II, Fuzzy Sets and Systems, 62(1), 1–8, 1994.
[21] M. Stamp, A revealing introduction to hidden Markov models, Department of Computer Science, San Jose State University, 1–20, 2004.
[22] J. Sullivan and W. Woodall, A comparison of fuzzy forecasting and Markov modeling, Fuzzy Sets and Systems, 64, 279–293, 1994.
[23] N.-Y. Wang and S.-M. Chen, Temperature prediction and TAIFEX forecasting based on automatic clustering techniques and two-factors high-order fuzzy time series, Expert Systems with Applications, 36(2), 2143–2154, 2009.
[24] Y. Wang, X. Hao, X. Zhu, and F. Ye, An approach of software fault detection based on HMM, in 2012 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering, 644–647, 2012.
[25] W.-K. Wong, E. Bai, and A. W.-C. Chu, Adaptive time-variant models for fuzzy-time-series forecasting, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 40(6), 1531–1542, 2010.
[26] H.-K. Yu, Weighted fuzzy time series models for TAIEX forecasting, Physica A: Statistical Mechanics and its Applications, 349(3), 609–624, 2005.