Extracting fuzzy relations in fuzzy time series model based on approximation concepts


Tung-Kuan Liu, Yeh-Peng Chen, Jyh-Horng Chou ⇑

Institute of Engineering Science and Technology, National Kaohsiung First University of Science and Technology, 1 University Road, Yenchao, Kaohsiung 824, Taiwan, ROC

Article info

Keywords: Fuzzy time series; Rough set theory; Data mining; Fuzzy rough time series; Approximation reasoning

Abstract

Deriving fuzzy relationships is an essential task in fuzzy time-series forecasting, and many studies have been devoted to discovering fuzzy relationships with less computational effort. In this paper, we also aim to improve the derivation of fuzzy relationships and compare the results with those of previous studies. The proposed model not only requires no prior knowledge or pre-review dataset to generate heuristic rules, but also effectively reduces computational effort by decreasing the number of fuzzy sets of linguistic variables. The rough set classifier is first introduced to discover fuzzy relationships when a time-invariant relation is derived. The empirical results show that the proposed model achieves an MSE (mean square error) of 79,040 and a MAPE (mean absolute percentage error) of 1.47%, and that, in accuracy and time complexity, it outperforms previous models and yields the best known result.

1. Introduction

Forecasting has always played an important role and will continue to do so. In the last few decades, a considerable number of approaches to various practical prediction problems have been presented, such as domain expert approaches, statistical approaches, time series methods and artificial intelligence approaches. Nevertheless, these approaches struggle to respond to increasingly complex real-world problems, for example when the quantity of historical data is too small for traditional statistics, or when the historical data are expressed in linguistic terms. To overcome these drawbacks, a fuzzy time series model has been proposed in the literature (Song & Chissom, 1993a, 1993b, 1994).

Based on this concept, a number of improved models have been reported, and they have been applied to various domains, such as enrollments, meteorology, stock markets, human resources and the tourism industry (Chen, 1996; Chen & Hwang, 2000; Huarng, 2001a, 2001b; Huarng, 2006; Hwang, Chen, & Lee, 1998; Lee, Chiu, & Lin, 2001; Song & Chissom, 1993a, 1993b; Singh, 2007; Tsaur, O Yang, & Wang, 2005; Wang & Hsu, 2008). These studies can be divided into four categories: (1) those that enhance the accuracy of the model by changing the interval length in the step that partitions the universe of discourse in Song and Chissom’s model; (2) those that establish fuzzy relationships with less effort and better performance; (3) those involving multi-variable models; and (4) those that forecast fuzzy time series datasets by different approaches, such as neural networks, in the defuzzification step.

Many excellent studies of the fuzzy time series model have been presented. In particular, we are interested in articles devoted to deriving fuzzy relationships with less computational effort. Chen (1996) presented a simplified approach that derives fuzzy relationships by arithmetic operations rather than max–min operations; it was not only simpler but also more accurate than Song and Chissom’s model.

Huarng (2001a, 2001b) integrated problem-specific heuristic knowledge, as a function, with Chen’s model to improve forecast accuracy. Tsaur et al. (2005) employed the concept of entropy to evaluate the degree of fuzziness and to determine the time t of the data; their approach was based on time-invariant fuzzy time series. Yu (2005) pointed out the recurrence issue that had previously been ignored, and argued that ignoring it might not properly reflect the importance of each individual fuzzy relationship; he therefore proposed a weighted model that reflects the importance of fuzzy relations with high frequency. Singh (2007) used a difference parameter as a fuzzy relationship for forecasting, and provided a crisp output of better accuracy while minimizing the time spent on the defuzzification procedure. Teoh et al. (2008) integrated the cumulative probability distribution approach (CPDA) and rough set theory into fuzzy time series (FTS) for stock market prediction. The CPDA method was used to determine the length of the partition intervals, and rough set theory was used to generate rules after the fuzzy relationships had been discovered. Rough set theory, however, is primarily an effective tool for attribute reduction, for removing redundant objects and for pattern recognition. Once the fuzzy relationships have been obtained, the rules can easily be generated without the assistance of rough set theory, because the knowledge table consists of only one condition attribute and one decision attribute; to some degree, the result may not differ greatly from Chen’s model (1996). Consequently, although rough set theory was employed in their model to generate rules, their study belongs to the category that enhances accuracy by changing the interval length in the partitioning step, rather than the category that establishes fuzzy relationships with less effort and better performance. Compared with previous studies, the model proposed in this paper not only needs no prior knowledge or pre-review dataset to generate heuristic rules, but also effectively reduces computational effort by decreasing the number of fuzzy sets of the linguistic variable.

⇑ Corresponding author. Tel.: +886 7 6011000, fax: +886 7 6011066.

E-mail addresses: [email protected], [email protected] (J.-H. Chou).

Expert Systems with Applications 38 (2011) 11624–11629

The remainder of this paper is organized as follows. In Sections 2 and 3, related work is reviewed, including the basic concept of fuzzy time series and current studies of rough sets. In Section 4, we propose an efficient method to enhance the fuzzy time series model. In Section 5, the enrollment data of the University of Alabama are used as an example to illustrate the performance of the model. In Section 6, we demonstrate how the proposed model forecasts better than previous models. The conclusions are discussed in Section 7.

2. Fuzzy relationship problems and methodologies

Two aspects of Song and Chissom’s model can be improved. One is the defuzzification step, because different interpretations may lead to different forecasting results (Song & Chissom, 1993a, 1993b); the other is the computational effort required to discover fuzzy relationships. Thus, a model with better performance needs to be developed. In Song and Chissom’s original two models, either max–min or min–max operators could be used to compute the fuzzy relations, but both models require huge computational effort. Moreover, if the number of intervals of the defined fuzzy sets is large, deriving the fuzzy relationships becomes more expensive because of the large number of relationships between consecutive fuzzy sets. Hence, many studies have focused on discovering fuzzy relations with less effort (Chen, 1996; Huarng, 2001a, 2001b; Huarng, 2006; Singh, 2007; Tsaur et al., 2005; Yu, 2005). Nevertheless, some of those models require prior knowledge or a pre-review dataset to generate heuristic rules, and some yield unsatisfying results because of one-to-many relationships (Chen, 1996; Huarng, 2001a, 2001b; Yu, 2005).

3. Related works

3.1. Basic concept of fuzzy time series

Song and Chissom (1993a, 1993b, 1994) proposed a novel fuzzy time series model built on the concept that a fuzzy relationship exists between two consecutive time series data, at times t and t + 1. If we can find this fuzzy relationship, we can forecast the fuzzy time series, which differs greatly from traditional time series concepts, such as Fuzzy ARIMA. The model comprises seven steps: (1) define the universe of discourse; (2) partition the universe into several intervals of equal length; (3) define fuzzy sets on the universe; (4) fuzzify the historical data; (5) obtain the historical knowledge from the fuzzified historical data (establishment of fuzzy relations); (6) calculate the forecasted outputs; (7) interpret the forecasted outputs. Song and Chissom’s previous studies defined fuzzy time series as follows:

Let $U$ be the universe of discourse, where $U = \{u_1, u_2, \ldots, u_n\}$. A fuzzy set $A_i$ of $U$ is defined by

$$A_i = f_{A_i}(u_1)/u_1 + f_{A_i}(u_2)/u_2 + \cdots + f_{A_i}(u_n)/u_n \qquad (1)$$

where $f_{A_i}$ is the membership function of fuzzy set $A_i$, $f_{A_i}: U \to [0, 1]$, $u_k$ is an element of fuzzy set $A_i$, and $f_{A_i}(u_k)$ is the degree of belongingness of $u_k$ to $A_i$, with $f_{A_i}(u_k) \in [0, 1]$ and $1 \le k \le n$.
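As an illustration of steps (1) to (4) and of Eq. (1), the following Python sketch partitions a universe of discourse into equal intervals and fuzzifies crisp observations. The membership grades (1 on the matching interval, 0.5 on adjacent intervals, 0 elsewhere), the function names, the zero-based indices, and the sample bounds and observations are assumptions made for this example; they are not taken from this paper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def partition_universe(low: float, high: float, n: int) -> List[Interval]:
    """Steps 1-2: split the universe [low, high] into n equal intervals."""
    width = (high - low) / n
    return [(low + k * width, low + (k + 1) * width) for k in range(n)]


def membership(i: int, k: int) -> float:
    """Step 3 (assumed grades): value of f_{A_i}(u_k), zero-based indices."""
    distance = abs(i - k)
    if distance == 0:
        return 1.0
    if distance == 1:
        return 0.5
    return 0.0


def fuzzify(value: float, intervals: List[Interval]) -> int:
    """Step 4: index i of the fuzzy set A_i with the highest grade for value."""
    n = len(intervals)
    for k, (lo, hi) in enumerate(intervals):
        is_last = k == n - 1
        # Intervals are half-open, except that the last one includes its upper end.
        if lo <= value < hi or (is_last and value == hi):
            return max(range(n), key=lambda i: membership(i, k))
    raise ValueError("value lies outside the universe of discourse")


# Hypothetical universe and observations, chosen only for illustration.
intervals = partition_universe(13000, 20000, 7)
labels = [fuzzify(v, intervals) for v in (13500, 15400, 20000)]
print(labels)  # [0, 2, 6]
```

With these assumed grades, each observation is simply assigned to the fuzzy set of the interval that contains it; a different membership function would only require changing `membership`.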

Definition 1 (Song and Chissom (1993a, 1993b)). Let $Y(t)$, $t = 0, 1, 2, \ldots, n$, a subset of $R$, be the universe of discourse on which the fuzzy sets $f_i(t)$, $i = 0, 1, 2, \ldots, n$, are defined. If $F(t)$ consists of $f_i(t)$, $i = 0, 1, 2, \ldots, n$, then $F(t)$ is defined as a fuzzy time series on $Y(t)$, $t = 0, 1, 2, \ldots, n$.

Definition 2 (Song and Chissom (1993a, 1993b)). If there exists a fuzzy relationship $R(t-1, t)$ such that $F(t) = F(t-1) \circ R(t-1, t)$, where $\circ$ represents an operator, then $F(t)$ is said to be caused by $F(t-1)$. When $F(t-1) = A_i$ and $F(t) = A_j$, the relationship between $F(t-1)$ and $F(t)$ is called a fuzzy logical relationship and is denoted by $A_i \to A_j$. In Song and Chissom’s model, the operator $\circ$ can be either max–min or min–max.

Definition 3 (Song and Chissom (1993a, 1993b)). Let $F(t)$ be a fuzzy time series. If $F(t)$ is caused by $F(t-1), F(t-2), \ldots,$ and $F(t-n)$, then this fuzzy logical relationship is represented by

$$F(t-n), \ldots, F(t-2), F(t-1) \to F(t) \qquad (2)$$

and it is called the nth order fuzzy time-series forecasting model.
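To make Definitions 2 and 3 concrete, the sketch below lists the fuzzy logical relationships found in a fuzzified sequence and then groups them by their left-hand side. The grouping step follows the idea of fuzzy logical relationship groups only in outline; the function names and the label sequence are hypothetical and serve purely as an illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Relationship = Tuple[Tuple[str, ...], str]


def extract_relationships(labels: Sequence[str], order: int = 1) -> List[Relationship]:
    """List every relationship F(t-n), ..., F(t-1) -> F(t) in the sequence."""
    return [
        (tuple(labels[t - order:t]), labels[t])
        for t in range(order, len(labels))
    ]


def group_relationships(relationships: List[Relationship]) -> Dict[Tuple[str, ...], List[str]]:
    """Collect the distinct right-hand sides observed for each left-hand side."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for lhs, rhs in relationships:
        if rhs not in groups[lhs]:
            groups[lhs].append(rhs)
    return dict(groups)


# Hypothetical fuzzified series, not data from the paper.
series = ["A1", "A1", "A2", "A2", "A3"]

first_order = extract_relationships(series, order=1)
print(group_relationships(first_order))
# {('A1',): ['A1', 'A2'], ('A2',): ['A2', 'A3']}

second_order = extract_relationships(series, order=2)
print(second_order)
# [(('A1', 'A1'), 'A2'), (('A1', 'A2'), 'A2'), (('A2', 'A2'), 'A3')]
```

The first-order groups in this toy series already show the one-to-many relationships mentioned in Section 2: both A1 and A2 lead to two different successors.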

3.2. Rough sets theory

Rough sets theory was first proposed by Pawlak in the early 1980s. It has been widely and successfully used in various domains such as information science, electrical and environmental engineering, medicine, economics, finance, business, social sciences, chemistry and decision analysis. It deals well with the classification analysis of data tables, removes redundant conditional attributes by means of two approximation concepts, called the lower and upper approximations, and relies on the original data only, requiring no external information. Some of the key concepts of rough sets theory are introduced briefly as follows.

Let $I = (U, A)$ be an information system, where $U$ is a non-empty finite set of objects, called the universe, and $A$ is a non-empty finite set of attributes such that $a : U \to V_a$ for every $a \in A$; $V_a$ is the value set of attribute $a$. In a decision system, $A = C \cup D$, where $C$ is the non-empty set of conditional attributes and $D$ is a non-empty set of decision attributes.

3.2.1. Indiscernibility

For every $P \subseteq A$ there is an equivalence relation $IND(P)$:

$$IND(P) = \{(x, y) \in U^2 \mid \forall a \in P,\ a(x) = a(y)\}. \qquad (3)$$

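If $(x, y) \in IND(P)$, then $x$ and $y$ are indiscernible by the attributes from $P$. The partition of $U$ generated by $IND(P)$ is denoted $U/P$ and can be calculated as follows:

$$U/P = \otimes \{a \in P : U/IND(\{a\})\} \qquad (4)$$

where $\otimes$ combines the single-attribute partitions by taking all non-empty pairwise intersections of their blocks.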
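A minimal Python sketch of the indiscernibility relation is given below: objects that agree on every attribute in $P$ fall into the same block of $U/P$. The table layout, the attribute names `prev` and `next`, and the four objects are hypothetical and chosen only to illustrate the computation.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Hashable, Iterable, Set

# An information table: object name -> {attribute name: attribute value}.
Table = Dict[str, Dict[str, Hashable]]


def partition(table: Table, attributes: Iterable[str]) -> Set[FrozenSet[str]]:
    """Return U/P: the equivalence classes of IND(P) for P = attributes."""
    attrs = sorted(attributes)
    classes = defaultdict(set)
    for obj, row in table.items():
        # Two objects are indiscernible iff they share this value signature.
        signature = tuple(row[a] for a in attrs)
        classes[signature].add(obj)
    return {frozenset(block) for block in classes.values()}


# Hypothetical decision table with one condition and one decision attribute.
table = {
    "x1": {"prev": "A1", "next": "A1"},
    "x2": {"prev": "A1", "next": "A2"},
    "x3": {"prev": "A2", "next": "A2"},
    "x4": {"prev": "A1", "next": "A2"},
}

print(partition(table, ["prev"]))
print(partition(table, ["prev", "next"]))
```

Grouping by a value signature gives the same partition as intersecting the single-attribute partitions, but it needs only one pass over the objects.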

3.2.2. Set approximation

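The equivalence classes of the $P$-indiscernibility relation are denoted $[x]_P$. Let $X \subseteq U$; the $P$-lower approximation, the $P$-upper approximation and the boundary region of $X$ can be defined as:

$$\underline{P}X = \{x \mid [x]_P \subseteq X\} \qquad (5)$$

$$\overline{P}X = \{x \mid [x]_P \cap X \neq \emptyset\} \qquad (6)$$

$$BN = \overline{P}X - \underline{P}X \qquad (7)$$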
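Equations (5) to (7) can be sketched in a few lines of Python, assuming the equivalence classes $[x]_P$ are already available as sets. The function name, the blocks and the target set $X$ below are hypothetical illustration values.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def approximations(
    classes: Iterable[FrozenSet[str]], target: Set[str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return the lower approximation, upper approximation and boundary of target."""
    lower: Set[str] = set()
    upper: Set[str] = set()
    for block in classes:
        if block <= target:   # [x]_P is contained in X, Eq. (5)
            lower |= block
        if block & target:    # [x]_P intersects X, Eq. (6)
            upper |= block
    return lower, upper, upper - lower  # boundary region, Eq. (7)


# Hypothetical equivalence classes and target set.
classes = [frozenset({"x1", "x2", "x4"}), frozenset({"x3"})]
X = {"x2", "x3", "x4"}

lower, upper, boundary = approximations(classes, X)
print(sorted(lower))     # ['x3']
print(sorted(upper))     # ['x1', 'x2', 'x3', 'x4']
print(sorted(boundary))  # ['x1', 'x2', 'x4']
```

Here x3 certainly belongs to X, while x1, x2 and x4 cannot be separated by the available attributes and therefore fall into the boundary region.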


The comparative study of MAPE (mean absolute percentage error) and MSE (mean square error), as shown in Table 5, and of time complexity, as shown in Table 6, demonstrates that forecasts made with the proposed method have higher accuracy than those of the other methods. The forecasted trends of the proposed method and the other methods are shown in Fig. 1.
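For reference, the two error measures can be computed as in the following sketch. It assumes the standard definitions of MSE and MAPE, since the exact formulas are not restated in this section, and the numbers are hypothetical values used only to show the arithmetic; they are not the enrollment data.

```python
from typing import Sequence


def mse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean square error: average of the squared forecast errors."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical values, not taken from Table 5.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

print(round(mse(actual, forecast), 2))   # 66.67
print(round(mape(actual, forecast), 2))  # 5.0
```

MSE is expressed in squared units of the observations, whereas MAPE is scale-free, which is why the two measures are reported together.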

7. Conclusion

In this paper, a novel approach based on fuzzy time series is proposed to forecast enrollments. The empirical analysis in Tables 5 and 6 reports a MAPE of 1.47% and an MSE of 79,040 for the proposed model, and compares its time complexity with that of previous models. The proposed model outperformed the previous models and yielded the best result. Moreover, the proposed model not only requires no prior knowledge or pre-review dataset to generate heuristic rules, but also effectively reduces computational effort.

However, one limitation arises when applying the proposed model: when the observations show an obvious growing trend or move beyond the original scale of the universe of discourse, our experience shows that the algorithm must be re-run. We think this limitation also exists in any forecasting model that works as a pattern recognizer.

In this paper, we have applied the proposed model to time-invariant observations with linguistic values. We believe that the proposed model can also be used with time-variant observations, and future efforts will focus on this aspect.

Acknowledgement

This work was in part supported by the National Science Council, Taiwan, Republic of China, under Grant No. NSC 99-2221-E327-043-MY3.

References

Chen, S. M. (1996). Forecasting enrollments based on fuzzy time series. Fuzzy Sets and Systems, 81(3), 311–319.

Chen, S. M., & Hwang, J. R. (2000). Temperature prediction using fuzzy time series. IEEE Transactions on Systems, Man, and Cybernetics – Part B, 30(2), 263–275.

Huarng, K. (2001a). Heuristic models of fuzzy time series for forecasting. Fuzzy Sets and Systems, 123, 369–386.

Huarng, K. (2001b). Effective length of intervals to improve forecasting in fuzzy time series. Fuzzy Sets and Systems, 123, 387–394.

Huarng, K. (2006). Ratio-based lengths of intervals to improve fuzzy time series forecasting. IEEE Transactions on Systems, Man, and Cybernetics – Part B, 36(2), 328–340.

Hwang, J. R., Chen, S. M., & Lee, C. H. (1998). Handling forecasting problems using fuzzy time series. Fuzzy Sets and Systems, 100, 217–228.

Lee, T. S., Chiu, C. C., & Lin, F. C. (2001). Prediction of the unemployment rate using fuzzy time series with Box–Jenkins methodology. International Journal of Fuzzy Systems, 3(4), 577–585.

Singh, S. R. (2007). A simple method of forecasting based on fuzzy time series. Applied Mathematics and Computation, 186, 330–339.

Song, Q., & Chissom, B. S. (1993a). Fuzzy time series and its models. Fuzzy Sets and Systems, 54, 269–277.

Song, Q., & Chissom, B. S. (1993b). Forecasting enrollments with fuzzy time series – Part I. Fuzzy Sets and Systems, 54, 1–9.

Song, Q., & Chissom, B. S. (1994). Forecasting enrollments with fuzzy time series – Part II. Fuzzy Sets and Systems, 62, 1–8.

Teoh, H. J., Cheng, C. H., Chu, H. H., & Chen, J. S. (2008). Fuzzy time series model based on probabilistic approach and rough set rule induction for empirical research in stock markets. Data and Knowledge Engineering, 67, 103–117.

Tsaur, R. C., O Yang, J. C., & Wang, H. F. (2005). Fuzzy relation analysis in fuzzy time series model. Computers and Mathematics with Applications, 49, 539–548.

Wang, C. H., & Hsu, L. C. (2008). Constructing and applying an improved fuzzy time series model: Taking the tourism industry for example. Expert Systems with Applications, 34, 2732–2738.

Yu, H. K. (2005). Weighted fuzzy time series models for TAIEX forecasting. Physica A, 349, 609–624.

Table 6

Comparison of the time complexity of different models.

| | S&C time-invariant | S&C time-variant | Singh (2007) | Chen (1996) | The proposed model |
|---|---|---|---|---|---|
| Time complexity | $O(kn^2)$ | $O(kn^2)$ | $O(n)$ | $O(p)$ | $O(r)$ |

Note: k denotes the number of fuzzy logical relationships, n denotes the number of elements in the universe of discourse, p denotes the number of fuzzy logical relationship groups, and r denotes the number of rules.

Fig. 1. Actual enrollments and forecasted enrollments by different models.