Improved MS_CMAC neural networks by integrating a simplified UFN model


DOI 10.1007/s11063-007-9067-4


Jiun-Chi Jan · Shih-Lin Hung

Received: 4 April 2007 / Accepted: 28 November 2007 / Published online: 13 December 2007 © Springer Science+Business Media, LLC. 2007

Abstract Macro_Structure_CMAC (MS_CMAC) is a variant of the CMAC neural network that is designed for modeling smooth functional mappings. The MS_CMAC learning strategy involves constructing virtual grid-distributed data points from random-distributed training data points, and then using the virtual data points to train a tree structure network that is composed of one-dimensional CMAC nodes. One disadvantage of the MS_CMAC is that the prediction errors near the boundary area are sometimes unexpectedly large. Another is that generating the virtual grid-distributed data points generally takes a long computational time. Therefore, this study develops an improved model by integrating an unsupervised fuzzy neural network (UFN) into the MS_CMAC to initialize the virtual grid-distributed data points systematically. Additionally, a new error feedback ratio function is adopted to speed up the MS_CMAC training. Several numerical problems are considered to test the improved MS_CMAC. The computed results indicate that a simplified UFN model can produce good initial values of the virtual grid-distributed data points, which accelerates MS_CMAC training. The MS_CMAC prediction is also improved by using the initialized virtual grid-distributed data points.

Keywords Virtual grid-distributed data points · A tree structure network · UFN

1 Introduction

The CMAC (cerebellar model articulation controller) [1] is a fast-learning artificial neural network. It is used mainly in the control domain [2,3] and has recently been applied to other fields such as civil engineering and web searching [4–6]. The MS_CMAC (macro structure CMAC) neural network [4] is a variant, modular CMAC model that is designed for modeling smooth functional mappings. The distinguishing feature of the MS_CMAC is a tree structure network composed of one-dimensional CMAC nodes, and the MS_CMAC implements a dimensional reduction technique to substantially decrease the number of training instances required.

J.-C. Jan (✉)
Department of Computer Science and Information Engineering, Ching Yun University, 229 Chien-Hsin Road, Jungli, 320 Taiwan, R.O.C.
e-mail: [email protected]

S.-L. Hung

The principle of CMAC is simple. Each input pattern is assigned a fixed number of weights; the CMAC output is the sum of the assigned weights; and the weights of neighboring input patterns mostly overlap. The initial version of CMAC adopts a constant basis function to perform locally weighted approximations of functions. With the constant basis function, the prediction error of a training instance is evenly distributed to the assigned weights, and the CMAC output is generally a linear approximation in a local zone. Studies of CMAC have focused on developing high-order CMAC learning algorithms to improve CMAC prediction, especially for smooth functional mapping problems. For example, Lane et al. [7] proposed using a B-Spline receptive field function to replace the constant basis function, and Chiang and Lin [8] applied a Gaussian function as the basis function for the CMAC.
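The principle can be illustrated with a minimal one-dimensional sketch. It assumes integer-quantized inputs, a generalization size `g` of assigned weights per input, and a learning rate `lr`; the class name `Cmac1D` and these parameter names are illustrative and are not taken from the cited papers.

```python
import numpy as np

class Cmac1D:
    """Minimal 1-D CMAC with a constant basis function (illustrative sketch)."""

    def __init__(self, n_inputs, g):
        self.g = g
        # Input i is assigned weights i .. i+g-1, so neighbours share g-1 weights.
        self.w = np.zeros(n_inputs + g - 1)

    def predict(self, i):
        # The output is the sum of the g assigned weights.
        return float(self.w[i:i + self.g].sum())

    def update(self, i, target, lr=1.0):
        # The prediction error is spread evenly over the assigned weights.
        err = target - self.predict(i)
        self.w[i:i + self.g] += lr * err / self.g
        return err


def rmse(model, targets):
    preds = np.array([model.predict(i) for i in range(len(targets))])
    return float(np.sqrt(np.mean((preds - targets) ** 2)))


targets = np.sin(np.linspace(0.0, np.pi, 20))
cmac = Cmac1D(n_inputs=20, g=4)
rmse_before = rmse(cmac, targets)
for _ in range(200):
    for i, t in enumerate(targets):
        cmac.update(i, t)
rmse_after = rmse(cmac, targets)
```

Because neighboring inputs share `g - 1` weights, a single update at one input also moves the predictions of its neighbors, which is the local generalization described above.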

Developing a modular CMAC model is another strategy for improving CMAC prediction for smooth functional mapping problems. For example, Lin and Li [9] proposed a two-level tree structure network composed of small CMACs, and Hung and Jan [4] developed a tree structure network composed of one-dimensional CMAC nodes. Although the two modular CMAC models use similar network structures, their advantages are entirely different. Lin and Li's CMAC model emphasizes the use of a small computer memory to solve high-dimensional problems, because other CMAC models generally require an extremely large computer memory for such problems. Hung and Jan's CMAC model, MS_CMAC, focuses on reducing the number of training instances required. The principle of the MS_CMAC is to decompose a multi-variable problem into several simple one-variable sub-problems. A set of constant space grid-distributed data points enables the MS_CMAC to use a quadratic weight transforming scheme [10] to generate smoothed weight functions in a single learning cycle.

The main advantage of the MS_CMAC is that it needs only a few training instances to predict with high accuracy, owing to the dimensional reduction technique and the grid-distributed training data points. However, grid-distributed training data points are not always available in practical applications. To overcome this limitation, Jan et al. [11] developed an inverse training scheme that supplements the MS_CMAC by constructing virtual constant space grid-distributed data points from random-distributed training instances. The essence of the inverse training scheme is an optimization approach that uses a first-order heuristic method to update the virtual grid-distributed data points so as to minimize the prediction errors of the random-distributed training instances. Jan et al. further indicated that a 3:1 ratio of random-distributed training instances to virtual grid-distributed data points is the minimum requirement for the inverse training scheme.

The training algorithm of MS_CMAC has greatly advanced recently [10,11]. However, MS_CMAC applications suffer from several shortcomings. First, the prediction errors near the boundary area are sometimes unexpectedly large. Second, generating the virtual grid-distributed data points generally takes a long computational time. Therefore, this study develops an improved MS_CMAC neural network by integrating a simplified UFN model (unsupervised fuzzy neural network) into the MS_CMAC. Additionally, a new error feedback ratio function is adopted to speed up MS_CMAC training. In the new model, the simplified UFN model produces the initial values of the virtual grid-distributed data points, and the new error feedback ratio function replaces the linear error feedback ratio function of the original MS_CMAC. In the following sections, Sect. 2 gives a brief introduction to the MS_CMAC, Sect. 3 explains the improved MS_CMAC, Sect. 4 adopts several numerical cases to test the learning performance of the improved MS_CMAC, and Sect. 5 draws conclusions.

2 Review of MS_CMAC

2.1 Tree Structure Network

This section uses an example to explain the computation of the MS_CMAC. Figure 1 shows a grid system with 3 × 4 grid-distributed data points. In Fig. 1, the circles denote virtual data points, the rectangles indicate temporary data points, and the triangle represents a target data point. First, three one-dimensional CMACs are utilized to model three x-direction line domains (i.e., $F(x, y_1)$, $F(x, y_2)$ and $F(x, y_3)$) by the virtual data points. Second, the corresponding outputs of the three temporary data points are calculated by the three one-dimensional CMACs. Third, another one-dimensional CMAC is utilized to model a y-direction line domain ($F(x_0, y)$) by the temporary data points. Fourth, the corresponding output of the target data point is calculated by the last one-dimensional CMAC. The aforementioned computation can be accomplished using a two-level tree structure network composed of four one-dimensional CMAC nodes (a root and three leaves), as shown in Fig. 2. The root CMAC is adopted to model $F(x_0, y)$, and the leaf CMACs are adopted to model $F(x, y_1)$, $F(x, y_2)$ and $F(x, y_3)$. Therefore, the y-variable is called the active parameter of the root CMAC, and the x-variable is called the active parameter of the leaf CMACs. The grid-distributed data points are used as training instances for the leaf CMACs; the temporary data points are used as testing instances for the leaf CMACs and as training instances for the root CMAC; and the target data point is used as the testing instance for the root CMAC. The computation of the modular CMAC begins at the leaf CMACs and proceeds to the root CMAC. Restated, the MS_CMAC is based on a grid data system. If the numbers of gridlines in the n directions are given by $p_1, p_2, \ldots, p_n$, respectively, then the MS_CMAC is an n-level tree structure network that is composed of

$$\sum_{j=1}^{n-1} \prod_{i=1}^{j} p_i + 1$$

one-dimensional CMAC nodes, and each CMAC node in the ith (1 ≤ i ≤ n − 1) level has $p_i$ child nodes.

2.2 Training

The MS_CMAC has two levels of weights. The first level weights are a set of virtual grid-distributed data points, and the second level weights are spline weight functions in CMAC nodes. An inverse training scheme is employed to generate the first level weights, and a quadratic weight transforming scheme is adopted to calculate the second level weights. Notably, if the MS_CMAC is based on a constant space grid system, then the MS_CMAC predictions are smooth everywhere.
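How a prediction is assembled from the first level weights by one-dimensional models alone can be sketched as follows. To keep the example short, `np.interp` (piecewise-linear interpolation) stands in for a one-dimensional CMAC node with spline weight functions, so the sketch illustrates only the dimensional reduction, not the CMAC nodes themselves; the function name `tree_predict` is hypothetical.

```python
import numpy as np

def tree_predict(gridlines, grid_values, point):
    """Predict at `point` by recursive one-dimensional modelling.

    gridlines   : list of 1-D arrays, one per input direction
    grid_values : n-D array of first level weights (values at the grid points)
    point       : sequence of n coordinates
    """
    values = np.asarray(grid_values, dtype=float)
    # Collapse the last direction first (leaf nodes) and the first one last (root).
    for axis in range(len(gridlines) - 1, -1, -1):
        flat = values.reshape(-1, values.shape[-1])
        # One 1-D model per line domain; np.interp is a stand-in for a CMAC node.
        temp = np.array([np.interp(point[axis], gridlines[axis], row) for row in flat])
        values = temp.reshape(values.shape[:-1])
    return float(values)


# 3 x 4 grid as in the example of Sect. 2.1: root direction y, leaf direction x.
ys = np.array([0.0, 1.0, 2.0])
xs = np.array([0.0, 1.0, 2.0, 3.0])
grid = 3.0 * ys[:, None] + 2.0 * xs[None, :]   # F(x, y) = 2x + 3y on the grid
pred = tree_predict([ys, xs], grid, (0.5, 1.5))
```

The intermediate array `temp` plays the role of the temporary data points: it holds the outputs of the leaf models, which then serve as training data for the next level up.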

2.2.1 Inverse Training Scheme

Fig. 1 Illustration of the dimensional reduction technique of MS_CMAC (legend: grid-distributed data points, temporary data points, target data point)

The inverse training scheme is employed to generate the constant space virtual grid-distributed data points (first level weights) from the random-distributed training data points. Basically, the grid system can be arbitrarily determined by users. However, the ratio of random-distributed training data points to virtual data points should be at least three. The computation of the inverse training is similar to an optimization procedure, in which a first-order heuristic method is adopted to update the virtual data points so as to minimize the prediction errors of the random-distributed training data points. The heuristic method assumes that a positive/negative error in a random-distributed training instance indicates a positive/negative update for nearby virtual data points, and that the error feedback ratio decreases linearly with the distance between the training instance and the virtual data point. Therefore, the virtual grid-distributed data points are updated by the following equations

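$$GY_i^{(n+1)} = GY_i^{(n)} + \alpha(GX_i, TX_j) \times (TY_j - Y_c) \tag{1}$$

$$\alpha(GX_i, TX_j) = \begin{cases} 1 - \dfrac{Dis(GX_i, TX_j)}{r_{\max}} & \text{if } Dis(GX_i, TX_j) < r_{\max} \\ 0 & \text{else} \end{cases} \tag{2}$$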

Fig. 2 Two-level tree structure of MS_CMAC for 3 × 4 grid system

$$Dis(GX_i, TX_j) = \sqrt{(gx_1 - tx_1)^2 + (gx_2 - tx_2)^2 + \cdots} \tag{3}$$

where $GY_i^{(n)}$ is the value of the ith virtual grid data point at the nth training iteration, $GX_i$ is the position vector of the ith virtual data point, $TX_j$ is the position vector of the jth training instance, $TY_j$ is the corresponding value of $TX_j$, $Y_c$ is the corresponding output of the MS_CMAC, $\alpha(GX_i, TX_j)$ is a first-order error feedback ratio function, $gx_1, gx_2, \ldots$ are the elements of $GX_i$, $tx_1, tx_2, \ldots$ are the elements of $TX_j$, and $r_{\max}$ is a threshold value defined as $r_{\max} = \sqrt{n \left(\dfrac{g}{2}\right)^2}$.
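The update of Eqs. (1)–(3) can be sketched as a single sweep over the training instances. The sketch below is one-dimensional and makes two simplifying assumptions that are not part of the original scheme: linear interpolation (`np.interp`) replaces the MS_CMAC as the predictor that supplies $Y_c$, and `r_max` is set to half the grid spacing in input units. The function names are illustrative.

```python
import numpy as np

def linear_feedback(dist, r_max):
    """Eq. (2): the ratio falls linearly from 1 at dist = 0 to 0 at dist = r_max."""
    return np.where(dist < r_max, 1.0 - dist / r_max, 0.0)

def inverse_training_pass(grid_x, grid_y, train_x, train_y, r_max, predict):
    """One sweep of Eq. (1) over all training instances."""
    gy = grid_y.copy()
    for tx, ty in zip(train_x, train_y):
        error = ty - predict(grid_x, gy, tx)           # TY_j - Y_c
        dist = np.linalg.norm(grid_x - tx, axis=1)     # Eq. (3)
        gy += linear_feedback(dist, r_max) * error     # Eq. (1)
    return gy

def interp_predict(grid_x, grid_y, tx):
    # Stand-in predictor (assumption): linear interpolation instead of MS_CMAC.
    return float(np.interp(tx[0], grid_x[:, 0], grid_y))

def rmse(grid_x, grid_y, train_x, train_y):
    preds = np.array([interp_predict(grid_x, grid_y, tx) for tx in train_x])
    return float(np.sqrt(np.mean((preds - train_y) ** 2)))

grid_x = np.linspace(0.0, 1.0, 6).reshape(-1, 1)       # 6 virtual grid points
train_x = np.linspace(0.03, 0.97, 30).reshape(-1, 1)   # 30 training instances (5:1 ratio)
train_y = np.sin(np.pi * train_x[:, 0])
grid_y = np.zeros(len(grid_x))                         # crude initial values
r_max = 0.1                                            # half the grid spacing

rmse_before = rmse(grid_x, grid_y, train_x, train_y)
for _ in range(50):
    grid_y = inverse_training_pass(grid_x, grid_y, train_x, train_y, r_max, interp_predict)
rmse_after = rmse(grid_x, grid_y, train_x, train_y)
```

Only virtual data points closer than `r_max` to a training instance receive a share of its error, so each instance adjusts the grid locally.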

2.2.2 Quadratic Weight Transforming Scheme

The quadratic weight transforming scheme [10] is a specific learning algorithm for the one-dimensional CMAC nodes of the MS_CMAC. The output of an original one-dimensional CMAC is the summation of g consecutive weights, and is defined as follows

$$y(x_i) = \sum_{j=0}^{g-1} w_{i+j} \tag{4}$$

where y is the output function, $x_i$ is the input variable, $w_{i+j}$ is the addressing weight, and g is an integer called the generalization size. In each one-dimensional CMAC node, the Hamming distance between two consecutive virtual training data points is set to g, which also represents the data point density. If a training data point $(x_k, y_k)$ is given for a one-dimensional CMAC node, then the addressing weights can be calculated as follows

$$w_{(k-1)\times g + j} = \frac{y_k}{g} \tag{5}$$
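A small numerical sketch makes the behavior of this weight assignment concrete. It assumes that each of the g addressing weights of a block carries an equal share $y_k / g$, so that the sum in Eq. (4) reproduces $y_k$ at the location of the kth virtual training data point; the function names are illustrative.

```python
import numpy as np

def step_weights(y, g):
    """Step-like weight function: each block of g weights is constant (assumed y_k / g)."""
    return np.repeat(np.asarray(y, dtype=float) / g, g)

def cmac_output(w, i, g):
    """Eq. (4): sum of g consecutive weights starting at address i."""
    return float(w[i:i + g].sum())

g = 4
y = [1.0, 3.0, 2.0]                 # values of three virtual training data points
w = step_weights(y, g)              # 12 weights in three constant blocks
outputs = [cmac_output(w, i, g) for i in range(len(w) - g + 1)]
# outputs: 1.0, 1.5, 2.0, 2.5, 3.0, 2.75, 2.5, 2.25, 2.0
```

The outputs hit 1.0, 3.0 and 2.0 at addresses 0, 4 and 8 and change linearly in between, because the summation window slides from one constant block into the next.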

Equations (4) and (5) produce a step-like weight function and a piecewise-linear approximation for the one-dimensional CMAC node. To smooth the CMAC outputs, a set of quadratic splines ($f_k(i) = a_k i^2 + b_k i + c_k$) is proposed to replace the step-like weight function by satisfying the following equations

$$\int_{(k-1)\times g}^{k\times g} f_k(i)\,di = y_k \tag{6}$$

$$f_{k-1}(x_k) = f_k(x_k) \tag{7}$$

$$f'_{k-1}(x_k) = f'_k(x_k) \tag{8}$$

where $x_k$ is the location of a virtual training data point, $f'_k(i)$ is the first-order derivative of $f_k(i)$, Eq. (6) ensures that the CMAC prediction fits the virtual training data points, and Eqs. (7) and (8) ensure that the CMAC prediction is smooth. As a result, the coefficients of the spline functions can be determined using linear algebra.
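The linear system behind Eqs. (6)–(8) can be sketched as follows. For K splines there are 3K unknown coefficients but only 3K − 2 equations, and the text above does not state which two extra conditions close the system; this sketch therefore takes the minimum-norm solution returned by `np.linalg.lstsq`, which is an assumption of the example rather than the published procedure. The function name is illustrative.

```python
import numpy as np

def quadratic_spline_coeffs(y, g):
    """Solve Eqs. (6)-(8) for (a_k, b_k, c_k); k is 0-based inside this function.

    Spline k covers [k*g, (k+1)*g]. The underdetermined system is closed by
    taking the minimum-norm solution (assumption of this sketch).
    """
    K = len(y)
    rows, rhs = [], []
    for k in range(K):
        lo, hi = k * g, (k + 1) * g
        row = np.zeros(3 * K)
        # Eq. (6): integral of a*i^2 + b*i + c over [lo, hi] equals y_k.
        row[3 * k:3 * k + 3] = [(hi**3 - lo**3) / 3.0, (hi**2 - lo**2) / 2.0, hi - lo]
        rows.append(row)
        rhs.append(y[k])
    for k in range(1, K):
        x = k * g  # joint between spline k-1 and spline k
        cont = np.zeros(3 * K)
        cont[3 * (k - 1):3 * (k - 1) + 3] = [x**2, x, 1.0]
        cont[3 * k:3 * k + 3] = [-x**2, -x, -1.0]
        rows.append(cont)
        rhs.append(0.0)                      # Eq. (7): equal values at the joint
        slope = np.zeros(3 * K)
        slope[3 * (k - 1):3 * (k - 1) + 3] = [2.0 * x, 1.0, 0.0]
        slope[3 * k:3 * k + 3] = [-2.0 * x, -1.0, 0.0]
        rows.append(slope)
        rhs.append(0.0)                      # Eq. (8): equal slopes at the joint
    A, b = np.array(rows), np.array(rhs)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs.reshape(K, 3), A, b


coeffs, A, b = quadratic_spline_coeffs([1.0, 3.0, 2.0], g=4)
```

Each row of `coeffs` holds `(a_k, b_k, c_k)` for one spline; multiplying `A` by the flattened coefficients reproduces `b`, which confirms that the three conditions hold for the returned solution.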

After the weight transformation is finished, a more elaborate output formula that resembles the numerical integration method of Simpson's rule is proposed to replace the original output formula. The new formula is defined as follows

$$y(x_i) = \frac{1}{3} \sum_{j=0,2,4,\ldots}^{g-2} \left[ f(x_{i+j}) + 4 f(x_{i+1+j}) + f(x_{i+2+j}) \right] \tag{9}$$

3 Improved MS_CMAC

3.1 Influence of the Initial First Level Weights

In most neural networks, good initial network weights are generally difficult to define. However, the MS_CMAC does not have this problem. The first level weights of the MS_CMAC are the desired values that correspond to the locations of the grid-distributed data points. The second level weights of the MS_CMAC are calculated from the first level weights. Restated, the MS_CMAC has only one set of optimal weights for a given problem. A simple test is performed to illustrate this point. Table 1 shows the learning results of an MS_CMAC that models a three-variable function (sin(x) sin(2y) sin(3z)) using three sets (A, B and C) of initial grid-distributed data points, where 648 random-distributed training data points and a 6-6-6 grid system (216 virtual grid-distributed data points) with constant spaces are used. The A set of initial grid-distributed data points is calculated by adding −25% to 25% variations to the desired values, the B set by adding −50% to 50% variations, and the C set by adding −100% to 100% variations. As indicated in Table 1, the correlation coefficients that link the initial first level weights to the desired values are 0.8821, 0.7181 and 0.4040 for the A, B and C sets of data, respectively. After the inverse training, the three correlation coefficients rise to 0.9869, 0.9785, and 0.9207. The results reveal that the MS_CMAC training result is better when the initial values of the virtual grid-distributed data points are closer to the desired values. Moreover, the numbers of learning cycles are 92, 148, and 248 for the A, B and C sets of data, respectively. Good initial virtual grid-distributed data points therefore also reduce the computational time of MS_CMAC training.

Table 1 Learning results of an MS_CMAC modeling a function by using three sets of initial grid-distributed data points

| Initial values | A set | B set | C set |
| --- | --- | --- | --- |
| Correlation coefficient linking the first level weights to the desired values, before inverse training | 0.8821 | 0.7181 | 0.4040 |
| Correlation coefficient linking the first level weights to the desired values, after inverse training | 0.9869 | 0.9785 | 0.9207 |
| Number of learning cycles | 92 | 148 | 248 |
| RMSE in learning phase, before inverse training | 0.1127 | 0.2134 | 0.4421 |
| RMSE in learning phase, after inverse training | 0.0121 | 0.0125 | 0.0129 |

Note: RMSE = Root Mean Square Error

3.2 Improved Model

3.2.1 UFN Model

The original MS_CMAC generally initializes its first level weights randomly. The randomly generated values, however, are very unlikely to be close to the desired values. Therefore, a method for systematically initializing the first level weights of the MS_CMAC is worth developing. Herein, a simplified UFN model (unsupervised fuzzy neural network) is proposed for the MS_CMAC to initialize the virtual grid-distributed data points systematically. The UFN can be considered a nonlinear interpolation approach that uses local information near the target point to produce a prediction [12]. The prediction phase of the UFN is very simple. Basically, the UFN uses the following three steps to generate predictions: (1) find past cases that resemble the new problem by applying a competing algorithm; (2) establish a fuzzy set to represent the relationship between the similar cases and the new problem by applying a fuzzy membership function; and (3) generate a prediction based on the fuzzy set by applying a defuzzification formula. Additionally, the UFN model has a self-organized learning algorithm that systematically selects its working parameters. The self-organized learning algorithm has two parts. One part is the weight adjusting process, which uses a mathematical optimization algorithm to refine the weights in the competing algorithm. The other part is the correlation analysis, which uses a statistical method to calculate the threshold values of the fuzzy membership function. In this study, the weight adjusting process is not considered because its performance depends on a high density of training data.

3.2.2 Flow Chart of the Improved MS_CMAC

Figure 3 shows the flow chart of the improved MS_CMAC, which is established by integrating a simplified UFN with the original MS_CMAC. The gray boxes are the new processes, which are described as follows.

Competing process identifies the degree of difference between the virtual grid-distributed data points and the training instances. The UFN uses a weighted Euclidean distance as the index of the degree of difference between two data points. However, the improved MS_CMAC simply uses the unweighted Euclidean distance. The reason is that the weights of the weighted Euclidean distance are difficult to define when an MS_CMAC handles training data of low density. The Euclidean distance between a virtual data point and a training data point is defined as follows

$$d_{ij} = \lVert GX_i - TX_j \rVert = \sqrt{\sum_k (gx_k - tx_k)^2} \tag{10}$$

where $d_{ij}$ denotes the degree of difference between a grid data point and a training instance, and the other symbols have been defined in the previous section. In principle, a small $d_{ij}$ indicates high similarity.

Fig. 3 Flow chart of the improved MS_CMAC (legend: new process, original process, control flow, data flow)


Fuzzy process generates fuzzy sets to represent the relationships between grid-distributed data points and training instances. The UFN uses a quasi-Z type membership function in its fuzzy process, and the improved MS_CMAC retains this function. The quasi-Z membership function is defined as follows

$$\mu(d_{ij}, R_{\max}, R_{\min}) = \begin{cases} 0 & \text{if } d_{ij} > R_{\max} \\ \dfrac{R_{\max} R_{\min} - R_{\min} d_{ij}}{(R_{\max} - R_{\min})\, d_{ij}} & \text{if } R_{\max} \ge d_{ij} \ge R_{\min} \\ 1 & \text{if } R_{\min} > d_{ij} \end{cases} \tag{11}$$

where $R_{\min}$ and $R_{\max}$ are the bounds of the fuzzy membership function. $R_{\min}$ is generally a small value that avoids computer round-off error, and $R_{\max}$ can be calculated by the self-organized learning.
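As a quick check of Eq. (11), the middle branch can be factored and evaluated at both bounds, which confirms that the membership function joins the two constant branches continuously:

```latex
\mu = \frac{R_{\max}R_{\min} - R_{\min}\,d_{ij}}{(R_{\max}-R_{\min})\,d_{ij}}
    = \frac{R_{\min}\,(R_{\max} - d_{ij})}{(R_{\max}-R_{\min})\,d_{ij}},
\qquad
\mu\big|_{d_{ij}=R_{\min}} = \frac{R_{\min}\,(R_{\max}-R_{\min})}{(R_{\max}-R_{\min})\,R_{\min}} = 1,
\qquad
\mu\big|_{d_{ij}=R_{\max}} = 0 .
```

Between the bounds the membership decreases monotonically with $d_{ij}$, roughly like $1/d_{ij}$, so the training instances nearest to a grid point dominate its fuzzy set.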

Defuzzification process initializes the grid-distributed data points ($GY_i^{(0)}$) by synthesizing the outputs of the training instances ($TY_j$) according to the fuzzy sets that are calculated in the fuzzy process. The UFN offers the center of gravity method, COG, and the mean of maxima method, MOM, for different situations. The improved MS_CMAC uses only the COG method because of its simplicity. The COG method is defined as follows

$$GY_i^{(0)} = \frac{\sum_j \mu(d_{ij})\, TY_j}{\sum_j \mu(d_{ij})} \tag{12}$$
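The competing, fuzzy and defuzzification processes of Eqs. (10)–(12) can be combined into a short initialization routine. The sketch below is illustrative: the function names and the default `r_min` are assumptions, and so is the fallback for a grid point that has no training instance within `r_max` (it copies the nearest instance), a case that Eq. (12) leaves undefined.

```python
import numpy as np

def quasi_z(d, r_max, r_min):
    """Quasi-Z membership function of Eq. (11)."""
    d = np.asarray(d, dtype=float)
    safe_d = np.maximum(d, r_min)   # avoids division by zero; those entries become 1
    mid = (r_max * r_min - r_min * safe_d) / ((r_max - r_min) * safe_d)
    return np.where(d > r_max, 0.0, np.where(d < r_min, 1.0, mid))

def ufn_initialize(grid_x, train_x, train_y, r_max, r_min=1e-6):
    """Initial grid values GY_i(0) by the COG formula of Eq. (12)."""
    # Competing process, Eq. (10): plain Euclidean distances d_ij.
    d = np.linalg.norm(grid_x[:, None, :] - train_x[None, :, :], axis=2)
    mu = quasi_z(d, r_max, r_min)                  # fuzzy process
    total = mu.sum(axis=1)
    # Assumption: a grid point with an empty fuzzy set copies its nearest instance.
    nearest = train_y[np.argmin(d, axis=1)]
    cog = (mu @ train_y) / np.where(total > 0.0, total, 1.0)
    return np.where(total > 0.0, cog, nearest)


grid_x = np.array([[0.0], [1.0], [5.0]])
train_x = np.array([[0.0], [0.9]])
train_y = np.array([2.0, 4.0])
initial = ufn_initialize(grid_x, train_x, train_y, r_max=0.5)
```

In this toy example the first two grid points each see exactly one training instance inside `r_max`, and the third sees none and falls back to its nearest instance.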

Correlation analysis is an auxiliary process that is used to determine systematically the value of $R_{\max}$ in Eq. (11). First, a two-column matrix A that is composed of all possible combinations of output pairs from the training instances is constructed. Each row vector of the matrix A should satisfy the following constraint

$$(A[k, 1], A[k, 2]) = (TY_i, TY_j) \quad \text{for } d_{ij} < t \text{ and } i < j$$

where $d_{ij}$ is here the distance between the ith and jth training instances, and t is an arbitrary real number. Then, the correlation coefficient between column 1 and column 2 of the matrix A is calculated. The computational formula is defined as follows

$$Cor(A(t)) = \frac{\sum_k (A[k, 1] - \bar{A}_1)(A[k, 2] - \bar{A}_2)}{n\, S_{A_1} S_{A_2}} \tag{13}$$

where n is the number of rows of matrix A, $S_{A_1}$ and $S_{A_2}$ are the standard deviations of column 1 and column 2 of matrix A, respectively, and $\bar{A}_1$ and $\bar{A}_2$ are the average values of column 1 and column 2 of matrix A, respectively. Finally, a line search algorithm is adopted to find the value of $R_{\max}$ that satisfies the following constraint equation

$$Cor(A(R_{\max})) = 0.8 \tag{14}$$
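The correlation analysis can be sketched with a bisection line search. The text does not specify which line search is used, so bisection is an assumption here, as are the function names and the convention of returning a correlation of 1.0 when fewer than three pairs fall below the threshold.

```python
import numpy as np

def pair_correlation(dist, ty_i, ty_j, t):
    """Eq. (13) over the pairs with d_ij < t (returns 1.0 for fewer than 3 pairs)."""
    mask = dist < t
    if mask.sum() < 3:
        return 1.0
    a1, a2 = ty_i[mask], ty_j[mask]
    if a1.std() == 0.0 or a2.std() == 0.0:
        return 1.0
    return float(np.corrcoef(a1, a2)[0, 1])

def find_r_max(train_x, train_y, target=0.8, iters=40):
    """Bisection search for Cor(A(R_max)) = target, Eq. (14)."""
    i, j = np.triu_indices(len(train_x), k=1)          # all pairs with i < j
    dist = np.linalg.norm(train_x[i] - train_x[j], axis=1)
    ty_i, ty_j = train_y[i], train_y[j]
    lo, hi = 0.0, float(dist.max()) * 1.001
    if pair_correlation(dist, ty_i, ty_j, hi) >= target:
        return hi          # even the most distant outputs remain correlated
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pair_correlation(dist, ty_i, ty_j, mid) >= target:
            lo = mid       # still similar enough: widen the neighbourhood
        else:
            hi = mid
    return lo


rng = np.random.default_rng(0)
train_x = rng.random((200, 2))
train_y = np.sin(np.pi * train_x[:, 0]) * np.sin(2.0 * np.pi * train_x[:, 1])
r_max = find_r_max(train_x, train_y)
```

Because the correlation changes in small jumps as pairs enter the matrix A, the search returns the largest tested threshold whose correlation is still at least the target, rather than an exact root.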

3.2.3 Second-order Error Feedback Ratio Function

The original MS_CMAC uses a heuristic method to construct the virtual grid-distributed data points, and the heuristic method is based on a linear error feedback ratio function. To further speed up the inverse training of the MS_CMAC, a second-order error feedback ratio function is proposed to replace Eq. (2), and is defined as follows

$$\alpha(GX_i, TX_j) = \begin{cases} 1 - \left( \dfrac{Dis(GX_i, TX_j)}{r_{\max}} \right)^2 & \text{if } Dis(GX_i, TX_j) < r_{\max} \\ 0 & \text{else} \end{cases} \tag{15}$$


It is difficult to demonstrate mathematically that the second-order error feedback ratio function has better convergence capability than the linear error feedback ratio function. In the following section, test results are therefore used to examine the convergence capability of the second-order error feedback ratio function.
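The two feedback ratio functions can be compared numerically. The sketch assumes that the second-order function takes the form $1 - (Dis / r_{\max})^2$, that is, that the square applies to the distance ratio; the function names are illustrative.

```python
import numpy as np

def feedback_linear(dist, r_max):
    """Eq. (2): linear error feedback ratio."""
    return np.where(dist < r_max, 1.0 - dist / r_max, 0.0)

def feedback_second_order(dist, r_max):
    """Eq. (15), read here as 1 - (Dis / r_max)^2 (assumed placement of the square)."""
    return np.where(dist < r_max, 1.0 - (dist / r_max) ** 2, 0.0)

r_max = 1.0
dist = np.linspace(0.0, 1.2, 7)          # 0.0, 0.2, ..., 1.2
linear = feedback_linear(dist, r_max)
second = feedback_second_order(dist, r_max)
```

Under this reading the second-order ratio is never smaller than the linear one inside `r_max` (for example 0.84 against 0.6 at a distance of 0.4), so each training instance feeds back a larger share of its error. This is one plausible reason for faster convergence, although the paper relies on the numerical cases rather than on a proof.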

4 Numerical Case Studies

This section discusses three new types (I, II and III) of MS_CMAC for modeling smooth functional mapping problems, and uses the original MS_CMAC as a reference model. The type I model uses randomly initialized first-level weights and the second-order error feedback ratio function in its training phase, the type II model uses first-level weights systematically initialized by the UFN and the original error feedback ratio function, and the type III model uses the systematically initialized first-level weights together with the second-order error feedback ratio function.

4.1 Case I

This case is a three-variable smooth function, which has been discussed in the authors' previous research [10,11]. The three-variable function is defined as follows

$$F_1(x_1, x_2, x_3) = \sin(x_1\pi)\sin(2x_2\pi)\sin(3x_3\pi) \quad \text{for } 0 \le x_1, x_2, x_3 \le 1$$

A grid system with five constant grid spaces in each direction was selected for an MS_CMAC to model the function, and the gridline locations in each direction were defined as follows

x1, x2, x3: {0.0, 0.2, 0.4, 0.6, 0.8, 1.0}

Therefore, 216 (6 × 6 × 6) virtual grid-distributed data points were necessary for the MS_CMAC prediction. According to the grid system, a three-level tree structure network with six branches in each level was adopted for the MS_CMAC to model the function. The active parameters for levels 1–3 were set as x1, x2, and x3, respectively, and the generalization sizes, g, were set to 40 for all one-dimensional CMAC nodes. As a result, the learning domain was decomposed into 8 × 10^6 (40^3 × 5 × 5 × 5) data points by the MS_CMAC.
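The sizes quoted for a grid system follow from the number of gridlines in each direction and the generalization size. The helper below is an illustrative sketch (its name is hypothetical); the node count uses the formula of Sect. 2.1 and is not a figure reported in the text.

```python
from math import prod

def ms_cmac_sizes(gridlines_per_direction, g):
    """Return (virtual grid points, CMAC nodes, decomposed data points)."""
    p = list(gridlines_per_direction)
    grid_points = prod(p)
    # Node count from Sect. 2.1: 1 + sum_{j=1}^{n-1} prod_{i=1}^{j} p_i
    nodes = 1 + sum(prod(p[:j]) for j in range(1, len(p)))
    # Each direction has p_i - 1 grid spaces, each split into g addresses.
    cells = prod((pi - 1) * g for pi in p)
    return grid_points, nodes, cells


case_1 = ms_cmac_sizes([6, 6, 6], g=40)   # (216, 43, 8000000)
```

For the 6-6-6 grid this reproduces the 216 virtual data points and the 8 × 10^6 decomposed data points, and gives 43 one-dimensional CMAC nodes (1 + 6 + 36).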

A total of 648 random-distributed data points were selected as training instances for the inverse training of the MS_CMAC to construct 216 virtual grid-distributed data points. Another 1,000 random-distributed data points were selected as testing instances for verifying the improved MS_CMACs. Table 2 shows the computed results in case I. As indicated in Table 2, the correlation coefficient that links the initial first level weights (virtual grid-distributed data points) to their desired values is −0.0063 in the original model and the type I MS_CMAC. However, this correlation coefficient is increased to 0.8467 in the type II and III MS_CMACs. Comparing the learning speeds of the four types of MS_CMAC, the learning converges after 210, 161, 105 and 74 learning cycles in the original model and types I, II, and III, respectively. The results indicate that the second-order error feedback ratio function substantially reduces the learning time of the inverse training of the MS_CMAC, independently of how the first level weights are initialized. Moreover, the initial first level weights that are systematically generated by the UFN also accelerate the learning convergence of the MS_CMAC. Comparing the prediction accuracies of the four types of MS_CMAC, the prediction errors are 0.0581, 0.0587, 0.0190 and 0.0178 in the original model and types I, II, and III, respectively. The results reveal that the improvement in MS_CMAC prediction depends mainly on the systematically initialized first level weights. Furthermore, Fig. 4 shows the prediction errors of the 1,000 testing instances, where the vertical axis denotes the prediction error and the horizontal axis represents the distance from a testing instance to its nearest boundary of the learning domain. Clearly, the predictions of the testing instances near the boundary area are greatly improved.

Table 2 Computing results of four types of MS_CMAC in case I

| Types | Original model | I | II | III |
| --- | --- | --- | --- | --- |
| Correlation coefficient linking the first level weights to the desired values, before inverse training | −0.0063 | −0.0063 | 0.8467 | 0.8467 |
| Correlation coefficient linking the first level weights to the desired values, after inverse training | 0.9053 | 0.9138 | 0.9921 | 0.9932 |
| Number of learning cycles / learning time (time reduction) | 210 / 29 s (–) | 161 / 23 s (23.3%) | 105 / 15 s (50%) | 74 / 10 s (64.8%) |
| RMSE in learning (improvement) | 0.0167 (–) | 0.0161 (3.6%) | 0.0117 (29.9%) | 0.0114 (31.7%) |
| RMSE in prediction (improvement) | 0.0581 (–) | 0.0587 (−1.0%) | 0.0190 (67.3%) | 0.0178 (69.4%) |

Fig. 4 Comparing the prediction errors of case I predicted by the original model and new type III MS_CMAC (horizontal axis: distance from a data point to the nearest boundary; vertical axis: prediction error)
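The percentages reported for case I can be reproduced from the raw figures with a relative-reduction formula. The helper name below is illustrative; note that the time reductions appear to be computed from the learning cycle counts rather than from the rounded times in seconds, which is an inference from the numbers and not a statement in the text.

```python
def reduction_percent(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


cycles = {"original": 210, "I": 161, "II": 105, "III": 74}
rmse_prediction = {"original": 0.0581, "I": 0.0587, "II": 0.0190, "III": 0.0178}

cycle_reduction = {
    k: round(reduction_percent(cycles["original"], v), 1) for k, v in cycles.items()
}
rmse_improvement = {
    k: round(reduction_percent(rmse_prediction["original"], v), 1)
    for k, v in rmse_prediction.items()
}
```

The negative improvement for type I (−1.0%) reflects that changing only the feedback ratio function leaves the prediction error essentially unchanged, whereas the UFN initialization of types II and III accounts for nearly all of the gain.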

4.2 Case II

This case is a four-variable smooth function, and has been discussed in references [9,11]. The four-variable function is defined as follows

$$F_2(x_1, x_2, x_3, x_4) = x_1 + \sin(x_1\pi)\cos(x_2\pi)\sin(3x_3\pi)\sin^2(x_4\pi) - 1$$

Table 3 Computing results of improved MS_CMAC in case II

| Types | Original model | I | II | III |
| --- | --- | --- | --- | --- |
| Correlation coefficient linking the first level weights to the desired values, before inverse training | 0.0351 | 0.0351 | 0.8334 | 0.8334 |
| Correlation coefficient linking the first level weights to the desired values, after inverse training | 0.9650 | 0.9670 | 0.9887 | 0.9899 |
| Number of learning cycles / learning time (time reduction) | 326 / 202 s (–) | 261 / 423 s (19.9%) | 196 / 318 s (39.9%) | 144 / 233 s (55.8%) |
| RMSE in learning (improvement) | 0.0142 (–) | 0.0135 (4.9%) | 0.0131 (7.7%) | 0.0126 (11.3%) |
| RMSE in prediction (improvement) | 0.0474 (–) | 0.0429 (9.5%) | 0.0309 (34.8%) | 0.0280 (40.9%) |

A grid system with four constant grid spaces in the x1 and x3 directions, five constant grid spaces in the x2 direction, and eight constant grid spaces in the x4 direction was selected for the MS_CMAC to model the function. The gridline locations in each direction were defined as follows

x1, x3: {−1, −0.5, 0, 0.5, 1}

x2: {−1, −0.6, −0.2, 0.2, 0.6, 1}

x4: {−1, −0.75, −0.5, −0.25, 0, 0.25, 0.5, 0.75, 1}

Therefore, 1,350 (5 × 6 × 5 × 9) virtual grid-distributed data points were necessary for the MS_CMAC prediction. According to the grid system, a four-level tree structure network was employed for the MS_CMAC to model the function. The active parameters were set to x1, x2, x3 and x4 for levels 1, 2, 3 and 4, respectively. Thus, the numbers of branches from level 1 to level 3 were five, six, and five, respectively, and the generalization sizes (g) were set to 40 for all one-dimensional CMAC nodes. As a result, the learning domain was decomposed into 1.6384 × 10^9 (40^4 × 4 × 5 × 4 × 8) data points by the MS_CMAC.

A total of 4,050 random-distributed data points were selected as training instances for the inverse training of the MS_CMAC to construct 1,350 virtual grid-distributed data points. Another 5,000 random-distributed data points were selected as testing instances for verifying the improved MS_CMACs. Table 3 shows the computed results in case II. As indicated in Table 3, the correlation coefficient that links the initial first level weights (virtual grid-distributed data points) to their desired values is 0.0351 in the original model and the type I MS_CMAC. However, this correlation coefficient is increased to 0.8334 in the type II and III MS_CMACs. Comparing the learning speeds of the four types of MS_CMAC, the learning converges after 326, 261, 196 and 144 learning cycles in the original model and types I, II, and III, respectively. Comparing the prediction accuracies of the four types of MS_CMAC, the prediction errors are 0.0474, 0.0429, 0.0309 and 0.0280 in the original model and types I, II, and III, respectively. Furthermore, Fig. 5 shows the prediction errors of the 5,000 testing instances, where the vertical axis denotes the prediction error and the horizontal axis represents the distance from a testing instance to its nearest boundary of the learning domain. Clearly, the predictions of the testing instances near the boundary area are greatly improved.

Fig. 5 Comparing the prediction errors of case II predicted by the original model and new type III MS_CMAC (vertical axis: prediction error)

4.3 Case III

This case is a four-variable continuous function, and has also been discussed in references [9,12]. The four-variable function is defined as follows

$$F_3(x_1, x_2, x_3, x_4) = \left( \ln(x_1 x_2 + x_3 x_4) + \ln(x_1 x_3 + x_2 x_4) \right)^2 \quad \text{for } 0.1 \le x_1, x_2, x_3, x_4 \le 3.1$$

A grid system with four constant grid spaces in each direction was selected for the MS_CMAC to model the function, and the locations of gridlines in each direction are defined as follows

x1, x2, x3, x4: {0.1, 0.85, 1.6, 2.35, 3.1}

Therefore, 625 (5 × 5 × 5 × 5) virtual grid-distributed data points were necessary for the MS_CMAC prediction. According to the grid system, a four-level tree structure network with five branches in each level was employed for the MS_CMAC. The active parameters for levels 1–4 were set to x1, x2, x3 and x4, respectively, and the generalization sizes, g, were set to 40 for all one-dimensional CMAC nodes. As a result, the learning domain was decomposed into 6.5536 × 10^8 (40^4 × 4 × 4 × 4 × 4) data points by the MS_CMAC.

A total of 1,875 random-distributed data points were selected as training instances for the inverse training of the MS_CMAC to construct 625 virtual grid-distributed data points. Another 2,000 random-distributed data points were selected as testing instances for verifying the improved MS_CMACs. Table 4 shows the computed results in case III. As indicated in Table 4, the correlation coefficient that links the initial first level weights (virtual grid-distributed data points) to their desired values is 0.0478 in the original model and the type I MS_CMAC. However, this correlation coefficient is increased to 0.9710 in the type II and III MS_CMACs. Comparing the learning speeds of the four types of MS_CMAC, the learning converges after 318, 251, 102 and 77 learning cycles in the original model and types I, II, and III, respectively. Comparing the prediction accuracies of the four types of MS_CMAC, the prediction errors are 0.4461, 0.4111, 0.2849 and 0.2759 in the original model and types I, II, and III, respectively. Furthermore, Fig. 6 shows the prediction errors of the 2,000 testing instances, where the vertical axis denotes the prediction error and the horizontal axis represents the distance from a testing instance to its nearest boundary of the learning domain. Clearly, the predictions of the testing instances near the boundary area are greatly improved.

Table 4 Computing results of improved MS_CMAC in case III

| Types | Original model | I | II | III |
| --- | --- | --- | --- | --- |
| Correlation coefficient of the first level weights to the desired values, before inverse training | 0.0478 | 0.0478 | 0.9710 | 0.9710 |
| Correlation coefficient of the first level weights to the desired values, after inverse training | 0.9490 | 0.9490 | 0.9743 | 0.9742 |
| Number of learning cycles / learning time (time reduction) | 318 / 1822 s (–) | 251 / 1438 s (21.1%) | 102 / 584 s (67.9%) | 77 / 441 s (75.8%) |
| RMSE in learning (improvement) | 0.1393 (–) | 0.1380 (9.3%) | 0.1194 (14.3%) | 0.1169 (16.1%) |
| RMSE in prediction (improvement) | 0.4461 (–) | 0.4111 (7.8%) | 0.2849 (36.1%) | 0.2759 (38.2%) |

Fig. 6 Comparing the prediction errors of case III predicted by the original model and new type III MS_CMAC (vertical axis: prediction error)

5 Conclusion

This study integrates a simplified UFN model into the MS_CMAC to initialize the first level weights of the MS_CMAC systematically, which improves the prediction accuracy and accelerates the learning convergence. A second-order error feedback ratio function is also proposed for the MS_CMAC to speed up the learning convergence. Three numerical cases were considered to test the improved MS_CMAC. The testing results reveal that the improved MS_CMAC is clearly better in the learning phase: it has an 11.3–31.7% lower learning error and a 55.8–75.8% shorter learning time than the original MS_CMAC in the testing cases. The testing results also show that the improved MS_CMAC exhibits much greater predictive accuracy than the original MS_CMAC, with a 38.2–69.4% lower prediction error in the testing cases. Moreover, the improvement in MS_CMAC prediction is especially clear when the testing instances are located near the boundary of the learning domain.

References

1. Albus JS (1975) A new approach to manipulator control: the cerebellar model articulation controller (CMAC). J Dyn Syst Meas Control Trans ASME 97(3):220–227

2. Shelton RO, Peterson JK (1992) Controlling a truck with an adaptive critic CMAC design. Simulation 58(5):319–326

3. Lin CM, Peng YF (2004) Adaptive CMAC-based supervisory control for uncertain nonlinear systems. IEEE Trans Syst Man Cybern B Cybern 34(2):1248–1260

4. Hung SL, Jan JC (1999) MS_CMAC neural network learning model in structural engineering. J Comput Civil Eng ASCE 13(1):1–11

5. Kim DH (2002) Cerebellar model articulation controller (CMAC) for suppression of structural vibration. J Comput Civil Eng ASCE 16(4):291–298

6. Chen CM (2004) Incremental personalized web page mining utilizing self-organizing HCMAC neural network. Web Intell Agent Syst 2(1):21–38

7. Lane SH, Handelman DA, Gelfand JJ (1992) Theory and development of higher-order CMAC neural networks. IEEE Control Syst Mag 12(2):23–30

8. Chiang CT, Lin CS (1996) CMAC with general basis functions. Neural Netw 9(7):1199–1211

9. Lin CS, Li CK (1999) A memory-based self-generated basis function neural network. Int J Neural Syst 9(1):41–59

10. Jan JC, Hung SL (2001) High-order MS_CMAC neural network. IEEE Trans Neural Netw 12(3):598–603

11. Jan JC, Chen CM, Shiao LH (2006) Inverse training scheme for MS_CMAC to handle random data. Neurocomputing 70(1–3):502–512