
Proposed Weight Updating Rule

Figure 3.3 shows the structure and operation of the SOM in the SOMS. The SOM performs two operations: evaluation and search. In Figure 3.3, each neuron j in the SOM contains a vector wj (the weight vector) representing a candidate solution set. Each time new measured data v are sent into the scheme, the SOM is triggered to operate. All of the candidate solution sets in the neurons are then sent to the dynamic model to derive their corresponding data pj. The SOM evaluates the difference between v and each pj and chooses the neuron j* with the smallest difference as the winner. The learning process then continues, and the network eventually converges to the optimal solution. Even when the optimal solution does not lie within the estimated range, the search mechanism is still expected to move the candidates out of their initial locations and guide them to converge to the optimal solution.
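The evaluation step above can be sketched as follows. The dynamic model and the data are stand-ins, since the text does not fix a particular system; `select_winner`, `model`, and the toy numbers are hypothetical names for illustration only:

```python
import numpy as np

def select_winner(v, weights, model):
    """Evaluate every neuron's candidate solution and pick the winner.

    v       : measured data vector.
    weights : (n_neurons, q) array; row j is the candidate solution w_j.
    model   : callable mapping a candidate w_j to its predicted data p_j
              (a stand-in for the dynamic model in the text).
    Returns the index j* of the neuron whose p_j is closest to v.
    """
    errors = [np.linalg.norm(v - model(w)) for w in weights]
    return int(np.argmin(errors))

# Toy example: the "dynamic model" squares the parameters element-wise.
model = lambda w: w ** 2
weights = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]])
v = np.array([4.1, 8.9])          # measured data
j_star = select_winner(v, weights, model)   # neuron 1 predicts [4, 9]
```

The winner's weight vector then seeds the neighborhood-based update described next.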

The main purpose of the proposed SOMS is to explore and exploit the search space so as to obtain an optimal solution for the optimization problem and, furthermore, to make the variations of the weights organized movements. To this end, the SOMS learns to search in an organized and efficient manner, rather than randomly. For effective weight updating in the SOM, the topological neighborhood function and the learning rate need to be properly determined, and their determination may depend on the properties of the system parameters to be learned. As mentioned above, system parameters may operate in quite different working ranges. To achieve high learning efficiency, the weight updating should therefore be executed on an individual basis, instead of using the same neighborhood function for all the parameters.

We thus propose a new SOM weight updating rule that dynamically adjusts the center and width of the respective neighborhood function for each of the system parameters the SOM learns.

For the topological mapping, unlike in traditional SOM applications, our aim is to let the weight vectors form a uniform distribution like the pre-ordered lattice in the neuron space. Since the neuron space and the weight vector space generally have different dimensions, they must be transformed into a common one. The Gaussian function, being continuous and differentiable, is commonly used as the neighborhood function, and we adopt Gaussian neighborhood functions in both the neuron space and the weight vector space. With these neighborhood functions, the magnitudes of the respective distances in the lattice space and in the weight vector space can be normalized to lie between 0 and 1. The proposed weight updating rule is designed to first let the weight vectors approach the vicinity of the optimal solution set when it falls outside the coverage of the SOM. The weight vector cluster is then moved to the center of the SOM. The process continues until the solution set falls within the SOM. Later, the rule makes the weight vectors converge to a more and more compact cluster centered at the optimal solution.

Figure 3.2 Proposed SOM-based algorithm for optimization.

Figure 3.3 Structure and operation of the SOM in the SOMS.

We then define two Gaussian neighborhood functions, Dj and F(wj(k)), in the kth stage of learning as

Dj = exp(−||rj − rj*||² / (2σd²)) (3.1)

F(wj(k)) = exp(−Σ_{i=1}^{q} (wj,i(k) − wj*,i(k))² / (2σi²)) (3.2)

where rj and rj* stand for the coordinates of neuron j and the winning neuron j* in the network, respectively, σd the standard deviation of the distribution for Dj, and σi the standard deviation of the distribution for wj,i(k). Note that F(wj(k)) is defined by considering the effects from all q elements in wj(k). Here, Dj is used as a reference distribution for F(wj(k)); in other words, we intend to map the magnitude differences of the parameters into the neuron space. To make F(wj(k)) approach Dj, an error function Ej(k) is then defined as

Ej(k) = (1/2)(Dj − F(wj(k)))². (3.3)
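Assuming Gaussian forms for Dj and F(wj(k)) as described (an assumption consistent with the stated normalization to values between 0 and 1, not necessarily the thesis's exact expressions), Eq.(3.3) could be evaluated as:

```python
import numpy as np

def neighborhood_D(r_j, r_winner, sigma_d):
    """Assumed Gaussian neighborhood in the neuron (lattice) space."""
    return np.exp(-np.sum((r_j - r_winner) ** 2) / (2.0 * sigma_d ** 2))

def neighborhood_F(w_j, w_winner, sigma):
    """Assumed Gaussian neighborhood in the weight-vector space,
    accumulating the effects of all q elements of w_j."""
    return np.exp(-np.sum((w_j - w_winner) ** 2 / (2.0 * sigma ** 2)))

def error_E(D_j, F_j):
    """Error function of Eq. (3.3): E_j = (1/2) * (D_j - F(w_j))^2."""
    return 0.5 * (D_j - F_j) ** 2

# Illustrative values (hypothetical lattice coordinates and weights).
D_j = neighborhood_D(np.array([1.0, 0.0]), np.array([0.0, 0.0]), 1.0)
F_j = neighborhood_F(np.array([1.5, 2.0]), np.array([1.0, 2.0]), np.array([1.0, 1.0]))
E_j = error_E(D_j, F_j)
```

Both neighborhood values fall in (0, 1], so the two spaces are compared on a common scale, as the text requires.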

During the learning, we find that when wj,i(k) differs much from w̄i(k), the average of all wj,i(k), the optimal solution is possibly located far outside the estimated range; conversely, when wj,i(k) is close to w̄i(k), the optimal solution is possibly within the estimated range. Based on this observation, we propose a method to speed up the learning. For illustration, we define a Gaussian distribution function G(wj,i(k)) for each element wj,i(k) (the ith element of wj(k)) in the kth stage of learning:

G(wj,i(k)) = exp(−(wj,i(k) − w̄i(k))² / (2σi²)) (3.4)

The strategy is to vary the mean and variance of G(wj,i(k)) by moving its center to where wj,i(k) is located and enlarging (reducing) the variance σi² to σ̃i² = |wj,i(k) − w̄i(k)|², where |·| stands for the absolute value, as illustrated in Figure 3.4. The new distribution function G̃(w̃j,i(k)) is then formulated as

G̃(w̃j,i(k)) = exp(−(w̃j,i(k) − wj,i(k))² / (2σ̃i²)) = exp(−(wj,i(k) − w̄i(k))² / (2σi²)) = G(wj,i(k)) (3.5)

where w̃j,i(k) stands for the new wj,i(k) after the adjustment. As indicated in Figure 3.4, G̃(w̃j,i(k)) is equal to G(wj,i(k)) when wj,i(k) varies to w̃j,i(k). From Eq.(3.5), during each iteration of learning, G(wj,i(k)) is dynamically centered at the location of the winning neuron, with a larger (smaller) width when w̄i(k) differs much (little) from wj,i(k). It thus covers a more fitting neighborhood region and leads to higher learning efficiency.

With G̃(w̃j,i(k)), the new weight w̃j,i(k) is derived as

w̃j,i(k) = (|wj,i(k) − w̄i(k)| / σi) · (wj,i(k) − w̄i(k)) + wj,i(k). (3.6)
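The identity in Eq.(3.5) can be checked numerically against the adjustment of Eq.(3.6); the helper names and sample values below are illustrative:

```python
import numpy as np

def adjust_weight(w_ji, w_mean, sigma_i):
    """New weight of Eq. (3.6): shift w_ji away from the mean w_mean,
    scaled by the normalized distance |w_ji - w_mean| / sigma_i."""
    return abs(w_ji - w_mean) / sigma_i * (w_ji - w_mean) + w_ji

def G(x, center, sigma2):
    """Gaussian distribution function with variance sigma2 (cf. Eq. (3.4))."""
    return np.exp(-(x - center) ** 2 / (2.0 * sigma2))

# Numeric check of Eq. (3.5): with the variance enlarged to
# sigma_tilde^2 = |w_ji - w_mean|^2 and the center moved to w_ji,
# the new distribution evaluated at w_tilde equals the original G(w_ji).
w_ji, w_mean, sigma_i = 5.0, 2.0, 1.5
w_tilde = adjust_weight(w_ji, w_mean, sigma_i)       # 11.0
sigma_tilde2 = (w_ji - w_mean) ** 2                  # 9.0
lhs = G(w_tilde, w_ji, sigma_tilde2)                 # G_tilde(w_tilde)
rhs = G(w_ji, w_mean, sigma_i ** 2)                  # G(w_ji)
# lhs == rhs == exp(-2), confirming the chain of equalities in Eq. (3.5).
```

The farther wj,i(k) sits from the mean, the larger both the shift in Eq.(3.6) and the widened variance, which is exactly the speed-up behavior the text describes.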

And, with the desired new weight w̃j,i(k), the learning should also make wj(k) approach w̃j(k), in addition to minimizing the error function Ej(k) in Eq.(3.3). A new error function Ẽj(k) is thus defined as

Ẽj(k) = (1/2)[(Dj − F(wj(k)))² + ||wj(k) − w̃j(k)||²]. (3.7)

Figure 3.4 Adjustment of the Gaussian distribution function G(wj,i(k)).
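Eq.(3.7) reads directly as code (the function name is illustrative):

```python
import numpy as np

def error_E_tilde(D_j, F_j, w_j, w_tilde):
    """Augmented error of Eq. (3.7): the (D_j - F)^2 matching term plus
    the squared distance from w_j to the desired new weight w_tilde."""
    return 0.5 * ((D_j - F_j) ** 2 + np.sum((w_j - w_tilde) ** 2))

# When w_j already equals w_tilde, only the neighborhood-matching term remains.
E_t = error_E_tilde(0.8, 0.3, np.array([1.0, 2.0]), np.array([1.0, 2.0]))
```

The second term vanishes exactly when wj(k) reaches w̃j(k), so minimizing Ẽj(k) pursues both goals stated above at once.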

Based on Eq.(3.7) and the gradient-descent approach, the weight updating rule is derived as

wj(k+1) = wj(k) + η(k)[(Dj − F(wj(k))) ∂F(wj(k))/∂wj(k) + (w̃j(k) − wj(k))] (3.8)

where η(k) stands for the learning rate in the kth stage of learning, described in Chapter 1.

Together, the weight updating rule in Eq.(3.8) and the learning rate in Eq.(2.2) force the minimization of the difference between the weight vector of the winning neuron and those of all the other neurons in each learning cycle, so the learning will eventually converge.
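A minimal sketch of one gradient-descent step on Eq.(3.7) can illustrate the convergence behavior. The Gaussian F, its gradient, and all constants below are hypothetical stand-ins, not the thesis's exact rule:

```python
import numpy as np

def update_step(w_j, w_tilde, D_j, F_fn, gradF_fn, eta):
    """One gradient-descent step on Eq. (3.7):
    w <- w + eta * [ (D_j - F(w)) * dF/dw + (w_tilde - w) ].
    F_fn and gradF_fn stand in for the neighborhood function and its gradient."""
    F = F_fn(w_j)
    return w_j + eta * ((D_j - F) * gradF_fn(w_j) + (w_tilde - w_j))

# Assumed Gaussian neighborhood around a winning weight vector at the origin.
w_winner = np.array([0.0, 0.0])
sigma = np.array([1.0, 1.0])
F_fn = lambda w: np.exp(-np.sum((w - w_winner) ** 2 / (2.0 * sigma ** 2)))
gradF_fn = lambda w: F_fn(w) * -(w - w_winner) / sigma ** 2

D_j, eta = 0.8, 0.1                 # hypothetical reference value and learning rate
w = np.array([2.0, -1.0])           # initial weight vector
w_tilde = np.array([0.5, 0.5])      # desired new weight from Eq. (3.6)
for _ in range(200):
    w = update_step(w, w_tilde, D_j, F_fn, gradF_fn, eta)
```

The (w̃j − wj) term contracts the weight toward the desired new weight at each cycle, while the (Dj − F)∂F/∂w term nudges it to match the lattice-space neighborhood, which is the combined pull described in the paragraph above.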
