
Proposed Weight Updating Rule

Figure 3.3 shows the structure and operation of the SOM in the SOMS. The SOM performs two operations: evaluation and search. In Figure 3.3, each neuron j in the SOM contains a vector wj (the weight vector) representing a candidate solution set. Each time new measured data v are sent into the scheme, the SOM is triggered to operate. All of the candidate solution sets in the neurons are then sent to the dynamic model to derive their corresponding data pj. The SOM evaluates the difference between v and each pj and chooses the neuron j* with the smallest difference as the winner. The learning process then continues, and the network eventually converges to the optimal solution. Even when the optimal solution does not lie within the estimated range, the search mechanism is still expected to move the candidates out of their initial locations and guide them to converge to the optimal solution.
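The evaluation step above can be sketched as follows. The dynamic model and the data are stand-ins, since the text does not fix a particular system; `select_winner`, `model`, and the toy numbers are hypothetical names for illustration only:

```python
import numpy as np

def select_winner(v, weights, model):
    """Evaluate every neuron's candidate solution and pick the winner.

    v       : measured data vector.
    weights : (n_neurons, q) array; row j is the candidate solution w_j.
    model   : callable mapping a candidate w_j to its predicted data p_j
              (a stand-in for the dynamic model in the text).
    Returns the index j* of the neuron whose p_j is closest to v.
    """
    errors = [np.linalg.norm(v - model(w)) for w in weights]
    return int(np.argmin(errors))

# Toy example: the "dynamic model" squares the parameters element-wise.
model = lambda w: w ** 2
weights = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]])
v = np.array([4.1, 8.9])          # measured data
j_star = select_winner(v, weights, model)   # neuron 1 predicts [4, 9]
```

The winner's weight vector then seeds the neighborhood-based update described next.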

The main purpose of the proposed SOMS is to explore and exploit the search space so as to obtain an optimal solution for the optimization problem and, furthermore, to make the variations of the weights organized movements. To this end, the SOMS learns to search in an organized and efficient manner, rather than randomly. For effective weight updating in the SOM, the topological neighborhood function and the learning rate need to be properly determined, and their determination may depend on the properties of the system parameters to be learned. As mentioned above, system parameters may operate in quite different working ranges. To achieve high learning efficiency, the weight updating should therefore be executed on an individual basis, instead of using the same neighborhood function for all the parameters.

We thus propose a new SOM weight updating rule that dynamically adjusts the center and width of the respective neighborhood function for each of the system parameters the SOM learns.

For the topological mapping, unlike in traditional SOM applications, our aim is to let the weight vectors form a uniform distribution like the pre-ordered lattice in the neuron space. Since the neuron space and the weight vector space generally have different dimensions, they must be transformed into a common one. The Gaussian function, being continuous and differentiable, is commonly used as the neighborhood function, and we adopt Gaussian neighborhood functions in both the neuron space and the weight vector space. With these neighborhood functions, the magnitudes of the respective distances in the lattice space and in the weight vector space can be normalized to lie between 0 and 1. The proposed weight updating rule is designed to first let the weight vectors approach the vicinity of the optimal solution set when it falls outside the coverage of the SOM. The weight vector cluster is then moved to the center of the SOM. The process continues until the solution set falls within the SOM. Later, the rule makes the weight vectors converge to a more and more compact cluster centered at the optimal solution.

Figure 3.2 Proposed SOM-based algorithm for optimization.

Figure 3.3 Structure and operation of the SOM in the SOMS.

We then define two Gaussian neighborhood functions, Dj and F(wj(k)), in the kth stage of learning as

Dj = exp(−||rj − rj*||² / (2σd²)) (3.1)

F(wj(k)) = exp(−Σ_{i=1}^{q} (wj,i(k) − wj*,i(k))² / (2σi²)) (3.2)

where rj and rj* stand for the coordinates of neuron j and the winning neuron j* in the network, respectively, σd the standard deviation of the distribution for Dj, and σi the standard deviation of the distribution for wj,i(k). Note that F(wj(k)) is defined by considering the effects from all q elements in wj(k). Here, Dj is used as a reference distribution for F(wj(k)); in other words, we intend to map the magnitude differences of the parameters into the neuron space. To make F(wj(k)) approach Dj, an error function Ej(k) is then defined as

Ej(k) = (1/2)(Dj − F(wj(k)))². (3.3)
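Assuming Gaussian forms for Dj and F(wj(k)) as described (an assumption consistent with the stated normalization to values between 0 and 1, not necessarily the thesis's exact expressions), Eq.(3.3) could be evaluated as:

```python
import numpy as np

def neighborhood_D(r_j, r_winner, sigma_d):
    """Assumed Gaussian neighborhood in the neuron (lattice) space."""
    return np.exp(-np.sum((r_j - r_winner) ** 2) / (2.0 * sigma_d ** 2))

def neighborhood_F(w_j, w_winner, sigma):
    """Assumed Gaussian neighborhood in the weight-vector space,
    accumulating the effects of all q elements of w_j."""
    return np.exp(-np.sum((w_j - w_winner) ** 2 / (2.0 * sigma ** 2)))

def error_E(D_j, F_j):
    """Error function of Eq. (3.3): E_j = (1/2) * (D_j - F(w_j))^2."""
    return 0.5 * (D_j - F_j) ** 2

# Illustrative values (hypothetical lattice coordinates and weights).
D_j = neighborhood_D(np.array([1.0, 0.0]), np.array([0.0, 0.0]), 1.0)
F_j = neighborhood_F(np.array([1.5, 2.0]), np.array([1.0, 2.0]), np.array([1.0, 1.0]))
E_j = error_E(D_j, F_j)
```

Both neighborhood values fall in (0, 1], so the two spaces are compared on a common scale, as the text requires.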

During the learning, we find that when wj,i(k) differs much from w̄i(k), the average of all wj,i(k), the optimal solution is possibly located far outside the estimated range; conversely, when wj,i(k) is close to w̄i(k), the optimal solution is possibly within the estimated range. Based on this observation, we propose a method to speed up the learning. For illustration, we define a Gaussian distribution function G(wj,i(k)) for each element wj,i(k) (the ith element of wj(k)) in the kth stage of learning:

G(wj,i(k)) = exp(−(wj,i(k) − w̄i(k))² / (2σi²)) (3.4)

The strategy is to vary the mean and variance of G(wj,i(k)) by moving its center to where wj,i(k) is located and enlarging (reducing) the variance σi² to σ̃i² = |wj,i(k) − w̄i(k)|², where |·| stands for the absolute value, as illustrated in Figure 3.4. The new distribution function G̃(w̃j,i(k)) is then formulated as

G̃(w̃j,i(k)) = exp(−(w̃j,i(k) − wj,i(k))² / (2σ̃i²)) = exp(−(wj,i(k) − w̄i(k))² / (2σi²)) = G(wj,i(k)) (3.5)

where w̃j,i(k) stands for the new wj,i(k) after the adjustment. As indicated in Figure 3.4, G̃(w̃j,i(k)) is equal to G(wj,i(k)) when wj,i(k) varies to w̃j,i(k). From Eq.(3.5), during each iteration of learning, G(wj,i(k)) is dynamically centered at the location of the winning neuron, with a larger (smaller) width when w̄i(k) differs much (little) from wj,i(k). It thus covers a more fitting neighborhood region and leads to higher learning efficiency.

With G̃(w̃j,i(k)), the new weight w̃j,i(k) is derived as

w̃j,i(k) = (|wj,i(k) − w̄i(k)| / σi) · (wj,i(k) − w̄i(k)) + wj,i(k). (3.6)
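The identity in Eq.(3.5) can be checked numerically against the adjustment of Eq.(3.6); the helper names and sample values below are illustrative:

```python
import numpy as np

def adjust_weight(w_ji, w_mean, sigma_i):
    """New weight of Eq. (3.6): shift w_ji away from the mean w_mean,
    scaled by the normalized distance |w_ji - w_mean| / sigma_i."""
    return abs(w_ji - w_mean) / sigma_i * (w_ji - w_mean) + w_ji

def G(x, center, sigma2):
    """Gaussian distribution function with variance sigma2 (cf. Eq. (3.4))."""
    return np.exp(-(x - center) ** 2 / (2.0 * sigma2))

# Numeric check of Eq. (3.5): with the variance enlarged to
# sigma_tilde^2 = |w_ji - w_mean|^2 and the center moved to w_ji,
# the new distribution evaluated at w_tilde equals the original G(w_ji).
w_ji, w_mean, sigma_i = 5.0, 2.0, 1.5
w_tilde = adjust_weight(w_ji, w_mean, sigma_i)       # 11.0
sigma_tilde2 = (w_ji - w_mean) ** 2                  # 9.0
lhs = G(w_tilde, w_ji, sigma_tilde2)                 # G_tilde(w_tilde)
rhs = G(w_ji, w_mean, sigma_i ** 2)                  # G(w_ji)
# lhs == rhs == exp(-2), confirming the chain of equalities in Eq. (3.5).
```

The farther wj,i(k) sits from the mean, the larger both the shift in Eq.(3.6) and the widened variance, which is exactly the speed-up behavior the text describes.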

And, with the desired new weight w̃j,i(k), the learning should also make wj(k) approach w̃j(k), in addition to minimizing the error function Ej(k) in Eq.(3.3). A new error function Ẽj(k) is thus defined as

Ẽj(k) = (1/2)[(Dj − F(wj(k)))² + ||wj(k) − w̃j(k)||²]. (3.7)

Figure 3.4 Adjustment of the Gaussian distribution function G(wj,i(k)).
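Eq.(3.7) reads directly as code (the function name is illustrative):

```python
import numpy as np

def error_E_tilde(D_j, F_j, w_j, w_tilde):
    """Augmented error of Eq. (3.7): the (D_j - F)^2 matching term plus
    the squared distance from w_j to the desired new weight w_tilde."""
    return 0.5 * ((D_j - F_j) ** 2 + np.sum((w_j - w_tilde) ** 2))

# When w_j already equals w_tilde, only the neighborhood-matching term remains.
E_t = error_E_tilde(0.8, 0.3, np.array([1.0, 2.0]), np.array([1.0, 2.0]))
```

The second term vanishes exactly when wj(k) reaches w̃j(k), so minimizing Ẽj(k) pursues both goals stated above at once.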

Based on Eq.(3.7) and the gradient-descent approach, the weight updating rule is derived as

wj(k+1) = wj(k) + η(k)[(Dj − F(wj(k))) ∂F(wj(k))/∂wj(k) + (w̃j(k) − wj(k))] (3.8)

where η(k) stands for the learning rate in the kth stage of learning, described in Chapter 1.

Together, the weight updating rule in Eq.(3.8) and the learning rate in Eq.(2.2) force the minimization of the difference between the weight vector of the winning neuron and those of all the other neurons in each learning cycle, so the learning will eventually converge.
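A minimal sketch of one gradient-descent step on Eq.(3.7) can illustrate the convergence behavior. The Gaussian F, its gradient, and all constants below are hypothetical stand-ins, not the thesis's exact rule:

```python
import numpy as np

def update_step(w_j, w_tilde, D_j, F_fn, gradF_fn, eta):
    """One gradient-descent step on Eq. (3.7):
    w <- w + eta * [ (D_j - F(w)) * dF/dw + (w_tilde - w) ].
    F_fn and gradF_fn stand in for the neighborhood function and its gradient."""
    F = F_fn(w_j)
    return w_j + eta * ((D_j - F) * gradF_fn(w_j) + (w_tilde - w_j))

# Assumed Gaussian neighborhood around a winning weight vector at the origin.
w_winner = np.array([0.0, 0.0])
sigma = np.array([1.0, 1.0])
F_fn = lambda w: np.exp(-np.sum((w - w_winner) ** 2 / (2.0 * sigma ** 2)))
gradF_fn = lambda w: F_fn(w) * -(w - w_winner) / sigma ** 2

D_j, eta = 0.8, 0.1                 # hypothetical reference value and learning rate
w = np.array([2.0, -1.0])           # initial weight vector
w_tilde = np.array([0.5, 0.5])      # desired new weight from Eq. (3.6)
for _ in range(200):
    w = update_step(w, w_tilde, D_j, F_fn, gradF_fn, eta)
```

The (w̃j − wj) term contracts the weight toward the desired new weight at each cycle, while the (Dj − F)∂F/∂w term nudges it to match the lattice-space neighborhood, which is the combined pull described in the paragraph above.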
