Evaluating performance of the SAG-SEFA

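From the document: A Study on the Design of Neuro-Fuzzy Controllers Using Self-Adaptive Evolutionary Algorithms Based on Improved Safe Reinforcement Learning (pages 116-123)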

Chapter 5 Control Illustration

5.1 Inverted Pendulum Control System

5.1.3 Evaluating performance of the SAG-SEFA

The initial parameters of the proposed ISRL-SAG-SEFA in this example are determined by parameter exploration ([104]). The parameters set for the ISRL-SAG-SEFA are shown in Table 5.9. As shown in Table 5.9, the Minimum_Support used in FP-growth is set to half the number of transactions, following [52]. Minimum_Support affects the number of frequent item sets (the number of suitable group sets). If Minimum_Support is too small, a large number of frequent item sets is generated, which leaves the system unable to identify the superior group sets. Conversely, if Minimum_Support is too large, few frequent item sets are generated, which leaves the system without a sufficient number of suitable group sets from which to pick chromosomes. Based on the experiments in this study and on [52], the performance is satisfactory when Minimum_Support is set to half the number of transactions.
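The effect of the support threshold on the number of frequent item sets can be illustrated with a small sketch. It is not the FP-growth implementation used in this work: it enumerates candidates by brute force (FP-growth returns the same item sets without enumerating every candidate), and the transactions, the group indices, and the function name `frequent_itemsets` are assumptions made only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every item set contained in at least
    `min_support` transactions (brute-force enumeration, illustration only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count >= min_support:
                result[candidate] = count
    return result


# Hypothetical transactions: each one is a set of group indices.
transactions = [
    {1, 2, 3},
    {1, 2, 4},
    {1, 2, 3, 5},
    {2, 3, 5},
    {1, 3},
    {2, 4, 5},
]

# Compare a very small threshold, half the number of transactions,
# and a very large threshold.
for min_support in (1, len(transactions) // 2, len(transactions)):
    found = frequent_itemsets(transactions, min_support)
    print(f"min_support={min_support}: {len(found)} frequent item sets")
```

On this toy data the smallest threshold admits every item set that occurs at all, the largest admits none, and half the number of transactions sits between the two, which is the trade-off described above.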

Table 5.9: The initial parameters of the ISRL-SAG-SEFA before training.

Parameters           Value    Parameters              Value
Psize                16       Stable_TimeSteps        5000
NC                   10       Thres_TimeStep          1000
Selection_Times      200      TSSATimes               30
NormalTimes          10       A                       10
SearchingTimes       15       λ                       0.01
ExploringTimes       20       η                       7
Crossover Rate       0.5      Generations             300
Mutation Rate        0.2      Minimum_Support         TransactionNum/2
ThreadFitnessvalue   550      Thres_StableTimeSteps   500

The coding of a rule in a chromosome takes the form given in Fig. 3.10. The values are floating-point numbers initially assigned using the ISRL-SAG-SEFA. A total of thirty runs were performed in this simulation. Each run started from a different initial state (θ̇ and ẋ are set to 0; θ and x are set randomly within the predefined ranges). The mean, best, and worst values of the optimal number of rules found by the TSSA over the thirty runs are shown in Table 5.10.

Table 5.10: The number of rules from thirty runs of the TSSA.

Method          Mean   Best   Worst
ISRL-SAG-SEFA   5      3      10

The learning curves of the ISRL-SAG-SEFA for the thirty runs are shown in Fig. 5.13. Each of the thirty curves in this figure indicates how soon the TNFC meets the goal state in one run. When the ISRL-SAG-SEFA stops, the best combination of strings from the groups in the final generation is selected and tested on the inverted pendulum control system. The successful results, which consist of the pendulum angle, the angular velocity of the pendulum (in degrees/second), and the velocity of the cart (in meters/second), are shown in Fig. 5.14. Each line in Fig. 5.14 represents one run with a different initial state. The results shown in this figure are the first 1,000 of the 6,000 control time steps (Thres_TimeStep + Stable_TimeSteps). As shown in Fig. 5.14, the ISRL-SAG-SEFA successfully controlled the inverted pendulum control system in all thirty runs (the pendulum angle, pendulum angular velocity, and cart velocity decrease to 0).

As with the ISRL-HEA and ISRL-SACG-SE, we select the best-trained individual of the proposed ISRL-SAG-SEFA from the training phase and extend the control time steps to 100,000 in the testing phase. The simulation results, which consist of the pendulum angle, the pendulum angular velocity, and the cart velocity, are shown in Fig. 5.15. Each line in Fig. 5.15 represents the last 1,000 time steps of a run that starts from a different initial state. As shown in Fig. 5.15, with the proposed ISRL-SAG-SEFA every line meets the control goal G₁ and the pendulum is kept upright during the last 1,000 time steps. Moreover, with the ISRL-SAG-SEFA, the percentage of the last 1,000 time steps in which the plant remains in G₁ is 100%. The ISRL thus confines the pendulum angle, the pendulum angular velocity, and the cart velocity to a small range near zero and stabilizes the control system.
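The "percentage of the last 1,000 time steps in which the plant remains in G₁" can be computed along the lines of the sketch below. The definition of G₁ is not given in this section, so the bounds in `in_goal` are placeholder assumptions, and the trajectory is a synthetic decaying oscillation rather than output of the actual controller.

```python
import math


def percent_in_goal(states, in_goal, last_n=1000):
    """Percentage of the final `last_n` states that lie inside the goal set."""
    tail = states[-last_n:]
    if not tail:
        raise ValueError("trajectory is empty")
    hits = sum(1 for state in tail if in_goal(state))
    return 100.0 * hits / len(tail)


def in_goal(state, theta_bound=1.0, theta_dot_bound=1.0, x_dot_bound=0.1):
    """Hypothetical goal set: all three signals stay within small bounds.
    The bounds are assumptions, not the definition of G1 in the thesis."""
    theta, theta_dot, x_dot = state
    return (abs(theta) <= theta_bound
            and abs(theta_dot) <= theta_dot_bound
            and abs(x_dot) <= x_dot_bound)


# Synthetic trajectory of (angle, angular velocity, cart velocity):
# a decaying oscillation standing in for a successful control run.
states = [
    (10.0 * math.exp(-0.01 * k) * math.cos(0.1 * k),
     5.0 * math.exp(-0.01 * k) * math.sin(0.1 * k),
     0.5 * math.exp(-0.01 * k) * math.cos(0.1 * k))
    for k in range(6000)
]

print(percent_in_goal(states, in_goal))               # last 1,000 steps only
print(percent_in_goal(states, in_goal, last_n=6000))  # whole trajectory
```

Restricting the measure to the final steps separates the transient at the start of a run from the steady-state behavior that the 100% figure describes.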

Figure 5.13: The learning curve of the ISRL-SAG-SEFA.

Figure 5.14: Control results of the inverted pendulum control system using the ISRL-SAG-SEFA in Example 1 (first 1000 time steps). (a) Angle of the pendulum. (b) Angular velocity of the pendulum. (c) Velocity of the cart.

Figure 5.15: Control results of the inverted pendulum control system using the ISRL-SAG-SEFA in Example 1 (last 1000 time steps). (a) Angle of the pendulum. (b) Angular velocity of the pendulum. (c) Velocity of the cart.

In this example, to demonstrate the effectiveness and efficiency of the proposed ISRL-SAG-SEFA, the R-SE ([29]) and R-GA ([26]) are also compared with it. As shown in Fig. 5.7 (d)-(i), the accuracy of the TNFC with the R-SE and R-GA, measured by whether the pendulum stays within the boundary after 6,000 time steps, is 56% and 54%, respectively. With the ISRL-SAG-SEFA, however, the accuracy with which the TNFC meets the control goal and keeps the pendulum balanced over 100,000 time steps is 100%. As shown in Figs. 5.7 and 5.15, the ISRL-SAG-SEFA performs better than the R-SE and R-GA.

The accuracy and CPU time of the ISRL-SAG-SEFA, R-SE, and R-GA are compared in Table 5.11. The ISRL-SAG-SEFA needs less CPU time than the R-SE and R-GA. The reason is that the ISRL adopts a strict restriction in the earlier time steps and evaluates the control system by how soon it meets the control goal. Therefore, the individuals in the ISRL-SAG-SEFA perform well, and the system reaches and remains at the control goal in the earlier time steps. For the same reason, the CPU time of the ISRL-SACG-SE is dramatically less than that of the R-SE and R-GA.

Compared with the SACG-SE, the SAG-SEFA requires less CPU time because it not only considers both cooperation and specialization but also selects suitable groups to perform selection and crossover by using a data-mining method. As shown in Figs. 5.10 and 5.13, the learning curves of the SAG-SEFA converge more quickly than those of the SACG-SE. The worst, mean, best, and standard deviation of the CPU time of the SACG-SE and SAG-SEFA are shown in Table 5.11. As shown in this table, the SAG-SEFA requires less CPU time than the SACG-SE.

GENITOR ([57]), SANE ([96]), TDGAR ([20]), CQGAF ([43]), ERDSE ([44]), and ESP ([40]) have been applied to the same control problem. Their simulation results are listed in Table 5.11, which shows the accuracy and CPU time of each control model. The initial parameters of these methods are determined according to Section 5.1.1. The control time steps for testing are extended to 100,000. As shown in Table 5.11, the proposed ISRL-SAG-SEFA is more feasible and effective than the other existing models ([29], [26], [57], [96], [20], [43], [44], and [40]). The advantages of the ISRL-SAG-SEFA can be listed as follows:

1. Using the TSSA, the ISRL-SAG-SEFA determines a suitable number of fuzzy rules probabilistically, which avoids having to assign the number of fuzzy rules in advance for each environment.

2. The ISRL enhances the stability of the control system through Lyapunov-based safe reinforcement learning, and it is better able to stabilize the plant from different initial states.

3. The group-based symbiotic evolution can evaluate the solution locally.

4. The SAG-SEFA not only considers both cooperation and specialization but also selects suitable groups to perform selection and crossover by using DMSS and DMCS.


Table 5.11: Performance comparison of various existing models in Example 1.

Method            CPU time                            Accuracy
                  Mean     Best    Worst    Std.
GENITOR ([57])    120.95   61.34   320.36   92.95     50%
SANE ([96])       97.56    48.54   254.84   83.56     61%
R-GA ([26])       89.83    34.85   192.93   69.94     54%
R-SE ([29])       73.14    28.66   149.43   57.87     56%
TDGAR ([20])      69.13    26.54   112.73   41.58     53%
ESP ([40])        58.32    22.08   95.57    35.27     56%
ERDSE ([44])      51.19    20.77   88.53    30.74     67%
CQGAF ([43])      48.82    18.79   84.39    26.31     59%
ISRL-HEA          39.97    15.10   71.01    18.23     100%
ISRL-SACG-SE      30.54    10.23   49.21    11.12     100%
ISRL-SAG-SEFA     18.12    5.31    28.51    8.34      100%
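To make the CPU-time gap in Table 5.11 easier to read, the relative reduction in mean CPU time can be computed directly from the table. The sketch below uses only the "Mean" column; the helper name `reduction_percent` is an assumption, and the time unit is whatever unit the table uses, since the ratio does not depend on it.

```python
# Mean CPU time per method, copied from the "Mean" column of Table 5.11.
mean_cpu = {
    "GENITOR": 120.95,
    "SANE": 97.56,
    "R-GA": 89.83,
    "R-SE": 73.14,
    "TDGAR": 69.13,
    "ESP": 58.32,
    "ERDSE": 51.19,
    "CQGAF": 48.82,
    "ISRL-HEA": 39.97,
    "ISRL-SACG-SE": 30.54,
    "ISRL-SAG-SEFA": 18.12,
}


def reduction_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


proposed = mean_cpu["ISRL-SAG-SEFA"]
for method, cpu in mean_cpu.items():
    if method != "ISRL-SAG-SEFA":
        print(f"{method:>14}: {reduction_percent(cpu, proposed):5.1f}% less mean CPU time")
```

For example, relative to the ISRL-SACG-SE the mean CPU time drops from 30.54 to 18.12, a reduction of roughly 41%.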

To demonstrate the efficiency of the proposed TSSA, DMSS, and DMCS, six different methods are compared in this example: the SAG-SEFA using only the TSSA (Type I), SE (Type II), GSE (Type III), the SAG-SEFA (Type IV), the SAG-SEFA using only the TSSA and DMSS (Type V), and the SAG-SEFA using only the TSSA and DMCS (Type VI). In the Type I method, the groups are selected randomly to construct TNFCs with different numbers of rules, and each group performs the crossover strategy independently. In the Type II method, the traditional symbiotic evolution is used. In the Type III method, the group-based symbiotic evolution (GSE) is adopted with a fixed number of rules. In the Type IV method, the SAG-SEFA uses the proposed TSSA, DMSS, and DMCS to perform structure and parameter learning. In the Type V method, the SAG-SEFA uses only the proposed TSSA and DMSS; therefore, each group performs the crossover strategy independently. In the Type VI method, the SAG-SEFA uses only the proposed TSSA and DMCS; therefore, the groups are selected randomly to construct TNFCs with different numbers of rules.

In the six methods, the parameters are set according to [104]. For the Type II and III methods, we determine the number of fuzzy rules by executing each method with a fixed string length for each candidate number of fuzzy rules and then computing the average CPU time. All of these methods are based on the ISRL. The performance (accuracy and CPU time) of the six methods is compared in Table 5.12.

In Table 5.12, comparing Type II with Type III shows that the GSE outperforms the SE, because chromosomes that evaluate the solution locally obtain better performance than a system in which only one population is used to evaluate the solution. Comparing Type I with Type III, however, the Type I method needs less CPU time to balance the control system. The reason is that the TSSA determines a suitable number of fuzzy rules automatically, whereas in the Type III method the number of fuzzy rules is determined by trial-and-error testing. Therefore, the Type III method needs more generations on average than the Type I method. Comparing Type I with Type V shows that DMSS reduces CPU time, because chromosomes from suitable groups can be selected to construct TNFS models with different numbers of rules. Comparing Type I with Type VI shows that DMCS reduces CPU time, because chromosomes from suitable groups can be selected to cooperate in generating better solutions. As shown in Table 5.12, the SAG-SEFA (Type IV) requires less CPU time than the other methods.

Table 5.12: Performance comparison of six different methods in Example 1.

Method      CPU time
            Mean    Best    Worst    Std.
Type I      45.93   16.41   76.87    27.39
Type II     68.26   26.63   131.25   51.43
Type III    58.43   22.18   98.91    38.55
Type IV     18.12   5.31    28.51    8.34
Type V      27.37   12.83   36.55    15.17
Type VI     24.23   9.19    33.71    13.09
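The six variants form an ablation over the three proposed components, which can be written down as a small configuration table and checked against the mean CPU times in Table 5.12. This is only a bookkeeping sketch: the `Variant` class, its field names, and `cpu_time_saved` are assumptions introduced here, while the component flags and numbers follow the descriptions and the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    description: str
    uses_tssa: bool
    uses_dmss: bool
    uses_dmcs: bool
    mean_cpu_time: float  # "Mean" column of Table 5.12


VARIANTS = {
    "I": Variant("SAG-SEFA with TSSA only", True, False, False, 45.93),
    "II": Variant("SE (traditional symbiotic evolution)", False, False, False, 68.26),
    "III": Variant("GSE with a fixed number of rules", False, False, False, 58.43),
    "IV": Variant("SAG-SEFA (TSSA + DMSS + DMCS)", True, True, True, 18.12),
    "V": Variant("SAG-SEFA with TSSA and DMSS", True, True, False, 27.37),
    "VI": Variant("SAG-SEFA with TSSA and DMCS", True, False, True, 24.23),
}


def cpu_time_saved(baseline, variant):
    """Mean CPU time saved when moving from `baseline` to `variant`."""
    return VARIANTS[baseline].mean_cpu_time - VARIANTS[variant].mean_cpu_time


# Order the variants from least to most mean CPU time.
ranking = sorted(VARIANTS, key=lambda key: VARIANTS[key].mean_cpu_time)
print("fastest to slowest:", ranking)

# Effect of adding components on top of the TSSA-only variant (Type I).
for variant, added in (("V", "DMSS"), ("VI", "DMCS"), ("IV", "DMSS + DMCS")):
    saved = cpu_time_saved("I", variant)
    print(f"Type I -> Type {variant} (adds {added}): {saved:.2f} less mean CPU time")
```

Read this way, each component added to the TSSA-only variant lowers the mean CPU time, and the full combination (Type IV) lowers it the most.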
