
LAGEP: Evolving a Population

First, we recall some symbol definitions. A training instance is an object whose class label is already known. We define a training instance and the training set $T$ as:

$$T = \{x_i \mid 1 \le i \le N\}, \qquad x_i = \{a_{i1}, a_{i2}, \ldots, a_{in}, c_i\},$$

where $a_{ij}$ stands for the $j$-th feature of instance $x_i$, $n$ is the number of features, $N$ is the number of training instances, and $c_i \in C = \{c_1, c_2, \ldots, c_K\}$ is the class label of the instance $x_i$.

3.2.1 Individual Definitions

As mentioned in Chapter 2, a population subject to GP is a set of individuals. We denote a population as $P = \{I_1, I_2, \ldots, I_m\}$, where $I_i$ is an individual and $m$, the number of individuals in $P$, is called the population size. In this work, an individual is a discriminant function represented by a tree structure. The individual tree is composed of elements of an operation set $S_{op}$, a variable set $S_v$, and a constant set $S_c$. Variables are symbolic notations related to features of training instances: $A_i$ stands for the $i$-th feature of a given instance. $S_c$ is a set of predefined constants. We define $S_c$ as ten floating-point numbers in $[0, 1]$ because the attribute values of the classification datasets used in this paper are normalized to $[0, 1]$. (The classification datasets are described in Chapter 4.) $S_{op}$ could contain logarithmic operations or trigonometric functions, but we use only simple arithmetic operations for two reasons. First, Kishore et al. [35] performed experiments showing that the classification accuracy obtained with only simple arithmetic operations is sufficiently high. Second, using simple operations reduces computational cost because individuals generated from a compact operation set are simple and efficient to evaluate. Therefore, $S_{op}$, $S_v$, $S_c$, and $I$ are defined as follows:

$$S_{op} = \{+, -, \times, /\}$$

$$S_v = \{A_1, A_2, \ldots, A_n\}$$

$$S_c = \{0.1, 0.2, 0.3, \ldots, 1.0\}$$

$$I = (S_{op}, S_v, S_c)$$

Note that the division "/" in $S_{op}$ is a protected division: it returns 1.0 when the denominator is zero.

The structure of individuals is that of a binary tree because operations are binary operations. The maximum number of available nodes of an individual is predefined and is called the individual length, denoted as IL.
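As an illustration of the definitions above, the following is a minimal sketch of an individual as a binary expression tree with protected division. The nested-tuple encoding, the feature names written as strings such as "A1", and the helper names `protected_div`, `evaluate`, and `size` are assumptions made for this example, not the representation used in this work.

```python
import operator


def protected_div(a, b):
    """Protected division: return 1.0 when the denominator is zero."""
    return 1.0 if b == 0 else a / b


# Operation set S_op = {+, -, x, /}; every operation is binary.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": protected_div,
}


def evaluate(tree, features):
    """Evaluate a tree on one feature vector.

    Assumed encoding: a float is a constant from S_c, a string "A<k>"
    is the k-th feature (1-based), and a tuple is (op, left, right).
    """
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, features), evaluate(right, features))
    if isinstance(tree, str):
        return features[int(tree[1:]) - 1]
    return float(tree)


def size(tree):
    """Count the nodes of a tree, for comparison with the individual length IL."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1


# Hypothetical individual: (A1 * 0.5) - (A2 / A3)
individual = ("-", ("*", "A1", 0.5), ("/", "A2", "A3"))
print(evaluate(individual, [0.8, 0.2, 0.0]))  # A3 = 0 triggers protected division
print(size(individual))
```

In this sketch a randomly generated tree would be rejected or truncated whenever `size(tree)` exceeds the chosen IL; how that limit is enforced is not specified here.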

3.2.2 Fitness Function

The fitness function FitnessFunc is used to evaluate the fitness of every individual. To perform the training task of a classification problem, one needs to know which class is the target class. The target class is the class label for which one trains the system to find solutions. Training instances are divided into positive instances, which belong to the target class, and negative instances, which do not. For a given training instance $x$, we say of an individual $I_i$:

$I_i$ recognizes training instance $x$ if and only if $I_i(x) \ge 0$;

$I_i$ repels training instance $x$ if and only if $I_i(x) < 0$.

We try to find an individual that recognizes positive instances and repels negative instances. An individual is thus capable of classifying a set of instances. For an individual $I$ and a dataset $S$, we define the function Acc by:

$$\mathrm{Acc}(I, S) = \frac{\text{the number of objects of } S \text{ that are correctly classified by } I}{|S|} \tag{3.1}$$

The fitness function, FitnessFunc, used in this work is defined by

$$f_i = \mathrm{FitnessFunc}(I_i) = \mathrm{Acc}(I_i, T). \tag{3.2}$$

We use such a fitness function for two reasons. First, an accurate discriminant function is desired. Second, FitnessFunc is computed many times, so it should be as simple as possible.
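Equations (3.1) and (3.2) can be sketched as follows. This is an illustrative reading, assuming that an individual is any callable from a feature list to a real number, that an instance is a (features, label) pair, and that "correctly classified" means a positive instance is recognized or a negative instance is repelled. The names `acc`, `fitness`, and the toy data are hypothetical.

```python
def acc(individual, dataset, target_class):
    """Acc(I, S): fraction of instances in S that I classifies correctly."""
    correct = 0
    for features, label in dataset:
        recognized = individual(features) >= 0   # I(x) >= 0: I recognizes x
        is_positive = (label == target_class)    # x belongs to the target class
        if recognized == is_positive:            # recognized positive or repelled negative
            correct += 1
    return correct / len(dataset)


def fitness(individual, training_set, target_class):
    """FitnessFunc(I) = Acc(I, T)."""
    return acc(individual, training_set, target_class)


# Toy example with one feature and target class "c1" (made-up data).
toy_individual = lambda x: x[0] - 0.5
toy_training_set = [([0.9], "c1"), ([0.7], "c1"), ([0.2], "c2"), ([0.6], "c2")]
print(fitness(toy_individual, toy_training_set, "c1"))  # 3 of 4 correct -> 0.75
```

The last instance is a negative instance that the toy individual recognizes, so it is the single misclassification.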

3.2.3 Validation

The overfitting problem occurs when the trained solution adapts excessively to the training set. During the training phase, it is difficult to detect whether overfitting occurs because the test instances are totally unknown. A validation process can be used to avoid overfitting. The validation process uses a set of validation instances, $V$, to check how well individuals generalize. A good individual should perform well on the training set, i.e., have a high fitness value, and also achieve high classification accuracy on the validation set. When all generations have finished, the validation performance of the best individual from each generation can be obtained with $V$. For this purpose, we define the measure $\mathit{score}_i$ of an individual $I_i$ as:

$$\mathit{score}_i = \mathrm{ScoreFunc}(I_i) = f_i + \mathrm{Acc}(I_i, V). \tag{3.3}$$

The population's best individual is the one that has the greatest score; it is denoted as $\Phi$ [75].
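The score of Equation (3.3) and the choice of $\Phi$ can be sketched as below. The code assumes the same conventions as before (an individual is a callable, an instance is a (features, label) pair); `select_best`, the candidate individuals, and the toy sets are hypothetical names and data chosen only to show how validation can separate two individuals with equal fitness.

```python
def acc(individual, dataset, target_class):
    """Acc(I, S): fraction of instances in S classified correctly by I."""
    hits = sum((individual(x) >= 0) == (c == target_class) for x, c in dataset)
    return hits / len(dataset)


def score(individual, training_set, validation_set, target_class):
    """ScoreFunc(I) = f + Acc(I, V), where f = Acc(I, T)."""
    return (acc(individual, training_set, target_class)
            + acc(individual, validation_set, target_class))


def select_best(candidates, training_set, validation_set, target_class):
    """Return the candidate with the greatest score (the individual denoted Phi)."""
    return max(candidates,
               key=lambda ind: score(ind, training_set, validation_set, target_class))


# Two made-up candidates with the same fitness on T but different accuracy on V.
cand_a = lambda x: x[0] - 0.55
cand_b = lambda x: x[0] - 0.5
T = [([0.9], "c1"), ([0.6], "c1"), ([0.3], "c2"), ([0.1], "c2")]
V = [([0.52], "c1"), ([0.2], "c2")]

print(score(cand_a, T, V, "c1"))  # fitness 1.0 + validation accuracy 0.5
print(score(cand_b, T, V, "c1"))  # fitness 1.0 + validation accuracy 1.0
```

Here both candidates have fitness 1.0, so fitness alone cannot rank them; the validation term makes `cand_b` the selected individual.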

3.2.4 Elitism Evolution Strategy and the Evolutionary Flowchart

Many evolutionary strategies have been proposed. In this work we use the elitism strategy, in which only favored individuals survive into the next generation. First, reproduction is no longer governed by probability. Once the fitness evaluation process has finished, a predefined number, $r$, of superior individuals are copied directly into the next generation. In order to maintain the diversity of the population, $r$ should be small.

The crossover operator and the mutation operator are also modified. For crossover, we compare the fitness values of the two parent individuals, $f_i$ and $f_j$, with the fitness values of their two offspring, $f'_i$ and $f'_j$. The best two of these four individuals are inserted into the next population. Similarly, in the case of mutation, only the one of the parent and the mutant with the higher fitness value survives.
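The survivor rules just described can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function names are hypothetical, `fitness` is assumed to be any callable returning a number, and the handling of ties (parents are preferred) is an assumption, since the text does not specify it.

```python
def elites(population, fitness, r):
    """Copy the r fittest individuals directly into the next generation."""
    return sorted(population, key=fitness, reverse=True)[:r]


def survivors_after_crossover(parents, offspring, fitness):
    """Keep the best two among two parents and their two offspring.

    sorted() is stable, so on ties the parents (listed first) are kept;
    this tie rule is an assumption.
    """
    family = list(parents) + list(offspring)
    return sorted(family, key=fitness, reverse=True)[:2]


def survivor_after_mutation(parent, mutant, fitness):
    """Keep whichever of parent and mutant is fitter (parent on ties, assumed)."""
    return mutant if fitness(mutant) > fitness(parent) else parent


# Toy example: individuals are labels, fitness values come from a lookup table.
table = {"p1": 0.70, "p2": 0.60, "o1": 0.80, "o2": 0.55}
fit = table.get

print(survivors_after_crossover(["p1", "p2"], ["o1", "o2"], fit))  # ['o1', 'p1']
print(survivor_after_mutation("p2", "o2", fit))                    # 'p2'
print(elites(list(table), fit, 1))                                 # ['o1']
```

Because an offspring or mutant replaces a parent only when it is strictly fitter in this sketch, the best fitness in the population never decreases from one generation to the next.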

We use the deterministic tournament selection method to select individuals. This method first chooses a number of individuals, called the tournament size, from the population at random and then returns the individual with the highest fitness value.

[Figure 3.1 flowchart. Recoverable steps: initialize P and set g = 1; evaluate fitness values for individuals in P; test g > G (Yes/No); evaluate fitness values of offspring; compare fitness values; evaluate score values for all individuals in P; output.]

Figure 3.1: The flowchart of GP elitism evolution processes. G is the maximum generation, P is the current population, P′ is the population of the next generation, and N is the population size.

The modified GP evolution flowchart using elitism is shown in Figure 3.1.
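Deterministic tournament selection as described in this section can be sketched in a few lines. The sketch assumes that contestants are drawn without replacement and that the population is a list; the function name `tournament_select` and the optional `rng` argument are hypothetical.

```python
import random


def tournament_select(population, fitness, tournament_size, rng=random):
    """Pick `tournament_size` individuals at random and return the fittest.

    Sampling is without replacement (an assumption); `rng` may be the
    `random` module or a seeded `random.Random` instance.
    """
    contestants = rng.sample(population, tournament_size)
    return max(contestants, key=fitness)


# Toy example: individuals are labels, fitness values come from a lookup table.
table = {"I1": 0.62, "I2": 0.75, "I3": 0.58, "I4": 0.91, "I5": 0.70}
population = list(table)

winner = tournament_select(population, table.get, 3, rng=random.Random(0))
print(winner, table[winner])
```

A larger tournament size increases selection pressure: when the tournament size equals the population size, the fittest individual is always returned.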

3.2.5 AMPT: Adaptive Mutation Probability Tuning

The mutation operator is capable of generating individuals with new structures and is mainly used to escape local optima. Given a high probability of mutation, the population tends to generate diverse individuals instead of exploiting the solutions present in existing individuals. Moreover, a high probability of mutation turns the GP system into a random search model, which has difficulty converging and generating stable results. Such problems rarely arise, however, because the probability of mutation is usually much lower than the probability of crossover. As a result, there are insufficient opportunities for an individual to be mutated, so the diversity of the population is limited. We denote the probabilities of executing crossover and mutation by $p_c$ and $p_m$, respectively. Since there is no guideline for choosing $p_c$ and $p_m$, we propose a method called adaptive mutation probability tuning (AMPT) that raises $p_m$ so that the mutation operator is applied more frequently as the generation number increases.

The AMPT method tunes $p_m$ according to fitness values. When individuals have similar fitness values, AMPT is triggered to increase $p_m$. Otherwise, the population uses the initially given $p_m$. Moreover, because $p_m + p_c = 1$, increasing $p_m$ implies decreasing $p_c$. AMPT is performed every generation. At generation $g$, AMPT takes the remaining generations into account and changes $p_m$ and $p_c$ by:

(pm, pc) =

where $G$ is the maximum generation, $f_{MAX}$ is the fitness value of the best individual in generation $g$, and $f_{AVERAGE}$ is the average fitness value of all individuals in generation $g$. With this formula, $p_m$ increases smoothly and reaches a value of $\frac{1}{2}$ at the final generation, which means that the mutation operator and the crossover operator have the same chance of being selected.

We draw the curve of $p_m$ in Figure 3.2 under the following conditions: $G = 100$; initially $(p_m, p_c) = (0.05, 0.95)$; and $f_{MAX}$ never exceeds $2 f_{AVERAGE}$ during these 100 generations. The curve shows that $p_m$ increases smoothly and ends at 0.5 at the 100th generation.

Figure 3.2: The growth curve of $p_m$ using AMPT.
