
Chapter 1 Introduction

1.4 Organization of Dissertation

This dissertation is divided into six chapters. Chapter 1 introduces the motivation, related work, approach, and organization of the dissertation. Chapter 2 provides the fundamental information used in the dissertation. The foundation includes regularized least squares method, neural fuzzy network, cooperative coevolutionary learning, 2D image alignment, and 3D image alignment. In Chapter 3, RGLS-HCCA is described. RGLS-HCCA consists of the RGLS method and the two-level evolutions: parameter level evolution and structure level

image alignment tasks, respectively. In Chapter 6, the conclusions and future work of the dissertation are discussed.

Chapter 2 Foundations

This chapter introduces three major background topics: cooperative coevolutionary learning, 2D image alignment, and 3D image alignment. For cooperative coevolutionary learning, the typical SANE method is used to illustrate how evolutionary learning is performed. For 2D and 3D image alignment, the alignment procedures are described and the results of general 2D and 3D image alignment methods are briefly presented.

This chapter is divided into five subsections. The concepts of the regularized least squares method and the neural fuzzy network are introduced in Sections 2.1 and 2.2, respectively. In Section 2.3, the general method of cooperative coevolutionary learning is described. Sections 2.4 and 2.5 discuss how to perform 2D and 3D image alignment tasks.

2.1 Regularized Least Squares Method

Before discussing the regularized least squares method, the least squares method is introduced.

Given a target vector y and a data matrix X, the most popular loss function used for regression problems is the residual sum of squared errors (RSS):

$$\mathrm{RSS} = \lVert Xw - y \rVert_2^2. \qquad (2.1)$$

The least squares method chooses w to minimize this expression. Differentiating Eq. (2.1) with respect to w yields:

the numerical instability of the matrix inversion. The method of regularization adds a positive constant to the diagonals of $X^T X$ to make the matrix nonsingular. Thus, the expression of Eq. (2.3) can be switched to:

$$w = (X^T X + \lambda I)^{-1} X^T y, \qquad (2.4)$$

where λ is a regularization parameter. Since Eq. (2.4) is used to solve the least squares problem, Tikhonov regularization is called regularized least squares [62], which is also called damped least squares [63-65]. Moreover, to distinguish it from the abbreviation of recursive least squares (RLS), this dissertation follows [66] in abbreviating regularized least squares as RGLS.

Besides RGLS, the pseudo-inverse is another way to handle the case in which the matrix $X^T X$ is singular. Thus, in the experimental results, this dissertation compares regularized least squares with the pseudo-inverse.
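As a minimal sketch of the two options discussed in this section, the following Python code solves the same rank-deficient problem with Eq. (2.4) and with the pseudo-inverse. The function names `rgls_solve` and `pinv_solve`, the toy data, and the value of λ are illustrative assumptions and are not taken from the dissertation.

```python
import numpy as np

def rgls_solve(X, y, lam):
    """Regularized least squares: w = (X^T X + lam * I)^(-1) X^T y, as in Eq. (2.4)."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    # Solve the linear system instead of forming the explicit inverse.
    return np.linalg.solve(A, X.T @ y)

def pinv_solve(X, y):
    """Minimum-norm least squares solution via the Moore-Penrose pseudo-inverse."""
    return np.linalg.pinv(X) @ y

# Hypothetical data: the third column duplicates the first, so X^T X is singular.
X = np.array([[1.0, 2.0, 1.0],
              [2.0, 0.0, 2.0],
              [3.0, 1.0, 3.0],
              [4.0, 5.0, 4.0]])
y = np.array([4.0, 4.0, 7.0, 13.0])

w_rgls = rgls_solve(X, y, lam=1e-3)  # lam is an arbitrary small positive constant
w_pinv = pinv_solve(X, y)
print("RGLS:", w_rgls)
print("pinv:", w_pinv)
```

With a small λ the two solutions should nearly coincide on this example; larger values of λ shrink the RGLS weights further toward zero.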

2.2 Neural Fuzzy Network

In Lin and Peng’s work [2], two typical types of neural fuzzy network (NFN) are distinguished: the Mamdani-type [5] and the TSK-type [4]. The authors of [6] and [67] have shown that the TSK-type NFN can offer better network size and learning accuracy than the Mamdani-type NFN. Thus, in this dissertation, only the TSK-type NFN is introduced, and this NFN is applied to image alignment applications.

A TSK-type neuro-fuzzy network (TNFN) [4] employs a linear combination of the crisp inputs as the consequent part of a fuzzy rule. The fuzzy rule of the TSK-type neural fuzzy system is shown in Eq. (2.5), where n is the dimension of the input and j is the index of the fuzzy rule.

IF $x_1$ is $A_{1j}(m_{1j}, \sigma_{1j})$ and $x_2$ is $A_{2j}(m_{2j}, \sigma_{2j})$ and … and $x_n$ is $A_{nj}(m_{nj}, \sigma_{nj})$

THEN $y' = w_{0j} + w_{1j} x_1 + \dots + w_{nj} x_n$. (2.5)

The structure of the TNFN is shown in Fig. 2.1, where n represents the dimension of the input. It is a five-layer network structure. The functions of the nodes in each layer are described as follows:

Layer 1 (input node): Each node in this layer is called an input linguistic node and corresponds to one linguistic variable. These nodes only pass the input signal to the next layer.

$$u_i^{(1)} = x_i, \qquad (2.6)$$

where $u_i^{(1)}$ denotes the ith node’s input in the first layer and $x_i$ denotes the ith input dimension. The number of nodes in this layer is the dimension of the input vector.

Layer 2 (membership function node): each node in this layer acts as a Gaussian membership function, and its output value specifies the degree to which the given input value belongs to a fuzzy set. Thus, the membership value in layer 2 can be calculated by:

[ ]

the width of the Gaussian membership function of the jth term of the ith input variable $x_i$, respectively. In this dissertation, the Gaussian membership function is adopted because it can serve as a universal approximator of any nonlinear function [6]. In addition, the number of nodes in this layer is the dimension of the input vector multiplied by the number of fuzzy rules.

Layer 3 (rule node): The output in this layer is used to perform precondition matching of fuzzy rules. In the TNFN, the firing strength of a fuzzy rule is calculated by performing the

the fuzzy rule and it can be written by:

where the summation is the consequent part and $w_{ij}$ are its corresponding parameters. The number of nodes in this layer is the dimension of the output vector multiplied by the number of fuzzy rules.

Layer 5 (output node): The node in this layer computes the output signal. The output node integrates the links connected to it and acts as a defuzzifier with:

, rule node, and M is the number of fuzzy rules. The number of nodes in this layer is the dimension of the output vector.

Figure 2-1: Structure of TNFN.
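The five layers described above can be sketched as a single forward pass. Because the layer equations are not legible in this copy, the sketch assumes commonly used forms: a Gaussian membership of the form exp(-(x - m)^2 / σ^2), a product for the firing strength, and a weighted-average defuzzifier. These choices and the name `tnfn_forward` are assumptions, not the dissertation's definitions.

```python
import math

def tnfn_forward(x, means, sigmas, weights):
    """Forward pass of a TSK-type neural fuzzy network with M rules (sketch).

    x:       list of n inputs
    means:   M lists of n Gaussian centers m_ij
    sigmas:  M lists of n Gaussian widths sigma_ij
    weights: M lists of n + 1 consequent weights [w_0j, w_1j, ..., w_nj]
    """
    firing, consequents = [], []
    for m_j, s_j, w_j in zip(means, sigmas, weights):
        # Layer 2: Gaussian membership degree (assumed form).
        memberships = [math.exp(-((xi - m) ** 2) / (s ** 2))
                       for xi, m, s in zip(x, m_j, s_j)]
        # Layer 3: firing strength, assumed to be the product of the memberships.
        firing.append(math.prod(memberships))
        # Layer 4: linear TSK consequent of Eq. (2.5).
        consequents.append(w_j[0] + sum(w * xi for w, xi in zip(w_j[1:], x)))
    # Layer 5: assumed weighted-average defuzzification over the M rules.
    total = sum(firing)
    return sum(f * c for f, c in zip(firing, consequents)) / total

# Hypothetical two-rule network with one input.
output = tnfn_forward([0.0],
                      means=[[-1.0], [1.0]],
                      sigmas=[[1.0], [1.0]],
                      weights=[[2.0, 0.0], [4.0, 0.0]])
print(output)
```

In this example the input lies midway between the two rule centers, so both rules fire equally and the output is the average of the two constant consequents.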

2.3 Cooperative Coevolutionary Learning

Evolutionary algorithms (EAs) are methods for solving difficult problems using notions of Darwinian evolution. EAs have been applied to many applications, and their major benefit over traditional local search methods is their parallel search ability. However, EAs have difficulty in scaling to large problem domains. To solve this problem, researchers have extended EAs to cooperative coevolutionary algorithms (CCEAs). Instead of solving the entire problem at once, cooperative coevolutionary learning reduces the complexity of difficult problems through modularization. In other words, a difficult complete problem can be divided into small, simple problems. In CCEAs, each individual represents only a partial solution, and a full solution is built by cooperating with other partial solutions. Thus, each individual can be evolved locally and recombined with other well-performing individuals to form a good total solution.

Symbiotic adaptive neuroevolution (SANE) is one of the typical CCEAs. In SANE, partial solutions can be viewed as specializations; that is, each partial solution specializes toward one aspect of the full solution. Regarding fitness evaluation, the fitness of an individual is calculated by summing the fitness of all combinations of that individual with other individuals and dividing by the total number of combinations. Thus, the fitness value reflects the average value of the combined full solutions. Fig. 2.2 presents the basic steps of SANE. As shown in this figure, SANE has nine steps, which are described as follows.

Step 1. Initialization: in this step, all fitness values are cleared and all genes of individuals are

Step 5. Selection times check: each individual must be selected a sufficient number of times. If this condition is not satisfied, go to Step 2 to continue the selection step.

Step 6. Fitness assignment: the average fitness value of an individual is computed by dividing the total fitness value of each chromosome by the number of times that it has been selected to build networks.

Step 7. Termination check: the fitness value is checked with respect to the whole network, not a single individual. If the fitness value of the whole network satisfies the preset value, SANE terminates.

Step 8. Crossover: a one-point crossover strategy is used to exchange the site values between the selected sites of the parent individuals to create new individuals, which are offspring inheriting the parents’ merits.

Step 9. Mutation: in the last step, each gene is mutated at a rate of 0.1%, with the new value drawn randomly from the domain of the corresponding variable. Then go to Step 2 to perform selection.

Figure 2-2: Basic Steps of SANE.
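The fitness averaging of Step 6 and the genetic operators of Steps 8 and 9 can be sketched as follows. This is a simplified illustration, not the original SANE implementation: the steps that are missing from this copy are replaced by uniform random selection over a fixed number of trials, and the names `sane_generation`, `one_point_crossover`, `mutate`, and `evaluate_network` are hypothetical.

```python
import random

def sane_generation(population, evaluate_network, neurons_per_network, trials, rng):
    """One evaluation pass; returns the average fitness of every individual."""
    total_fitness = [0.0] * len(population)   # Step 1: clear all fitness values
    times_selected = [0] * len(population)
    for _ in range(trials):
        # Assumed selection: pick partial solutions uniformly at random
        # and combine them into one full solution (a network).
        chosen = rng.sample(range(len(population)), neurons_per_network)
        score = evaluate_network([population[i] for i in chosen])
        # Credit every participant with the score of the network it joined.
        for i in chosen:
            total_fitness[i] += score
            times_selected[i] += 1
    # Step 6: average fitness = total fitness / number of times selected.
    return [t / c if c else 0.0 for t, c in zip(total_fitness, times_selected)]

def one_point_crossover(parent_a, parent_b, rng):
    """Step 8: exchange the tails of two equal-length parents at a random site."""
    site = rng.randrange(1, len(parent_a))
    return parent_a[:site] + parent_b[site:], parent_b[:site] + parent_a[site:]

def mutate(chromosome, low, high, rate, rng):
    """Step 9: redraw each gene from [low, high] with probability `rate`."""
    return [rng.uniform(low, high) if rng.random() < rate else g for g in chromosome]

rng = random.Random(0)
population = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(10)]
# Hypothetical evaluation: a stand-in score, higher is better.
fitness = sane_generation(population, lambda net: -sum(abs(g) for ch in net for g in ch),
                          neurons_per_network=3, trials=50, rng=rng)
child_a, child_b = one_point_crossover(population[0], population[1], rng)
child_a = mutate(child_a, -1.0, 1.0, rate=0.001, rng=rng)
```

The fixed `trials` count stands in for the check of Step 5; a fuller version would keep selecting until every individual has been chosen often enough.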

Although SANE can obtain better performance than traditional evolutionary approaches, it still has the problem that the algorithm cannot evaluate each partial solution independently.

More specifically, SANE uses only one population to evaluate every partial solution, which causes the partial solutions to become too similar. Therefore, the algorithm may have less chance of obtaining the optimal solution. To this end, MGCSE [15], a previous evolutionary algorithm similar to ESP, was proposed for evolving TSK-type neural fuzzy networks. Compared to SANE, MGCSE provides several groups to evaluate each partial solution. Each group in MGCSE consists of the set of chromosomes that belong to one partial solution. In MGCSE, the population consists of several sub-populations, and each sub-population represents the set of chromosomes that belong to one fuzzy rule. The structure of the chromosome is shown in Fig. 2.3. In this figure, each fuzzy rule represents a chromosome that is selected from a group, Psize indicates that there are Psize groups in a population, and “Mk” indicates that Mk fuzzy rules are used to construct a TSK-type neural fuzzy network.

Gaussian membership function with mean and deviation, respectively, and $w_{ji}$ is the weight with ith dimension and jth rule node.

[Chromosome layout: $m_{1j}$, $\sigma_{1j}$, $m_{2j}$, $\sigma_{2j}$, …, $m_{nj}$, $\sigma_{nj}$, $w_{j0}$, $w_{j1}$, $w_{j2}$, …, $w_{jn}$]

Figure 2-4: Coding a fuzzy rule of a TNFN into a chromosome in MGCSE.
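The chromosome layout in Fig. 2.4 can be illustrated with a short encoding and decoding sketch. It assumes a flat Python list as the chromosome and interleaved (mean, deviation) pairs followed by the n + 1 weights; the function names `encode_rule` and `decode_rule` are hypothetical.

```python
def encode_rule(means, sigmas, weights):
    """Flatten one fuzzy rule into [m_1j, s_1j, ..., m_nj, s_nj, w_j0, ..., w_jn]."""
    if len(means) != len(sigmas) or len(weights) != len(means) + 1:
        raise ValueError("expected n means, n sigmas, and n + 1 weights")
    genes = []
    for m, s in zip(means, sigmas):
        genes.extend([m, s])   # antecedent genes, interleaved per input dimension
    genes.extend(weights)      # consequent genes follow the antecedent genes
    return genes

def decode_rule(genes):
    """Recover (means, sigmas, weights) from a chromosome of length 3n + 1."""
    n, remainder = divmod(len(genes) - 1, 3)
    if remainder != 0:
        raise ValueError("chromosome length must be 3n + 1")
    means = genes[0:2 * n:2]
    sigmas = genes[1:2 * n:2]
    weights = genes[2 * n:]
    return means, sigmas, weights

# Hypothetical rule with n = 2 inputs.
chromosome = encode_rule([0.1, 0.5], [1.0, 2.0], [0.3, -0.2, 0.7])
print(chromosome)
print(decode_rule(chromosome))
```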

However, MGCSE has difficulty in scaling to more complex tasks or networks with a high input dimension, suffers from the random group selection of fuzzy rules, and loses potential combinations of fuzzy rules. To address the loss of potential fuzzy rule combinations, Gomez proposed HESP. Nevertheless, HESP suffers from the problems that the lengths of the chromosomes must be the same and the number of neurons has to be assigned in advance. To this end, this dissertation proposes RGLS-HCCA to address the above-mentioned problems.

2.4 2D Image Alignment

In this subsection, a 2D image alignment task is introduced. Image alignment can be viewed as a mapping between two images by means of a geometric transformation. Typical geometric transformations include affine, similarity, and projective transformations. Among them, the affine transformation is the most commonly used type, and it is composed of translation, rotation, and scaling. Thus, this dissertation adopts the affine transformation as the transformation model. Figure 2.5 shows an example of a remote controller with different transformation parameters. Fig. 2.5 (a) shows the reference image with which the other transformed images are to be aligned. In other words, if the pose of the transformed image is known, the transformed image can be recovered to the original pose of the reference image by reversing the pose. Thus, the 2D image alignment task defined in this dissertation is to align transformed images with the reference image.

Figure 2-5: Example of generating training images with different affine transformation: (a) reference image, (b) translation, (c) clockwise rotation, and (d) counterclockwise rotation.
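To make the idea of reversing a known pose concrete, the following sketch applies a transformation composed of scaling, rotation, and translation to a point and then undoes it. The parameterization (uniform scale, rotation angle in radians, translation applied last) and the names `apply_pose` and `reverse_pose` are assumptions made for illustration.

```python
import math

def apply_pose(point, tx, ty, theta, scale):
    """Map a reference point to the transformed image: p' = scale * R(theta) * p + t."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + tx,
            scale * (s * x + c * y) + ty)

def reverse_pose(point, tx, ty, theta, scale):
    """Undo the pose: p = R(-theta) * (p' - t) / scale."""
    x, y = point[0] - tx, point[1] - ty
    c, s = math.cos(theta), math.sin(theta)
    return ((c * x + s * y) / scale,
            (-s * x + c * y) / scale)

# Hypothetical pose: 15-degree rotation, 10% enlargement, shift of (5, -3) pixels.
pose = dict(tx=5.0, ty=-3.0, theta=math.radians(15.0), scale=1.1)
moved = apply_pose((20.0, 40.0), **pose)
recovered = reverse_pose(moved, **pose)
print(moved, recovered)
```

Applying `reverse_pose` to every pixel coordinate of a sensed image, with interpolation, would bring it back to the pose of the reference image once the pose parameters have been estimated.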

Since industrial inspection tasks are assumed, area-based alignment methods that adopt global descriptors are recommended. Thus, this study focuses on developing a good area-based alignment method. Figure 2.6 illustrates a typical procedure of an area-based 2D image alignment system. As shown in this figure, the sensed image is sent to the descriptor to extract the feature. The feature is then fed into a pose estimation block to estimate the pose with respect to the reference image. Finally, the estimated affine transformation parameters are used to align the sensed image with the reference image. Hence, obtaining accurate affine transformation parameters is the most important step in aligning images.

Figure 2.7 illustrates an example of aligning 2D images, where (a) is a reference image, (b) is an input image, and (c) is the alignment result of the neural network based alignment scheme defined in [44]. In Fig. 2.7 (c), the cross sign denotes the estimated result of Sarnel’s work [44]; the location of this cross sign shows that the alignment result is not good enough. The major drawback of such approaches is that they are difficult to apply when images must be aligned over a large range of affine transformations. Thus, this dissertation proposes a CNFN-based 2D image alignment method to perform coarse-to-fine alignment of the sensed image and the reference image.

Figure 2-7: Example of 2D alignment: (a) reference image, (b) affine-transformed image, and (c) alignment results of the neural network based scheme.

2.5 3D Image Alignment

The 3D image defined in this dissertation is a range image scanned by an imaging laser scanner. Each pixel in the range image holds a range value that indicates the distance from the sensed point to the scanner. In other words, the range data can be considered as 3D points with respect to the scanner. Thus, the scanner can serve as the center of a coordinate system in which each sensed range value is represented. Figure 2.8 presents an example of the range image, the intensity image, and the 3D point cloud data. In this figure, the range image uses a color bar to represent the range data. The intensity image, which is also generated by the imaging laser scanner, serves as the corresponding map of the range image. The 3D point cloud data, which is created by transforming the range data to Cartesian coordinates, shows the 3D position of each pixel.

Figure 2.9 illustrates the procedure of a 3D image alignment task. As shown in this figure, the 3D scene is scanned by a 3D imaging laser scanner, where the size of the scanned scene is 256×256 with a 20-degree field of view. The region of interest (ROI) is extracted by using the segmentation algorithm described in [68]. The reference model is the target 3D surface with which the ROI is to be aligned. Thus, the purpose of the 3D image alignment task is to align the ROI with the reference model.
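The conversion from range data to Cartesian coordinates can be sketched as below. The scanner geometry is not given in this section, so the sketch assumes that the pixels sample azimuth and elevation angles evenly across the field of view, with the scanner at the origin looking along the z axis. The function name `range_image_to_points` and this angular model are assumptions.

```python
import math

def range_image_to_points(range_image, fov_deg=20.0):
    """Convert a range image (rows of distances) into a list of (x, y, z) points."""
    rows, cols = len(range_image), len(range_image[0])
    fov = math.radians(fov_deg)
    points = []
    for r in range(rows):
        # Assumed: elevation runs from +fov/2 (top row) to -fov/2 (bottom row).
        elev = fov / 2 - fov * (r + 0.5) / rows
        for c in range(cols):
            # Assumed: azimuth runs from -fov/2 (left column) to +fov/2 (right column).
            azim = -fov / 2 + fov * (c + 0.5) / cols
            d = range_image[r][c]
            x = d * math.cos(elev) * math.sin(azim)
            y = d * math.sin(elev)
            z = d * math.cos(elev) * math.cos(azim)
            points.append((x, y, z))
    return points

# Hypothetical 2x2 range image with every point 1.5 units from the scanner.
cloud = range_image_to_points([[1.5, 1.5], [1.5, 1.5]])
print(cloud)
```

By construction, the Euclidean norm of each output point equals the range value of its pixel, which matches the description of a range value as the distance from the sensed point to the scanner.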

Figure 2-8: Example of 3D image (panels: intensity image, range image, and the 3D point cloud obtained by coordinate transformation).

Figure 2-9: Procedure of a 3D surface alignment task (blocks: 3D scene, Segmentation, ROI, Align, Reference Model, Alignment Result).

According to Chapter 1, a coarse-to-fine technique is a useful way to perform 3D image alignment tasks. For coarse image alignment, the common methods in [45] and [46] use PCA [51] to coarsely align two images because of its high speed. For traditional fine alignment, iterative closest point (ICP) [52] is a typical method that iteratively calculates the rigid-body transformation that minimizes the cost function.

Figure 2.10 illustrates an example of aligning input 3D point data with a reference model using PCA. In this figure, (a) and (b) show the principal axes of a 3D reference model and of the input 3D point data, respectively. Figure 2.10 (c) depicts the alignment result of the PCA method. Fig. 2.10 (a)-(c) show that, because the laser-scanned input 3D data is partial, its principal axes are skewed with respect to those of the 3D reference model, which results in the large alignment error of the PCA method (see Fig. 2.10 (c)). Based on this fact, this dissertation proposes a TNFN-based coarse alignment method that uses pose estimation in place of aligning the principal axes.
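A minimal sketch of PCA-based coarse alignment is given below, assuming NumPy and point clouds stored as N×3 arrays. It aligns centroids and principal axes only; the function names are hypothetical, and the sketch does not resolve the sign ambiguity of the eigenvectors, which is one more reason the result is only coarse.

```python
import numpy as np

def principal_axes(points):
    """Return the centroid and principal axes (columns, largest variance first)."""
    centroid = points.mean(axis=0)
    cov = np.cov((points - centroid).T)
    _, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    axes = eigvecs[:, ::-1].copy()         # reorder to descending variance
    if np.linalg.det(axes) < 0:            # keep a right-handed frame
        axes[:, -1] *= -1
    return centroid, axes

def pca_align(source, reference):
    """Rotate and translate `source` so its principal frame matches `reference`."""
    c_src, a_src = principal_axes(source)
    c_ref, a_ref = principal_axes(reference)
    rotation = a_ref @ a_src.T             # maps source axes onto reference axes
    return (source - c_src) @ rotation.T + c_ref

# Hypothetical data: an anisotropic cloud and a rotated, shifted copy of it.
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 3)) * np.array([3.0, 2.0, 1.0])
angle = np.radians(30.0)
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle),  np.cos(angle), 0.0],
              [0.0, 0.0, 1.0]])
source = reference @ R.T + np.array([1.0, -2.0, 0.5])
aligned = pca_align(source, reference)
```

When the input covers only part of the reference surface, its covariance differs from that of the reference, so the estimated axes differ as well; this is the skew described for Fig. 2.10.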

Figure 2.11 illustrates an example of ICP fine alignment, where (a) is the initial alignment yielded by PCA coarse alignment and (b) is the final fine alignment performed by ICP. Although ICP can obtain a good fine alignment result, its heavy computational cost in searching for corresponding points is a problem. To this end, this dissertation proposes a TNFN-based fine alignment method that combines surface modeling and the downhill simplex optimization method to alleviate the problem.
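The cost of the correspondence search is easiest to see in a basic point-to-point ICP sketch. The version below is a textbook-style illustration under stated assumptions, not the implementation of [52]: it uses a brute-force nearest-neighbor search, an SVD-based rigid transform estimate, a fixed iteration count, and hypothetical function names.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping paired src points onto dst."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(source, reference, iterations=20):
    """Iteratively match each source point to its closest reference point and re-fit."""
    current = source.copy()
    for _ in range(iterations):
        # Brute-force closest-point search: N x M squared distances per iteration.
        d2 = ((current[:, None, :] - reference[None, :, :]) ** 2).sum(axis=2)
        matches = reference[d2.argmin(axis=1)]
        R, t = best_rigid_transform(current, matches)
        current = current @ R.T + t
    return current

# Hypothetical data: a 4x4x4 grid and a slightly rotated, shifted copy of it.
grid = np.array([[x, y, z] for x in range(4) for y in range(4) for z in range(4)],
                dtype=float)
a = np.radians(2.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
moved = grid @ Rz.T + np.array([0.05, -0.03, 0.02])
refined = icp(moved, grid)
```

Each iteration computes all pairwise distances, so the search cost grows with the product of the two point counts. ICP also needs a reasonable initial alignment, which is why a coarse step precedes it.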

Figure 2-10: Example of coarse alignment using PCA: (a) the principal axes of the reference model, (b) the principal axes of the input 3D data, and (c) alignment results of the PCA method.

Figure 2-11: Example of fine alignment using ICP: (a) the initial alignment yielded by PCA and (b) alignment results of the ICP method.

Chapter 3

Regularized Least Squares Based Hierarchical Cooperative Coevolutionary Algorithm

The learning process of RGLS-HCCA is shown in Fig. 3.1. As shown in this figure, RGLS-HCCA involves two major evolutions: parameter level evolution (PLE) and structure level evolution (SLE). The blocks for inserting good networks and inserting good neurons (i.e., good fuzzy rules) form the connection between the parameter and structure level evolutions. These two operations indicate that good results evolved at one level are transferred to the other level. Once good neurons or networks are received, the received chromosomes are mated with the existing chromosomes to yield new offspring. Therefore, by exchanging good information between the two levels of evolution, there is a greater chance of finding the globally optimal solution.

This chapter is divided into two subsections to introduce the proposed two-level evolution. In Section 3.1, parameter level evolution is discussed. Section 3.2 describes how structure evolution works.

Figure 3-1: Learning process of RGLS-HCCA.

3.1 Parameter Level Evolution

In this subsection, the parameter level evolution (PLE) is discussed. PLE aims to determine automatically not only the suitable fuzzy rules of a TNFN but also the suitable individuals used to construct a TNFN. Regarding the former aim, PLE uses a self-regulated mechanism (SRM) to determine the number of fuzzy rules automatically. SRM uses a probability vector to represent the suitability of TNFNs with different numbers of fuzzy rules. In Fig. 3.2, SRM codes the probability vector $P_{M_k}$ to represent the suitability of a TNFN with $M_k$ rules, where the number of fuzzy rules is limited to a certain bound, i.e., $[M_{min}, M_{max}]$. After the SRM is carried out, the probability of a suitable number of fuzzy rules in a TNFN increases, and the probability of an unsuitable number of fuzzy rules decreases. Therefore, the number of fuzzy rules is self-regulated. Regarding the latter aim, although SRM can determine the suitable number of rules, there is still a need to identify the suitable groups from which individuals are selected to construct a TNFN. More specifically, the well-performing groups of individuals should cooperate to produce a better generation than the current one. To address this issue, this study proposes a data-mining based selection method (DMSM) to determine which groups should be used to select individuals.

The DMSM involves two major parts, namely, finding frequent patterns and mining association rules. Regarding the former, the FP-growth algorithm [27] is used to find frequent patterns without candidate generation. Regarding the latter, association rules are identified by using the confidence value. In DMSM, FP-growth is used to find the sets of groups that occur frequently in the transactions. In this dissertation, a “transaction” refers to a collection of groups that have good or bad performance. After the candidate sets of frequently occurring groups are found, DMSM identifies the association rules by setting a suitable confidence threshold and uses the found association rules to determine the $M_k$ groups from which the $M_k$ chromosomes that form a TNFN with $M_k$ rules are selected. To this end, two actions are defined in this study: the normal action and the explore action. In the normal action, $M_k$ groups are chosen randomly. In the explore action, $M_k$ groups are chosen according to the association rules. These two actions are discussed in the procedures of PLE.
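Two ingredients of this subsection can be sketched in a few lines: drawing the number of rules from the probability vector of the SRM, and computing the confidence of an association rule between groups. How the probability vector is updated and how the explore action turns rules into a group choice are not specified at this point, so the sketch covers neither. The function names and the toy transactions are hypothetical.

```python
import random

def sample_rule_count(prob_vector, m_min, rng):
    """Draw M_k from [m_min, m_min + len(prob_vector) - 1] in proportion to P_Mk."""
    counts = list(range(m_min, m_min + len(prob_vector)))
    return rng.choices(counts, weights=prob_vector, k=1)[0]

def confidence(transactions, antecedent, consequent):
    """confidence(A -> B) = support(A and B) / support(A), over sets of group ids."""
    a = set(antecedent)
    both = a | set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    return n_both / n_a if n_a else 0.0

rng = random.Random(0)
# Hypothetical probability vector over M_k = 3, 4, 5 rules.
m_k = sample_rule_count([0.2, 0.5, 0.3], m_min=3, rng=rng)

# Hypothetical transactions: each lists the group ids used by a well-performing TNFN.
transactions = [{1, 2, 3}, {1, 2}, {1, 3}, {2, 3}]
conf_1_to_2 = confidence(transactions, {1}, {2})
print(m_k, conf_1_to_2)
```

A rule such as "group 1 implies group 2" would be kept only if its confidence reaches the chosen threshold; in this toy data it holds in two of the three transactions that contain group 1.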

[Probability vector layout: $P_{M_{min}}$, $P_{M_{min}+1}$, …, $P_{M_k}$, …, $P_{M_{max}-1}$, $P_{M_{max}}$]

Figure 3-2: Coding the probability vector to represent the suitability of a TNFN with Mk rules.

To consider the structure of TNFN, unlike MGCSE encoding one fuzzy rule into a

population, and $M_k$ indicates that there are $M_k$ rules used in TNFN construction. In addition, PLE adopts a variable-length combination of chromosomes together with the RGLS method to construct a TNFN. Thus, the lengths of the combined chromosomes used to construct TNFNs can differ.

Figure 3-3: Structure of chromosomes to TNFN construction in PLE.

After discussing the structure of chromosomes to construct TNFNs, details of the coding step for PLE and RGLS method are described as follows:

(1) Coding Step:

The coding structure of chromosomes in the proposed PLE is shown in Fig. 3.4. This figure describes the antecedent part of a fuzzy rule that has the form in Eq. (2.5), where $m_{ij}$ and $\sigma_{ij}$ represent the mean and deviation, respectively, of the Gaussian membership function of the ith dimension and jth rule node. In addition, a pair $(m, \sigma)$ indicates a neuron in Layer 2 of a TNFN. Evolving the antecedent part of a fuzzy rule is akin to evolving a neuron, which is a parameter of a neural network. Thus, the evolution at this level is called parameter (i.e., neuron) level evolution.