Chapter 4 Image Alignment Applications
4.2.1 Learning Phase
The objective of the learning procedure is to train two TNFNs for coarse-to-fine 3D image alignment. The two major parts of the procedure are coarse alignment learning and TNFN-based surface modeling, which are described below.
(a) Coarse alignment learning
The goal of coarse alignment is to determine an approximate rigid transformation that coarsely aligns the reference model with the input point clouds. The coarse alignment must be fast and must provide a good initial transformation for the fine alignment task. Thus, a TNFN is used to learn rigid transformations within a predefined range. Once the TNFN has been trained, presenting it with a point cloud captured from an arbitrary view yields the estimated pose with respect to the reference model. Therefore, the executing phase of the TNFN is simple and efficient.
The proposed coarse alignment learning procedure involves generating synthesized training point cloud data, computing the modified viewpoint feature histogram (MVFH), and training the TNFN. These operations are introduced as follows:
(i) Generating synthesized training point cloud data
Figure 4.5 depicts the point cloud data of the reference model. The reference model is an integrated model constructed by collecting point cloud data from multiple views. To generate the synthesized training point cloud data, various combinations of translations and rotations within a predefined range are applied to the reference model. Each combination is a rigid transformation, which can be written as follows:
\[ m = R\,s + T \tag{4.6} \]
where R is a rotation matrix, T is a translation vector, s is the original set of point cloud data, and m is the transformed set of point cloud data. Furthermore, to simulate a real 3D scene, points that cannot be seen from the viewpoint direction are eliminated. Figure 4.6 presents an example of the simulated training data. As shown in this figure, the point cloud data covers only part of the reference model because the unseen points have been eliminated. After the training point data has been generated, the next operation is to extract features from the point cloud data.
Figure 4-6: Example of the simulated training data: (a) Front view and (b) Top view.
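The generation of one synthesized view described above can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the function names are hypothetical, the Z-Y-X composition of the three rotation angles is an assumed convention, and the visibility test is a simple back-face check on precomputed normals, chosen only because the text does not specify how unseen points are detected.

```python
import math

def rotation_matrix(rx, ry, rz):
    """Build R = Rz * Ry * Rx from three angles in radians (assumed convention)."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def apply_rotation(R, v):
    """Multiply the 3x3 matrix R by the 3-vector v."""
    return tuple(sum(R[r][c] * v[c] for c in range(3)) for r in range(3))

def rigid_transform(points, R, T):
    """Apply m = R s + T to every point s."""
    return [tuple(a + b for a, b in zip(apply_rotation(R, s), T)) for s in points]

def cull_back_faces(points, normals, viewpoint):
    """Keep points whose normal faces the viewpoint (assumed visibility test)."""
    kept = []
    for p, n in zip(points, normals):
        to_view = tuple(v - a for v, a in zip(viewpoint, p))
        if sum(a * b for a, b in zip(n, to_view)) > 0:
            kept.append(p)
    return kept
```

When the model is rotated, its normals must be rotated with the same R (without adding T) before the visibility check. A back-face check ignores self-occlusion, so a real implementation would likely need a stronger hidden-point test.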
(ii) Modified Viewpoint Feature Histogram
The modified viewpoint feature histogram (MVFH) is a modification of the viewpoint feature histogram (VFH), a computationally efficient 3D feature presented by Rusu et al. [94]. We first introduce VFH: this descriptor is computed by accumulating a histogram of the angles between the central viewpoint direction and each normal of the point cloud. Figure 4.7 illustrates the idea of VFH.
Figure 4-7: Creation of viewpoint feature histogram.
Suppose the central point is $V_c$ and the viewpoint is $V_p$. Then the central viewpoint direction is $V_c - V_p$. Thus, the angle θ between the central viewpoint direction ($V_c - V_p$) and the normal $n_i$ of each point $V_i$ of the point cloud can be computed by the following equation:
\[ \theta = \arccos\left( \frac{(V_c - V_p) \cdot n_i}{\lVert V_c - V_p \rVert \, \lVert n_i \rVert} \right) \tag{4.7} \]
Thereafter, the N-bin orientation histogram (each bin covers 180/N degrees) is calculated by accumulating the angles given by Eq. (4.7). Each bin is normalized by dividing its count by the total number of points, so the histogram indicates the percentage of points falling in each bin. However, in 3D surface alignment tasks, this viewpoint direction angle may not be appropriate for representing the 3D surface, because the VFH at very different view angles can yield similar features, especially for symmetrical objects viewed from angles 180 degrees apart. Figure 4.8 illustrates an example of similar VFHs at very different view angles. As shown in this figure, the object is seen from two very different viewpoints, yet the two viewpoint feature histograms are similar.
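The VFH accumulation described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of Rusu et al.: the function names are hypothetical, the normals are assumed to be precomputed, and an angle of exactly 180 degrees is placed in the last bin.

```python
import math

def angle_deg(u, v):
    """Angle between two 3D vectors, in degrees within [0, 180]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_t = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard rounding
    return math.degrees(math.acos(cos_t))

def vfh(normals, v_c, v_p, n_bins):
    """Normalized N-bin histogram of angles between (V_c - V_p) and each normal."""
    if not normals:
        raise ValueError("at least one normal is required")
    direction = tuple(c - p for c, p in zip(v_c, v_p))
    width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for n in normals:
        theta = angle_deg(direction, n)
        index = min(int(theta / width), n_bins - 1)  # 180 degrees goes in last bin
        hist[index] += 1.0
    return [count / len(normals) for count in hist]
```

Because each bin is divided by the number of points, the bins sum to one and the descriptor does not depend on the point count of a given view.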
that the 3D feature is used to identify the view angle; if the 3D feature were view independent, the captured feature would be similar in every view, making it impossible to differentiate the exact view angles of an object. For this reason, we modify the original viewpoint feature histogram by calculating another viewpoint direction related angle. We name the resulting descriptor the modified viewpoint feature histogram (MVFH). Figure 4.9 presents a diagram of the two viewpoint direction related angles, where θ is the original angle used by VFH, φ is the newly added angle used by MVFH, $V_c$ is the central point, $V_p$ is the viewpoint, and $V_i$ is a certain 3D point.
Figure 4-9: Diagram of the two viewpoint direction related angles θ and φ.
The newly added angle φ can be computed by the following equation:
Then the N-bin orientation histogram (each bin covers 180/N degrees) is computed by accumulating the angle φ, and the MVFH is completed by dividing each bin by the total number of points to normalize the histogram. To demonstrate the improvement, we recompute the MVFH for the previous example in Fig. 4.8, which has similar VFHs at very different views. Figure 4.10 depicts the computed MVFH. As shown in this figure, the first and second histograms have different shapes. This example shows that MVFH resolves the ambiguity in which very different views yield similar VFHs.
Figure 4-10: Example of modified viewpoint feature histograms at very different views.
(iii) TNFN Training
After the MVFH has been extracted from a 3D object, the MVFH serves as the input neurons of the TNFN and the desired pose serves as the output neurons. The desired pose comprises six degrees of freedom: three rotation angles (φ, ϕ, θ) and three translation parameters (x, y, z). Thus, the TNFN models the relationship between the MVFH and the desired pose.
On receiving an MVFH computed from a certain view of point clouds, the TNFN would
applying the transformation defined in Eq. (4.6). To reduce the correlations between training point clouds, the six parameters are selected randomly and independently within the predefined boundaries. After the training set has been generated, the MVFH method is used to represent the training point clouds as input features of the TNFN. Subsequently, the proposed RGLS-HCCA is adopted to train the TNFN, and training stops when the stopping condition is satisfied. Although the training phase is lengthy, the executing phase of the proposed coarse alignment method merely consists of computing the MVFH descriptor and feeding it into the TNFN to estimate the corresponding pose.
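The random, independent selection of the six pose parameters can be sketched as follows. This is a hedged illustration: the function names, the bounds, and the use of a uniform distribution are assumptions (the text states only that the parameters are selected randomly and independently within predefined boundaries), and `transform` and `describe` are hypothetical stand-ins for the rigid transformation with unseen-point removal and for the MVFH computation.

```python
import random

def sample_poses(n_samples, angle_range, shift_range, seed=0):
    """Draw n_samples poses; each is (3 rotation angles, 3 translations)."""
    rng = random.Random(seed)  # seeded so the training set is reproducible
    lo_a, hi_a = angle_range
    lo_t, hi_t = shift_range
    poses = []
    for _ in range(n_samples):
        # Each of the six parameters is drawn independently of the others.
        angles = [rng.uniform(lo_a, hi_a) for _ in range(3)]
        shifts = [rng.uniform(lo_t, hi_t) for _ in range(3)]
        poses.append(tuple(angles + shifts))
    return poses

def build_training_set(reference, poses, transform, describe):
    """Pair the descriptor of each transformed view with its generating pose."""
    return [(describe(transform(reference, pose)), pose) for pose in poses]
```

Each resulting pair (descriptor, pose) is one training sample: the descriptor is the network input and the pose is the desired output.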
(b) TNFN-based surface modeling
The purpose of the TNFN-based surface modeling is to provide an evaluation method for the fine alignment of the 3D surface. The evaluation measures how close the input point clouds are to the reference surface. Thus, the major part of the TNFN-based surface modeling is to use a TNFN to model the 3D surface as a mapping from the 3D Euclidean input space (an input 3D point (x, y, z)) to a 1D Euclidean output space (the shortest distance to the reference surface). Such a mapping can be considered a cost function that evaluates the distance between the input point clouds and the reference model. Thus, the TNFN mapping can be combined with the downhill simplex optimization method to iteratively compute the rotation matrix R and translation vector T that perform the fine alignment of the 3D surface. The details of combining the TNFN mapping with the downhill simplex optimization are discussed in the executing phase.
The procedure for modeling the 3D surface involves combining the cube model, creating the training data, and modeling the surface using the TNFN. These operations are explained below.
(i) Combining cube model
To model the reference surface, uniformly distributed point clouds are needed to prepare the training data. In this study, a cube model is generated and combined with the reference model. The cube model encloses the reference surface, and the points within the cube are sampled uniformly. Thus, the points around the reference model can serve as the training data for modeling the reference surface. Figure 4.11 depicts the locations of the cube and the reference model, where the reference model is located at the center of the cube.
Figure 4-11: Location of cube and reference model.
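A cube model of this kind can be sketched as follows. This is only an illustration: the function names and the `margin` parameter are hypothetical, and a regular grid is assumed as one way to obtain uniformly sampled points, since the text does not state the sampling scheme.

```python
def bounding_cube(points, margin):
    """Return (center, half_size) of a cube that encloses the model."""
    mins = [min(p[k] for p in points) for k in range(3)]
    maxs = [max(p[k] for p in points) for k in range(3)]
    center = [(lo + hi) / 2 for lo, hi in zip(mins, maxs)]
    # Use the longest side so that the enclosing volume is a cube.
    half = max(hi - lo for lo, hi in zip(mins, maxs)) / 2 + margin
    return center, half

def sample_cube(center, half, steps):
    """Regular grid of steps**3 points covering the cube, faces included."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    offsets = [-half + 2 * half * i / (steps - 1) for i in range(steps)]
    return [
        (center[0] + dx, center[1] + dy, center[2] + dz)
        for dx in offsets
        for dy in offsets
        for dz in offsets
    ]
```

The grid size grows with the cube of `steps`, so the resolution has to be balanced against the cost of computing distances for every grid point.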
(ii) Creation of training data
In the creation of training data, we extract the points enclosed by the cube whose distance to the reference model is no greater than a predefined value. The predefined value is set by observing the alignment error yielded by the coarse alignment. Therefore, the points (x, y, z) that satisfy
\[ \mathrm{Dist}(x, y, z) \le \text{predefined value} \tag{4.9} \]
will be used for training the TNFN. In general, the predefined value must be set sufficient
point (x,y,z) to the reference model. Thus, the surface of the reference model can be modeled using the TNFN to map the 3D coordinates of the point cloud data into the 1D distance between the cube data and the reference model. The modeling function can be written as follows:
\[ \mathrm{Dist} = f(x, y, z). \tag{4.10} \]
The total distance between the cube data and the reference model can be computed as follows:
\[ \mathrm{TotDist} = \sum_{i=1}^{N} f(x_i, y_i, z_i), \tag{4.11} \]
where N is the number of points in the cube model. Thus, when the resolution of the cube model is sufficiently high, arbitrary points inside the cube can be sent into the trained TNFN to estimate the distance between the input point clouds and the reference model.
As in coarse alignment learning, RGLS-HCCA is used to train the TNFN that models the reference surface.
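The selection rule of Eq. (4.9) and the sum of Eq. (4.11) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the distance to the reference model is approximated by a brute-force search for the nearest reference point, and `f` stands for any callable that maps (x, y, z) to a distance, such as a trained TNFN.

```python
import math

def nearest_distance(p, reference):
    """Distance from p to the closest reference point (brute force)."""
    return min(math.dist(p, q) for q in reference)

def make_training_data(cube_points, reference, max_dist):
    """Keep cube points with Dist(x, y, z) <= max_dist, paired with Dist."""
    data = []
    for p in cube_points:
        d = nearest_distance(p, reference)
        if d <= max_dist:
            data.append((p, d))  # (network input, desired output)
    return data

def total_distance(f, points):
    """TotDist: sum of f(x_i, y_i, z_i) over all given points."""
    return sum(f(x, y, z) for x, y, z in points)
```

The brute-force search costs one pass over the reference model per query point, which is the kind of per-point cost that a trained mapping f avoids in the executing phase.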