Chapter 5 The Recognition Method
5.2 The AGPV Methods
5.2.2 Calculate AGPVs
Once the axes of a character are determined, the next step is to calculate the accumulated gradient projection vectors (AGPVs) based on these axes. On the axis corresponding to each peak pk, the gradient magnitudes of pixels whose gradient orientations fall inside the range sk < θ(x,y) < ek are projected and accumulated. The axis can be any of the nature axes or augmented axes.
5.2.2.1. Projection principles
The projection axis, ηφ, with positive direction φ, is chosen from either the nature axes or the augmented axes. Fig. 5-3 illustrates the projection of a sample pixel (x, y) and of the center of gravity (COG) point of an object.
Fig. 5-3 Gradient projection of COG point and any other pixels
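Let (xcog, ycog) be the COG point of the input image, i.e.,
$$x_{cog} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad y_{cog} = \frac{1}{N}\sum_{i=1}^{N} y_i,$$
where (xi, yi) is the i-th pixel and N is the total number of pixels of a character candidate. Let the function A(x,y) denote the angle between pixel (x,y) and the x-axis, i.e.,
$$A(x,y) = \tan^{-1}\left(\frac{y}{x}\right).$$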
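The two definitions above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and `math.atan2` is an assumption chosen here because it keeps the quadrant of the angle, which the text does not specify.

```python
import math

def center_of_gravity(pixels):
    """Mean of the pixel coordinates; `pixels` is a non-empty list of (x, y)."""
    n = len(pixels)
    x_cog = sum(x for x, _ in pixels) / n
    y_cog = sum(y for _, y in pixels) / n
    return x_cog, y_cog

def angle_to_x_axis(x, y):
    """Angle A(x, y) in degrees, measured from the positive x-axis.

    Assumption: atan2 is used so that the result keeps its quadrant.
    """
    return math.degrees(math.atan2(y, x))

# Four corner pixels of a small square as a toy character candidate.
pixels = [(1, 1), (3, 1), (3, 3), (1, 3)]
cog = center_of_gravity(pixels)
angle = angle_to_x_axis(*cog)
```

For the toy square, the COG lies on the diagonal of the first quadrant, so its angle to the x-axis is 45 degrees.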
The process of projecting a character onto axis ηφ can be decomposed into three operations.
First, rotate the character by angle ∆θ = A(xcog, ycog) − φ. Second, scale the rotated pixels by a projection factor cos(∆θ). Third, translate the axis origin to the desired coordinate. Applying the process to the COG point, the coordinate of the COG point after rotation is
$$\left(\sqrt{x_{cog}^2 + y_{cog}^2}\,\cos\varphi,\ \sqrt{x_{cog}^2 + y_{cog}^2}\,\sin\varphi\right).$$
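Scaling by the projection factor cos(∆θ), it becomes
$$(x_{pcog}, y_{pcog}) = \left(\sqrt{x_{cog}^2 + y_{cog}^2}\,\cos(\Delta\theta)\cos\varphi,\ \sqrt{x_{cog}^2 + y_{cog}^2}\,\cos(\Delta\theta)\sin\varphi\right).$$
ycog) and minimizes the axis length to accumulate the projected samples.
5.2.2.2. Gradient projection accumulation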
In this section, the pre-computed gradient orientations and magnitudes are projected onto specific axes and then summed up. Only sample pixels of similar gradient orientations are projected onto the same axis. In the example of Fig. 5-4, an object O is projected onto axis η at an angle of 0 degrees. In this case, only the sample pixels with gradient orientations θ(x,y) near 0 degrees are projected onto η and accumulated.
Fig. 5-4 Accumulation of gradient projection
Depending on the axis type, there are two different cases for selecting sample pixels of similar orientations. For the nature axis corresponding to the k-th peak pk, the sample pixels whose orientation θ(x,y) lies inside the boundaries of pk, i.e., sk < θ(x,y) < ek, are projected and accumulated.
For an augmented axis with angle φ, the sample pixels with gradient orientations satisfying φ − 22.5° ≤ θ(x,y) ≤ φ + 22.5° are projected and accumulated. From (17) and (25), the projected gradient magnitude, $\hat{m}(x,y)$, and the projected distance, $\hat{l}(x,y)$, of sample pixel (x,y) onto axis ηφ are respectively
$$\hat{m}(x,y) = m(x,y) \times \cos(\theta(x,y) - \varphi), \tag{26}$$
and
$$\hat{l}(x,y) = \sqrt{(x_{proj} - x_{pcog})^2 + (y_{proj} - y_{pcog})^2}. \tag{27}$$
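A minimal Python sketch of the augmented-axis pixel selection and of Eqs. (26) and (27) is given below. It rests on several assumptions that are not stated in the text: the function names are hypothetical, angles are in degrees, the orientation test wraps around 360 degrees, and the axis is taken to pass through the coordinate origin, so the origin translation step is not modeled.

```python
import math

def project_point(x, y, phi_deg):
    """Orthogonal projection of (x, y) onto the axis through the origin
    with direction phi (assumption: the axis passes through the origin)."""
    phi = math.radians(phi_deg)
    t = x * math.cos(phi) + y * math.sin(phi)  # signed length along the axis
    return t * math.cos(phi), t * math.sin(phi)

def in_augmented_range(theta_deg, phi_deg, half_width=22.5):
    """True when theta lies within +/- half_width degrees of phi.
    Assumption: the difference is wrapped into [-180, 180)."""
    diff = (theta_deg - phi_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= half_width

def projected_magnitude(m, theta_deg, phi_deg):
    """Eq. (26): m_hat = m * cos(theta - phi)."""
    return m * math.cos(math.radians(theta_deg - phi_deg))

def projected_distance(x, y, cog, phi_deg):
    """Eq. (27): distance between the projected pixel and the projected COG."""
    x_proj, y_proj = project_point(x, y, phi_deg)
    x_pcog, y_pcog = project_point(cog[0], cog[1], phi_deg)
    return math.hypot(x_proj - x_pcog, y_proj - y_pcog)
```

With φ = 0, only the x-coordinates matter, so a pixel at (5, 7) and a COG at (2, 3) are 3 units apart on the axis.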
To accumulate the gradient projections, an empty array R(x) is created with length equal to the diagonal of the input image. Since the indexes of an array must be integers, linear interpolation is used to accumulate each gradient projection into the two nearest indexes of the array. In mathematical terms, let $b = \mathrm{floor}(\hat{l}(x,y))$ and u = b + 1, where floor(z) rounds z to the nearest integer towards minus infinity. For each sample pixel (x,y) on input image I, do the following accumulations,
$$R(b) = R(b) + \hat{m}(x,y) \times (u - \hat{l}(x,y)); \quad R(u) = R(u) + \hat{m}(x,y) \times (\hat{l}(x,y) - b). \tag{28}$$
Besides R(x), a second array, T(x), is also created to collect the overall information required for normalization. There are two differences between R(x) and T(x). First, unlike R(x), which targets only the sample pixels of similar orientation, T(x) targets all the sample pixels of a character and accumulates their gradient magnitudes. Second, R(x) accumulates the projected gradient magnitude $\hat{m}(x,y)$, while T(x) accumulates the original gradient magnitude m(x,y). Referring to eq. (28),
$$T(b) = T(b) + m(x,y) \times (u - \hat{l}(x,y)); \quad T(u) = T(u) + m(x,y) \times (\hat{l}(x,y) - b). \tag{29}$$
The purpose of T(x) is to collect the overall gradient information of the character candidate of interest for normalizing array R(x) into the desired AGPV.
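The linear-interpolation accumulation of Eqs. (28) and (29) can be sketched in Python as follows. The function names and the per-pixel tuple layout are hypothetical, and the bounds guard on the upper bin is an assumption added so that the sketch does not index past the end of the array.

```python
import math

def accumulate(array, position, weight):
    """Split `weight` between the two integer bins around `position`."""
    b = math.floor(position)
    u = b + 1
    array[b] += weight * (u - position)
    if u < len(array):  # guard for the last bin (assumption, not in the text)
        array[u] += weight * (position - b)

def build_r_and_t(samples, length):
    """Build R(x) and T(x) from per-pixel samples.

    samples: iterable of (l_hat, m, m_hat, selected) tuples, where
    `selected` says whether the pixel's orientation matches the axis.
    """
    R = [0.0] * length
    T = [0.0] * length
    for l_hat, m, m_hat, selected in samples:
        accumulate(T, l_hat, m)          # Eq. (29): every sample pixel
        if selected:
            accumulate(R, l_hat, m_hat)  # Eq. (28): similar orientation only
    return R, T

R, T = build_r_and_t([(2.25, 4.0, 4.0, True), (2.25, 8.0, 0.0, False)], 5)
```

In this toy case both pixels fall at distance 2.25, so three quarters of each weight goes to bin 2 and one quarter to bin 3; only the first pixel contributes to R.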
5.2.2.3. Normalization
The last step in finding the AGPV of an axis is to normalize the gradient projection accumulation array R(x) into a fixed-length vector. With a fixed length, the AGPVs have standard dimensionality and can easily be compared with the standard AGPVs. Before the normalization, the length of the AGPV, LAGPV, has to be determined. Depending on the complexity of the recognition targets, different AGPV lengths may be selected to describe the distribution of projected gradients. In our experiments, LAGPV is chosen as 32. A smaller LAGPV lowers the resolution and degrades the recognition rate, while a larger LAGPV slows down the system and makes no significant difference to the recognition rate. It is worth noting that an AGPV formed upon one axis is independent of the AGPVs formed upon other axes, regardless of the source character and axes.
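As a rough illustration of mapping a variable-length array onto a fixed-length vector, the Python sketch below resamples a slice of an array onto 32 evenly spaced points by linear interpolation. The interpolation scheme, the function name, and the `start` and `end` parameters (marking the portion of the array to keep) are assumptions made for this example and are not taken from the text.

```python
def resample_fixed_length(values, start, end, length=32):
    """Linearly interpolate values[start..end] onto `length` evenly spaced points."""
    if length == 1:
        return [values[start]]
    step = (end - start) / (length - 1)
    out = []
    for k in range(length):
        pos = start + k * step
        lo = min(int(pos), end)   # lower neighbour, clamped to the slice
        hi = min(lo + 1, end)     # upper neighbour, clamped to the slice
        frac = pos - lo
        out.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return out

ramp = [float(i) for i in range(10)]
short = resample_fixed_length(ramp, 2, 8, length=4)
vector = resample_fixed_length(ramp, 2, 8)
```

Whatever the length of the input slice, the output always has `length` entries, which is what makes vectors from different characters directly comparable.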
In order to reduce the impact of isolated sample pixels, which are mostly caused by noise, the array R(x) is filtered by a Gaussian filter G(x):
$$\tilde{R}(x) = R(x) * G(x), \tag{30}$$
where the operator * stands for the convolution operation. The variance of G(x) is chosen as σ = LenR/128 in the experiments, where LenR is the length of R(x). This choice was found to give a good balance between resolution and noise rejection. Similarly, the array T(x) is filtered by the same Gaussian filter to eliminate the effect of noise. After Gaussian filtering, the array T(x) is analyzed to find the effective range, i.e., the range in which the data effectively represents a character.
The effective range starts at index Xs and ends at index Xe, defined as
$$X_s = \{\, x_s,\ T(x_s) \ge th_T;\ T(x) < th_T,\ \forall x < x_s \,\}, \tag{31}$$
and
$$X_e = \{\, x_e,\ T(x_e) \ge th_T;\ T(x) < th_T,\ \forall x > x_e \,\}, \tag{32}$$
where the threshold thT is used to discard noise and is chosen as thT =Max(T(x))/32 in the experiment. The effective range of R(x) is assigned to be the same as the effective range of T(x), from Xs to Xe.
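The smoothing of Eq. (30) and the effective range of Eqs. (31) and (32) can be sketched in Python as follows. The function names are hypothetical. The kernel truncation at three times σ, the zero padding at the array borders, and the use of σ as the width parameter in exp(−k²/2σ²) are assumptions of this sketch, not details given in the text.

```python
import math

def gaussian_smooth(values, sigma):
    """Convolve `values` with a normalized Gaussian kernel (zero padding)."""
    radius = max(1, int(math.ceil(3 * sigma)))  # truncation is an assumption
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma))
              for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(values)
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < n:
                acc += w * values[idx]
        out[i] = acc
    return out

def effective_range(T, divisor=32):
    """Return (Xs, Xe): first and last indexes where T(x) >= max(T) / divisor."""
    th = max(T) / divisor
    above = [i for i, v in enumerate(T) if v >= th]
    return above[0], above[-1]

T = [0.0, 0.0, 0.1, 5.0, 8.0, 6.0, 0.2, 0.0, 0.0]
xs, xe = effective_range(T)
```

In the toy array the threshold is 8/32 = 0.25, so the small values 0.1 and 0.2 are discarded and the effective range runs from index 3 to index 5. For an array of length LenR, the smoothing would be called with `sigma = LenR / 128` to follow the choice stated in the text.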
As mentioned previously, the gradient projection accumulation results in a large sum along a straight edge. This is a good property if the character of interest is composed of straight edges.
However, some characters consist not only of straight edges but also of curves and corners, which contribute only a small amount of energy to array R(x). In order to balance the contribution of different types of edges and avoid the disturbance from noise, a threshold thR is used to adjust the content of array R(x) before normalization, yielding the adjusted array $\hat{R}(x)$.
After finding the effective range and adjusting the content of array R(x), the accumulated gradient projection vector (AGPV) is obtained by resampling $\hat{R}(x)$.
accumulation array R(x) and normalized AGPV. The example uses the same test image as Fig. 5-1 and displays only one of the nature axes, axis E. Similar to the method of finding the peaks of the orientation histogram, the k-th effective peak, epk, on R(x) is defined by R(epk) > R(epk − 1) and R(epk) > R(epk + 1). It can be observed that four effective peaks exist in Fig. 5-5(c), and each of them represents an edge projected onto axis E in Fig. 5-1(c).
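The effective-peak condition can be sketched in Python as a strict local-maximum test. The function name is hypothetical, and restricting the search to interior indexes (so that both neighbours exist) is an assumption of this sketch.

```python
def effective_peaks(R):
    """Indexes ep with R[ep] > R[ep - 1] and R[ep] > R[ep + 1]."""
    return [i for i in range(1, len(R) - 1)
            if R[i] > R[i - 1] and R[i] > R[i + 1]]

peaks = effective_peaks([0, 2, 1, 3, 5, 4, 4, 6, 0])
```

Because both comparisons are strict, a flat plateau such as the two adjacent values of 4 in the toy array is not reported as a peak.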