
Chapter 3 The Extraction Method

3.1 Profile Localization

Profile localization, like edge detection, is often applied in the first stage of an image recognition process to locate the pixels that serve as the basis for segmentation or matching. Many operators for detecting edges or corners in an image can be found in the literature, e.g., the Sobel operator[40], the Harris detector[41], and the Canny detector[42]. Most of them use gradient-based detection and suffer from difficulties in noise rejection and threshold determination. The extraction method in this work uses DOG (difference of Gaussians) functions, which minimize the impact of noise and make robust extraction possible without prior filtering.

[Flowchart figure. Blocks: profile localization; determining boundary pixels; initializing threshold; dynamic threshold propagation; thresholding (binarization); eliminating false candidates; connected component analysis.]

Fig. 3-2 An example of the SSB method

Profile localization consists of the following steps. First, the gray-level input image I(x,y) is convolved separately with two Gaussian functions, g1(x,y) with standard deviation σ1 and g2(x,y) with standard deviation σ2, to obtain two Gaussian images, I1(x,y) and I2(x,y).

The difference of the two Gaussian images, D1(x,y) = I1(x,y) − I2(x,y), is called the DOG image.
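The two convolutions and their difference can be sketched in a few lines. The following is a minimal illustration rather than the implementation used in this work: the kernel truncation at 3σ, the border replication, and the helper names (`gaussian_kernel`, `blur_rows`, `gaussian_blur`, `dog_image`) are all assumptions made for the example.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at 3*sigma (assumption), normalized to sum to 1."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating border pixels (assumption)."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
             for k in range(-radius, radius + 1))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def dog_image(image, sigma1=1.0, sigma2=2.0):
    """D1 = I1 - I2, where I1 and I2 are the two Gaussian images."""
    i1 = gaussian_blur(image, sigma1)
    i2 = gaussian_blur(image, sigma2)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(i1, i2)]

# A dark region (0) on the left and a bright region (255) on the right.
step = [[0] * 8 + [255] * 8 for _ in range(16)]
d1 = dog_image(step)
```

With this sign convention the DOG value is positive on the bright side of the edge and negative on the dark side, and it vanishes in flat regions.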

The two standard deviations, σ1 and σ2, of the two Gaussian functions are called the first and the second observation scale, respectively. A smaller observation scale captures more detail in an area but is more sensitive to noise. Conversely, a larger observation scale is more stable against noise but may lose significant details of the characters of interest, or merge them with adjacent objects so that they become difficult to extract. In the experiments we set the two scales to σ1 = 1 and σ2 = 2 for profile extraction, which the experiments showed to be a good choice for character sizes from 16×16 to 64×64 in general 256-level (8-bit) gray-level images.

[Fig. 3-2 panels: source image; profile localization (blue and red: profile pixels); determining boundary pixels (cyan: boundary pixels); thresholding.]

To handle larger characters with minimal computation time, the efficient method shown in Fig. 3-3 is applied: the second blurred image I2(x,y) is sub-sampled by taking every second pixel on each row and column to form a smaller image I2'(x,y). From I2'(x,y), the Gaussian-filtered image I3(x,y) and the corresponding DOG image D2(x,y) are computed, and the same procedure is applied again to localize the profile pixels. As a result, the observation scale with respect to D2(x,y) is double that with respect to D1(x,y).
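The sub-sampling step and the resulting doubling of scale can be illustrated as follows. This is a small sketch under assumptions: sampling starts at pixel (0, 0), and `subsample_by_two` and `scale_in_source_pixels` are hypothetical helper names. The Gaussian used to produce I3(x,y) is not reproduced here.

```python
def subsample_by_two(image):
    """Keep every second pixel on each row and column (starting at 0, an assumption)."""
    return [row[::2] for row in image[::2]]

def scale_in_source_pixels(sigma, level):
    """Express a deviation measured on the sub-sampled grid in pixels of the
    original image, after `level` rounds of sub-sampling by two."""
    return sigma * (2 ** level)

# A 6x8 ramp image stands in for the blurred image I2.
i2 = [[x + 10 * y for x in range(8)] for y in range(6)]
i2_small = subsample_by_two(i2)          # 3 rows, 4 columns

# Pixel (x', y') of the small image is pixel (2x', 2y') of the source, so a
# deviation of 1 pixel on the small grid spans 2 pixels of the source image.
doubled = scale_in_source_pixels(1.0, level=1)
```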

A 2-D DOG function used to extract the characters can be expressed as

$$\mathrm{DOG}(x,y) = \frac{1}{2\pi\sigma_1^2}\, e^{-\frac{x^2+y^2}{2\sigma_1^2}} - \frac{1}{2\pi\sigma_2^2}\, e^{-\frac{x^2+y^2}{2\sigma_2^2}}.$$

Consider a unit step edge u(x0) parallel to the y-axis at x = x0. The positions of the peak responses obtained by convolving the unit step edge with the DOG function are found by solving a differential equation, which is equivalent to solving

$$\mathrm{DOG}(x - x_0,\, 0) = 0. \qquad (4)$$

Solving (4) gives the positions of the two peak responses at [...] (5)

Fig. 3-3 The procedure to produce DOG Images on different observation scales

Plotting equation (5) for x0 = 0, as in Fig. 3-4, reveals that convolving a unit step edge with the DOG function generates two odd-symmetric peaks on either side of the edge, i.e., positive peak A and negative peak B. The most valuable characteristic of the DOG function is that these peaks are quite stable even if the test image contains small undesirable artifacts such as noise, defocus, or variable illumination. Based on this result, the DOG image is divided into three sets by a fixed global threshold th_f and its negation −th_f. The first set, Set1, is [...]

[...] identification according to the way they appear. For characters with lower gray-level intensity (darker color) than the nearby background, Set1 is also called the inner profile set because it lies along the inside of the characters' boundaries. Similarly, Set2 is also called the outer profile set because of where it appears. The two profile sets are drawn in Fig. 3-5 in blue and yellow, respectively.

The global threshold th_f determines whether a change of intensity is attributed to noise or to a real edge. A smaller threshold collects more pixels into Set1 and Set2 and requires more computation time to deal with noise before the characters are extracted. It is worth noting that the lowest threshold for the DOG function can be set to th_f = 0. Although a zero threshold admits much noise-generated information, it still yields correct extraction results, because the energy of noise in the DOG response is automatically suppressed near an edge. It is therefore recommended to set a small threshold, e.g., th_f = 1, for all input images: this gives reliable results with reasonable computation time regardless of the condition of the input image. Unlike some other gradient operators, which may lose character candidates when a smaller threshold is given, the only drawback of a smaller threshold for the DOG function is longer computation time. Various simulation results show that a wide range of thresholds on DOG images still gives reliable localization of the profile pixels.
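The split of a DOG image by th_f and −th_f can be sketched as below. This is an illustrative sketch only: the function name `split_by_threshold` and the use of strict inequalities are assumptions, and the code does not decide which of the two signed sets plays the role of Set1 or Set2.

```python
def split_by_threshold(dog, th_f=1.0):
    """Return (positive, negative, neutral) coordinate sets of a DOG image.

    positive: D(x, y) >  th_f
    negative: D(x, y) < -th_f
    neutral : all remaining pixels, i.e. |D(x, y)| <= th_f
    The strict inequalities are an assumption of this sketch.
    """
    positive, negative, neutral = set(), set(), set()
    for y, row in enumerate(dog):
        for x, value in enumerate(row):
            if value > th_f:
                positive.add((x, y))
            elif value < -th_f:
                negative.add((x, y))
            else:
                neutral.add((x, y))
    return positive, negative, neutral

# One row of DOG values crossing an edge.
dog_row = [[-3.0, -0.5, 0.0, 0.5, 3.0]]
pos, neg, neu = split_by_threshold(dog_row, th_f=1.0)
pos0, neg0, neu0 = split_by_threshold(dog_row, th_f=0.0)
```

Lowering th_f from 1 to 0 moves the two weak responses (±0.5) out of the neutral set and into the signed sets, which is the trade-off the paragraph describes: more candidate pixels, hence more computation.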

Fig. 3-4 An ideal unit-step edge (upper graph) and its DOG response (lower graph)
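The shape of this response can be reproduced numerically. The sketch below works on the 1-D cross-section perpendicular to the edge and assumes Gaussians normalized to unit area, so that blurring a unit step gives the normal cumulative distribution function; the function names are hypothetical, and the peak positions it finds depend on that normalization assumption.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def dog_step_response(x, x0=0.0, sigma1=1.0, sigma2=2.0):
    """Unit step edge at x0 blurred with sigma1, minus the same edge blurred
    with sigma2, evaluated at position x."""
    return normal_cdf((x - x0) / sigma1) - normal_cdf((x - x0) / sigma2)

# Sample the response on a fine grid around the edge at x0 = 0.
xs = [k / 100.0 for k in range(-800, 801)]
response = [dog_step_response(x) for x in xs]

peak_a = xs[response.index(max(response))]   # positive peak, on the bright side
peak_b = xs[response.index(min(response))]   # negative peak, on the dark side
at_edge = dog_step_response(0.0)             # zero crossing at the edge itself
```

The response is odd-symmetric about the edge: the two peaks sit at equal distances on either side of x0 and the response is zero at the edge.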

When a near-perfect input image such as Fig. 3-6(a) is given for binarization, the first step is to find the two corresponding profile sets from the DOG image, as in Fig. 3-6(b). Note that the pixels of the inner profile set often appear in connected groups, which are called inner profile groups or simply profile groups. As shown in Fig. 3-6(c), the smallest rectangle covering an entire profile group is called the bounding rectangle of that profile group. In the normal case a profile group represents the profile of an isolated character. However, a character may be broken into two or more profile groups because of its geometry, noise, or special lighting conditions. The broken profile groups are linked later by connected component analysis to recover the original characters.
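Grouping inner profile pixels and taking the rectangle around each group can be sketched as follows. This is a minimal illustration, assuming 8-connectivity and a set of (x, y) coordinates as input; `profile_groups` and `bounding_rectangle` are hypothetical names, not taken from this work.

```python
from collections import deque

def profile_groups(pixels):
    """Split a set of (x, y) pixels into 8-connected groups (breadth-first search)."""
    remaining = set(pixels)
    groups = []
    while remaining:
        seed = remaining.pop()
        group, queue = {seed}, deque([seed])
        while queue:
            x, y = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (x + dx, y + dy)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        group.add(neighbour)
                        queue.append(neighbour)
        groups.append(group)
    return groups

def bounding_rectangle(group):
    """Smallest axis-aligned rectangle (x_min, y_min, x_max, y_max) covering the group."""
    xs = [x for x, _ in group]
    ys = [y for _, y in group]
    return min(xs), min(ys), max(xs), max(ys)

# Two separate groups of inner profile pixels.
inner = {(1, 1), (2, 1), (2, 2), (7, 5), (8, 6)}
rectangles = sorted(bounding_rectangle(g) for g in profile_groups(inner))
```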

According to (5), a constant R_eff is defined to represent the radius of the effective area of an edge (intensity change), and [...]

where the function ceil(x) rounds x towards positive infinity. Note that R_eff is the horizontal distance AC or BC in Fig. 3-4, or equivalently the radius of the circle of effective range in Fig. 3-5.

In addition to the profile sets, a boundary set SetB is formed to represent the boundary of character candidates. A pixel p_b is collected into SetB if it satisfies the following two conditions:

1. Except for the zero-crossing pixels, i.e., position C in Fig. 3-4 or the non-profile pixels in Fig. 3-5, the pixels inside the effective area of p_b belong to either the inner or the outer profile set.

2. Inside the effective area of p_b, the number of pixels belonging to the inner profile set equals the number belonging to the outer profile set.
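The two conditions can be checked per pixel roughly as sketched below. This is a hedged illustration, not the test used in this work: the label encoding, the `tolerance` parameter, the rule used to recognize a zero-crossing pixel, the rejection of discs cut by the image border, and the value R_eff = 2 in the example are all assumptions introduced here.

```python
INNER, OUTER, NONE = 1, 2, 0   # Set1, Set2, non-profile (encoding is an assumption)

def is_zero_crossing(labels, x, y):
    """Hypothetical rule: a non-profile pixel counts as a zero-crossing pixel if
    its 8-neighbourhood holds both an inner and an outer profile pixel."""
    height, width = len(labels), len(labels[0])
    seen = set()
    for ny in range(max(y - 1, 0), min(y + 2, height)):
        for nx in range(max(x - 1, 0), min(x + 2, width)):
            seen.add(labels[ny][nx])
    return INNER in seen and OUTER in seen

def is_boundary_pixel(labels, px, py, r_eff, tolerance=0):
    """Check both conditions inside the disc of radius r_eff around (px, py)."""
    height, width = len(labels), len(labels[0])
    bin1 = bin2 = 0
    for dy in range(-r_eff, r_eff + 1):
        for dx in range(-r_eff, r_eff + 1):
            if dx * dx + dy * dy > r_eff * r_eff:
                continue
            x, y = px + dx, py + dy
            if not (0 <= x < width and 0 <= y < height):
                return False      # assumption: discs cut by the border are rejected
            label = labels[y][x]
            if label == INNER:
                bin1 += 1
            elif label == OUTER:
                bin2 += 1
            elif not is_zero_crossing(labels, x, y):
                return False      # condition 1 violated
    # Condition 2: equal counts, up to the assumed tolerance.
    return bin1 > 0 and abs(bin1 - bin2) <= tolerance

# Vertical edge: inner profile on the left, one zero-crossing column, outer on the right.
labels = [[INNER] * 4 + [NONE] + [OUTER] * 4 for _ in range(9)]
on_edge = is_boundary_pixel(labels, 4, 4, r_eff=2)     # centred on the zero crossing
off_edge = is_boundary_pixel(labels, 3, 4, r_eff=2)    # one pixel inside the character
```

In this toy label map the pixel on the zero-crossing column sees four inner and four outer pixels in its disc and passes, while the pixel one column to the left sees an unbalanced count and fails.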

Fig. 3-5 Determine boundary pixels

Fig. 3-6 (a) A perfect sample character image. (b) The DOG responses: positive response in red and negative response in blue. (c) The inner profile set in red and the bounding rectangle in gray. (d) Boundary set. (e) Extraction result.

In the implementation, to account for discrete pixel coordinates and error tolerance, the pixels of Set1 and Set2 inside the effective area of p_b are accumulated into Bin1 and Bin2 respectively, and p_b is collected into SetB if it satisfies the following equations:

[...]

[...] extracted as in Fig. 3-6(e) after dynamic threshold propagation and binarization.

[Fig. 3-5 legend: effective range; boundary pixel; non-profile pixels; inner profile pixel (Set1); outer profile pixel (Set2).]

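From the document: Application of Scale-Space Binarization and Accumulated Gradient Projection to License Plate Character Extraction and Recognition (pp. 23-30)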