Multi-ring Mark: Overlap Improvement

Chapter 3 Structure and Algorithms

3.3 Light-mark Method

3.3.2 Multi-ring Mark: Overlap Improvement

The multi-ring-mark algorithm is intended to improve user recognition when marks overlap. The flow chart of the algorithm is shown in Fig. 46. The projected light marks are detected by the embedded optical sensors and converted into a gray-level image. Noise suppression then reduces the system noise and the effect of ambient light. The algorithm has two main stages: the first locates the 3D coordinate (x, y, and z) of each user by extracting the out-mark feature, and the second differentiates users through the in-mark character.

Fig. 46. Flow chart of Multi-ring mark algorithm.

Position tracking: In the first stage, our goal is to determine the 3D coordinates (x, y, and z) of the users. To this end, five image processing steps are applied in turn: image enhancement, thinning, ring filtering, peak detection (candidate detection), and iteration peak elimination (iteration removal). Each step is discussed in detail below.

Image enhancement process is executed after noise suppression, so that only the main information, namely the projected light marks, is reinforced. Because light decays with distance, a higher object produces a vaguer captured shape. In the image enhancement process we therefore convert the captured image to a binary image: pixels with a gray level larger than the threshold are reassigned to one, and all others are reassigned to zero. This also eliminates the intensity differences that result from the decay of light at different heights and from the non-uniform sensitivity of the display. Additionally, the binary output meets the requirement of the next step, the thinning operation, whose input must be a binary image.
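
The thresholding rule described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function name `binarize`, the sample gray levels, and the threshold value 100 are all assumptions made for the example.

```python
def binarize(gray, threshold):
    """Return a binary image: 1 where the gray level exceeds threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in gray]


# Made-up 8-bit gray levels; the threshold of 100 is an assumed value.
gray = [
    [12, 30, 18],
    [25, 180, 140],
    [10, 101, 100],
]
binary = binarize(gray, threshold=100)
```

Because the comparison is strict, a pixel exactly at the threshold is assigned to the background, which follows the "larger than the threshold" wording of the text.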

Thinning operation [35] is used to reduce the line width of the ring mark to single-pixel thickness. The thinning of an image A by a sequence of structuring elements {B} = {B¹, B², …, Bⁿ}, denoted A ⊗ {B} [36], can be defined as

A ⊗ {B} = ((… ((A ⊗ B¹) ⊗ B²) … ) ⊗ Bⁿ)          Eq. (2)

In every term, the thinning operation is calculated by translating the origin of the structuring element, as shown in Fig. 47, to each point in the image and comparing it with the underlying image pixels. If the foreground (one) and background (zero) pixels in the structuring element exactly match those in the image, the image pixel underneath the origin of the structuring element is set to background (zero); otherwise it is left unchanged. After a single term of the thinning operation over the image, the operator is applied repeatedly until convergence, that is, until the image no longer changes. The thinning operation reduces the line width to one pixel, as shown in Fig. 48.
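
A minimal sketch of the iteration in Eq. (2) is given below. The structuring elements of Fig. 47 are not reproduced here, so the sketch assumes the common textbook set: one edge element, one corner element, and their 90-degree rotations, with `None` marking don't-care positions. The function names, the treatment of pixels outside the image as background, and the sample bar image are likewise assumptions for illustration.

```python
# Assumed structuring elements (None = don't care); Fig. 47 may differ.
B_EDGE = [[0, 0, 0], [None, 1, None], [1, 1, 1]]
B_CORNER = [[None, 0, 0], [1, 1, 0], [None, 1, None]]


def rotate90(elem):
    """Rotate a 3x3 structuring element clockwise by 90 degrees."""
    return [[elem[2 - c][r] for c in range(3)] for r in range(3)]


def structuring_elements():
    """Build the sequence B1..B8 from the two base elements."""
    elems, a, b = [], B_EDGE, B_CORNER
    for _ in range(4):
        elems += [a, b]
        a, b = rotate90(a), rotate90(b)
    return elems


def matches(img, y, x, elem):
    """True if the element, centred on (y, x), matches the image exactly."""
    h, w = len(img), len(img[0])
    for dy in range(3):
        for dx in range(3):
            want = elem[dy][dx]
            if want is None:
                continue
            yy, xx = y + dy - 1, x + dx - 1
            got = img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
            if got != want:
                return False
    return True


def thin_once(img, elem):
    """Set every matching pixel to background, all decided on the input image."""
    h, w = len(img), len(img[0])
    return [[0 if img[y][x] == 1 and matches(img, y, x, elem) else img[y][x]
             for x in range(w)] for y in range(h)]


def thin(img):
    """Apply B1, B2, ..., Bn in turn and repeat until nothing changes."""
    elems = structuring_elements()
    while True:
        before = img
        for elem in elems:
            img = thin_once(img, elem)
        if img == before:
            return img


# A made-up 3-pixel-thick horizontal bar inside a 5x8 image.
bar = [[1 if 1 <= y <= 3 and 1 <= x <= 6 else 0 for x in range(8)]
       for y in range(5)]
thinned = thin(bar)
```

Each call to `thin_once` corresponds to one ⊗ in Eq. (2), and one pass of the inner loop corresponds to one term. The loop terminates because every term either removes at least one pixel or leaves the image unchanged.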

Fig. 47. Structuring elements for morphological thinning. Ones and zeros stand for foreground and background pixels, and blanks are don't-care positions that can be either one or zero. At each term, the image is first thinned by B¹, then B², and so on. The process is repeated until none of the thinning steps produces any further change.

Fig. 48. Light pens are positioned at 50 mm and detected by the embedded optical sensors. The output images after the image enhancement step and after the thinning operation, from the 1st term until convergence, are shown.

Ring Filtering determines the possible 3D coordinates (x, y, and z) of users, called candidates. Ring filters are constructed according to the size of the out-mark at different heights, as demonstrated in Fig. 49. The origin of each ring filter is translated to every point in the image, and the ring filter is convolved with the underlying image pixels: the number of foreground pixels in the ring filter that match foreground pixels in the image is accumulated at the pixel underneath the origin of the ring filter.
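
The accumulation performed by one ring filter can be sketched as below. The mapping from height z to out-mark radius is not given in the text, so the sketch takes the radius in pixels as a parameter. The way the ring is rasterized (`ring_offsets`) and the synthetic test image are assumptions for illustration.

```python
import math


def ring_offsets(radius):
    """Offsets (dy, dx) of a one-pixel-wide ring of the given radius."""
    return [(dy, dx)
            for dy in range(-radius - 1, radius + 2)
            for dx in range(-radius - 1, radius + 2)
            if round(math.hypot(dy, dx)) == radius]


def ring_filter(img, radius):
    """For each pixel, count ring-filter pixels that land on foreground."""
    h, w = len(img), len(img[0])
    offsets = ring_offsets(radius)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(
                1 for dy, dx in offsets
                if 0 <= y + dy < h and 0 <= x + dx < w
                and img[y + dy][x + dx] == 1)
    return out


# Synthetic thinned ring of radius 3 centred at (row 5, column 5).
image = [[0] * 11 for _ in range(11)]
for dy, dx in ring_offsets(3):
    image[5 + dy][5 + dx] = 1

response = ring_filter(image, 3)
peak = max(max(row) for row in response)
```

When the filter radius equals the radius of the ring in the image, the response at the ring center equals the number of pixels in the filter, and every other position scores lower. Running the same image through filters built for other heights gives lower peaks, which is what the following step relies on.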

Fig. 49. (a) Ring filters are constructed according to the size of the out-mark at different heights, and (b) the structuring element of the ring filter at z=0.

Candidate Detection locates the possible 2D coordinates (x and y) and at the same time determines the depth value (z) from the ring filtering results. A series of normalizing factors is constructed by accumulating thinned ring marks at different heights, and each ring-filtered image is divided by the corresponding normalizing factor. The center of a ring mark therefore takes a value close to 1, because the accumulation of the captured ring mark is similar to the normalizing factor. An example of the normalized accumulation within each ring filter is shown in Fig. 50. A local maximum, that is, a pixel greater than its 8-connected neighbors, whose value is close to 1 is regarded as a candidate. However, when the number of candidates that meet these requirements is larger than the number of users, some erroneous or repeated candidates have been detected. In Fig. 50, 4 candidates have been found. Hence, in the next step, we remove the redundant candidates.
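
A minimal sketch of this step follows. The text does not state how close to 1 a value must be, so the cut-off `min_score=0.8` is an assumed parameter. The dictionary layout (`filtered[z]` and `norm[z]`) and the small sample arrays are hypothetical as well.

```python
def find_candidates(filtered, norm, min_score=0.8):
    """Return (x, y, z, score) for strict 8-neighbour maxima with score near 1.

    filtered[z] is the ring-filtered image for height z and norm[z] is its
    normalizing factor; min_score is an assumed closeness threshold.
    """
    candidates = []
    for z, resp in filtered.items():
        h, w = len(resp), len(resp[0])
        for y in range(h):
            for x in range(w):
                value = resp[y][x]
                neighbours = [
                    resp[yy][xx]
                    for yy in range(max(0, y - 1), min(h, y + 2))
                    for xx in range(max(0, x - 1), min(w, x + 2))
                    if (yy, xx) != (y, x)]
                score = value / norm[z]
                if all(value > n for n in neighbours) and score >= min_score:
                    candidates.append((x, y, z, score))
    return candidates


# Made-up ring-filter outputs for two heights.
filtered = {
    0: [[0, 1, 0], [1, 8, 1], [0, 1, 0]],
    10: [[0, 0, 0], [0, 3, 0], [0, 0, 0]],
}
norm = {0: 8, 10: 12}
found = find_candidates(filtered, norm)
```

In this example the peak at z=0 has a normalized value of 1 and is kept, while the peak at z=10 is a local maximum but only reaches 0.25 and is rejected.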

Fig. 50. In candidate detection, the accumulation results of the different ring filters (z) are divided by the corresponding normalizing factors. Local maxima with values close to 1 are defined as candidates.

Iteration Removal process is presented to eliminate the redundant candidates. If the distance between two candidates is smaller than the corresponding in-mark window, which is discussed below under in-mark acquisition, the candidates might actually represent the same ring, where only one user exists. Hence, the candidate with the value closest to 1 is kept while the other is removed. In the example in Fig. 50, the 2D coordinates (x and y) of the candidates in the ring-filtered images at z=25 and z=30 are almost the same. Therefore, after the iteration removal process, only the candidates at z=0, 10, and 30 remain, and the 3D coordinates (x, y, and z) of the users are obtained.
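
The removal rule can be sketched as a greedy selection. Several details are assumptions: the text does not say which of the two candidates' windows sets the distance limit (the smaller one is used here), the `window_size` function is a hypothetical stand-in for the z-dependent in-mark window of Fig. 51, and the candidate coordinates and scores are invented to mirror the z=0, 10, 25, 30 example.

```python
import math


def remove_duplicates(candidates, window_size):
    """Keep, among candidates closer than the in-mark window, the best one.

    candidates are (x, y, z, score) tuples; window_size(z) is a hypothetical
    function giving the in-mark window size in pixels for height z.
    """
    kept = []
    # Visit candidates from the score closest to 1 to the farthest.
    for cand in sorted(candidates, key=lambda c: abs(1 - c[3])):
        x, y, z, _ = cand
        if all(math.hypot(x - kx, y - ky) >= min(window_size(z), window_size(kz))
               for kx, ky, kz, _ in kept):
            kept.append(cand)
    return kept


# Invented candidates: the last two lie almost on top of each other.
candidates = [
    (10, 10, 0, 0.97),
    (40, 12, 10, 0.95),
    (70, 30, 25, 0.85),
    (71, 30, 30, 0.98),
]
users = remove_duplicates(candidates, window_size=lambda z: 6 + 0.2 * z)
```

With these values the candidate at z=25 is dropped in favor of the one at z=30, whose score is closer to 1, and the candidates at z=0 and z=10 are untouched.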

User identification: Once the 3D coordinates of the users are found, the next stage is to assign each coordinate to the corresponding user.

In-mark acquisition is the step that extracts the characteristic mark in order to achieve user identification. As shown in Fig. 51, the size of the in-mark window is also a function of z. Thus, knowing the 3D coordinate (x, y, and z) of the user, the origin of the in-mark window for that z is positioned on the image at the corresponding 2D coordinate (x and y). The user character can therefore be obtained without being affected by the outer ring mark.
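
As an illustration, the accumulation inside an in-mark window can be sketched as below. The window is assumed to be a square of half-width `half_width` centered on the candidate. The actual window shape and its dependence on z are defined by Fig. 51 and are not reproduced here, and the sample image is made up.

```python
def in_mark_sum(thinned, x, y, half_width):
    """Count foreground pixels in a square window centred at (x, y).

    half_width stands in for the z-dependent window size (assumed square).
    """
    h, w = len(thinned), len(thinned[0])
    return sum(
        thinned[yy][xx]
        for yy in range(max(0, y - half_width), min(h, y + half_width + 1))
        for xx in range(max(0, x - half_width), min(w, x + half_width + 1)))


# Made-up 7x7 thinned image: an outer outline plus a single in-mark point.
size = 7
thinned = [[1 if y in (0, size - 1) or x in (0, size - 1) else 0
            for x in range(size)] for y in range(size)]
thinned[3][3] = 1

inner = in_mark_sum(thinned, x=3, y=3, half_width=1)
whole = in_mark_sum(thinned, x=3, y=3, half_width=3)
```

A window that stays inside the outer mark counts only the in-mark pixel, whereas a window that is too large also picks up the outer outline, which is why the window size has to follow z.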

Fig. 51. In-mark window for in-mark extraction.

Ranking process is the last step, which matches the 3D coordinates to the users. After in-mark acquisition, users can be identified easily by ranking the accumulation of the in-marks, without normalizing the windows to the same size. Owing to the earlier thinning operation, a solid circle converges to a single point, while a ring remains a ring whose radius equals the average radius of the original ring. As demonstrated in Fig. 52, User 1, with a blank in-mark, is recognized by the minimum in-mark ranking. User 2, with a solid circle as in-mark, has the middle ranking because of the thinning operation. User 3, with a ring, is identified by the maximum in-mark ranking. The 3D coordinates are thereby assigned to the corresponding users.
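
The ranking rule can be sketched in a few lines. The function name `assign_users`, the coordinates, and the in-mark sums below are invented for illustration. The sketch assumes one candidate per user and distinct in-mark sums.

```python
def assign_users(coords, in_mark_sums):
    """Map user number to 3D coordinate by ascending in-mark accumulation.

    User 1 gets the smallest sum (blank in-mark), the last user the largest.
    """
    order = sorted(range(len(coords)), key=lambda i: in_mark_sums[i])
    return {rank + 1: coords[i] for rank, i in enumerate(order)}


# Invented example: a thinned ring (14 pixels), a blank in-mark (0 pixels),
# and a solid circle thinned down to a single point (1 pixel).
coords = [(10, 10, 0), (40, 12, 10), (71, 30, 30)]
in_mark_sums = [14, 0, 1]
assignment = assign_users(coords, in_mark_sums)
```

Only the order of the sums matters, so windows of different sizes can be compared directly, in line with the remark that no normalization to a common window size is needed.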

Fig. 52. The ranking process distinguishes the different users.

Source document: Development of a Multi-mark Algorithm to Improve User Recognition in 3D Multi-touch Interactive Systems (pp. 55-61)