
Chapter 5 A New Data Hiding Technique via Message-rich Character Image for Automatic Identification and Data Capture Applications

5.4 Message Extraction

5.4.4 Message extraction algorithm

A detailed message extraction algorithm is as follows.

Algorithm 5.2. Message extraction.

Input: a captured version Id of a message-rich character image.

Output: the message M embedded originally in Id.

Steps:

Stage 1 - localizing the message-rich character image.

Step 1. Find the largest non-white quadrangle Q in Id by the Hough transform and polygonal approximation.

Stage 2 - correcting geometric distortion.

Step 2. Perform an inverse perspective transform on Q to correct the perspective distortion and take the Y-component of Q as the captured modulated message image IM′′.

Stage 3 - identifying blocks in the message image.

Step 3. Compute the gradient value gxy of each pixel Rxy in IM′′ according to (24).

Step 4. For each possible value nj of NS, compute the average gradient value AGnj of the pixels on each candidate splitting line according to (26).

Step 5. Select the value nj yielding the largest AGnj for use as the desired number NS of blocks of IM′′ in the horizontal or vertical direction; and divide IM′′ accordingly into blocks.

Stage 4 - binarizing the blocks to extract the message.

Step 6. Binarize each block Fi' by moment-preserving thresholding [73].

Step 7. Group every four mutually-connected binarized blocks to form character images Ici′.

Step 8. Extract the corresponding character from each character image Ici′ by the following steps.

(a) Compute the similarity degree sdij between each Ici′ and each reference character image Icj according to (27).

(b) Select the character with the largest similarity sdij as the message delivered by Ici′.

Step 9. Concatenate the extracted characters to get the embedded message M.
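The block-number identification of Steps 3 through 5 can be sketched in Python as below. Since Eqs. (24) and (26) are not reproduced in this section, plain finite-difference gradients and a simple per-line average stand in for them; the candidate list of block counts is likewise an illustrative assumption.

```python
import numpy as np

def estimate_block_count(im, candidates=(16, 24, 32, 48)):
    """Sketch of Steps 3-5: pick the block count NS whose candidate
    splitting lines have the largest average gradient.  The exact
    gradient and averaging formulas, Eqs. (24) and (26), are not
    reproduced here; simple finite differences stand in for them."""
    im = np.asarray(im, dtype=float)
    gy, gx = np.gradient(im)                 # per-pixel gradient components
    g = np.hypot(gx, gy)                     # gradient magnitude g_xy
    h, w = im.shape
    best_n, best_avg = None, -1.0
    for n in candidates:                     # each possible value nj of NS
        cols = [round(k * w / n) for k in range(1, n)]  # candidate splitting lines
        rows = [round(k * h / n) for k in range(1, n)]
        vals = np.concatenate([g[:, cols].ravel(), g[rows, :].ravel()])
        avg = vals.mean()                    # AG_nj
        if avg > best_avg:
            best_n, best_avg = n, avg
    return best_n
```

When the candidate splitting lines of some nj coincide with the true block boundaries, the average gradient along them peaks, which is exactly the cue Step 5 exploits.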

5.5 Experimental Results

The proposed system was developed using Microsoft C#.NET, and the generated message-rich character images were captured with an iPhone 4S. The character image database includes the printable characters of the ASCII codes. A series of experiments using different parameters has been conducted and the corresponding statistics plotted (in Figure 5.8) to show the accuracy rates of extracted message characters. The varied parameters include:

(1) the contrast threshold ε for the minimum difference between r1 and r2; and (2) the number of blocks NS in the horizontal or vertical direction of the message image.

Figures 5.7(a) through 5.7(c) show three test target images used in the experiments. The corresponding message-rich character images generated with parameters NS = 32 and ε = 40 are shown in Figures 5.7(d) through 5.7(f). These images were all printed to be of the same size of 127×127 mm.

One of the parameters that influence the accuracy of the extracted message is the contrast threshold ε for the minimum difference between r1 and r2. If ε is too small, the two representative values r1 and r2 will be too close, so that the extracted message might be wrong. For example, Figure 5.8(a) shows the accuracy rates of the extracted messages with ε = 0, 20, 40, and 60, from which it can be seen that the larger the value of ε, the higher the accuracy rate of the extracted message; when ε is larger than 40, an accuracy rate of 100% is yielded.


Figure 5.7. Created message-rich character images. (a)-(c) Test target images. (d)-(f) Resulting message-rich character images with NS = 32 and ε = 40.

It can also be seen from Figure 5.8(b) that the larger the value of ε, the larger the RMSE of the resulting message-rich character image with respect to the target image.

So there is a tradeoff between achieving higher message extraction accuracy and obtaining a better visual quality in the resulting message-rich character image.

Another parameter that influences the accuracy of the extracted message is the number NS of blocks in the horizontal or vertical direction of the message image. The larger the value of NS, the larger the character capacity of the message image; however, the larger the value of NS, the smaller the size of each block, and so the lower the accuracy of the extracted message, as can be seen from Figure 5.8(c).

Finally, to show the robustness of the proposed method, we have conducted some attacks on the created message-rich character images. For example, Figures 5.9(a) and 5.9(b) show two attacked versions of the message-rich character image shown in Figure 5.7(f) with the message “ComputerVisionLab” injected. The experimental results show that the carried message can still be extracted from either of these two attacked images. In addition, by regarding image taking from a display screen as a type of attack, a third attacked version so acquired is shown in Figure 5.9(c). The resulting message extraction rate is 96.88%, which, though not 100%, means that the proposed method can handle message carriers other than paper copies.


Figure 5.8. Plots of trends of results using various parameters. (a) Accuracy rates of extracted messages with different contrast threshold values ε, with #blocks NS = 16. (b) RMSE values of created message-rich character images with respect to target images for different contrast threshold values ε, with #blocks NS = 16. (c) Accuracy rates of extracted messages with different #blocks NS, where the contrast threshold value ε = 40.


Figure 5.9. Robustness of proposed method. (a) A captured message-rich character image under defacement attack. (b) A captured message-rich character image under another defacement attack. (c) A message-rich character image captured from a monitor screen.

5.6 Summary

A new data hiding technique for AIDC applications via message-rich character image has been proposed; the image is created from a target image for use as a carrier of a given message. The artistic flavor of the target image is kept in the created image, achieving pervasive communication. Compared with other AIDC tools like QR codes and hardcopy image barcodes, the proposed message-rich character image has several merits: (1) the image can not only be printed on paper but also be displayed on screens for various uses; (2) the image can endure more distortions, like perspective transformation, noise, screen blurring, etc.; (3) the message can be extracted from an image captured by a mobile phone (this is not the case for the hardcopy image barcode [17]-[19]); (4) by utilizing the power of OCR, the image can endure more serious attacks, such as partial defacement and image taking from screens (again, this is not the case for the hardcopy image barcode); (5) if message extraction from the message image by machine is not necessary, humans can still read the information appearing in the extracted message image, because it is composed of characters and so is meaningful and readable. Experimental results show the feasibility of the proposed method.

Chapter 6

A New Data Hiding Technique via Message-rich Code Image for Automatic Identification and Data Capture Applications

6.1 Introduction

In Chapter 5, we proposed a new data hiding technique via message-rich character image for automatic identification and data capture to realize pervasive communication on hard copies of images. It is created from a target image used as a carrier of a given message by fragmenting the shapes of the composing characters of the message and “injecting” the resulting character fragments randomly into the target image by a block luminance modulation scheme. Each message-rich character image so created has the visual appearance of the corresponding pre-selected target image while conventional barcodes [20]-[23] do not. Also, the data embedded by the method presented in Chapter 5 can be extracted from a “camera-captured” version of the created message-rich character image while those embedded by the use of the aforementioned hardcopy data hiding methods using image barcodes cannot. The function may be implemented on a mobile device.

However, as shown in Figure 6.1(b), each message-rich character image generated by the method in Chapter 5 contains many small character fragments with undesired visual effects. Also, it requires an optical character recognition scheme to extract the embedded message, which is usually time-consuming. Furthermore, the size of each block cannot be too small, in order to keep the resolution in the captured image sufficiently good for correct extraction of the character shapes in the image. To solve these problems, another new type of message-rich multimedia, called message-rich code image, is proposed in this study. Specifically, instead of transforming the given message to be embedded into a character message image, the message is first converted, in the sense of data coding, into a bit stream of codes, which is then represented by binary pattern blocks, each composed of 2×2 unit blocks. A block luminance modulation scheme is then applied to each pattern block to yield a message-rich code image with the visual appearance of a pre-selected target image. An example of the resulting message-rich code image is shown in Figure 6.1(c), which is more pleasing than the message-rich character image shown in Figure 6.1(b) generated by the method in Chapter 5. A more detailed comparison with the method in Chapter 5 by experiments reveals the following additional merits of the proposed method in this chapter: (1) the yielded message-rich code image has a much better visual appearance of the target image; (2) the accuracy rate of message extraction from the generated code image is higher; (3) the message extraction speed is higher.


Figure 6.1. Examples of message-rich images yielded by the method in Chapter 5 and proposed method. (a) Target image. (b) Message-rich character image created by the method in Chapter 5. (c) Message-rich code image created by proposed method.

6.2 Idea of Proposed Method

The proposed method includes two main phases of work, as illustrated in Figure 6.2: 1) message-rich code image generation; and 2) message extraction. In the first phase, given a target image IT and a message M, a message-rich code image IC is created by four major steps.

Stage 1-1 - transform message M into a bit stream B of codes;

Stage 1-2 - transform every three bits of B into four bits and represent them by a binary pattern block, resulting in a pattern image IP;

Stage 1-3 - modulate each pattern block Ti of IP by two representative values calculated from the Y-channel values of the corresponding block Bi of target image IT, yielding a modulated pattern image IP';

Stage 1-4 - replace the Y-channel of target image IT with IP' to get a message-rich code image IC as the output.

In the second phase, given a camera-captured version IC' of a paper or display copy of the message-rich code image IC, the message M is extracted from IC' by four major steps.

Stage 2-1 - localize the region IC'' of the original part of the message-rich code image IC in IC';

Stage 2-2 - correct the geometric distortion in IC'' incurred in the image acquisition process;

Stage 2-3 - identify the unit blocks in IC'' automatically and divide IC'' accordingly into pattern blocks, each with 2×2 unit blocks;

Stage 2-4 - binarize each pattern block of IC'', recognize the result to extract the bits embedded in it, compose all the extracted bits to form a bit stream B, and transform B reversely to get the message M.
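Stage 2-4 can be illustrated with a minimal sketch for a single pattern block. The thesis binarizes blocks by moment-preserving thresholding; a simple mean threshold is substituted here for brevity, and the unit blocks are read in raster-scan order with the fourth (expansion) bit discarded.

```python
import numpy as np

def decode_pattern_block(block):
    """Sketch of Stage 2-4 for one captured pattern block: classify the
    pixels into dark/bright by the block mean (a stand-in for the
    moment-preserving thresholding actually used), read one bit per 2x2
    unit block (0 = black, 1 = white), and drop the expansion bit."""
    block = np.asarray(block, dtype=float)
    h, w = block.shape
    binary = (block >= block.mean()).astype(int)   # 0 = black, 1 = white
    uh, uw = h // 2, w // 2                        # unit-block size
    units = [binary[:uh, :uw], binary[:uh, uw:],
             binary[uh:, :uw], binary[uh:, uw:]]   # raster-scan order
    bits = [int(u.mean() >= 0.5) for u in units]   # majority vote per unit
    return bits[:3]                                # discard expansion bit b4'
```

The mean threshold works here because every pattern block is guaranteed to contain both colors, so the block mean always lies between the two representative gray values.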

Figure 6.2. Illustration of major steps of two phases of proposed method.

6.3 Generation of Message-rich Code Image

6.3.1 Pattern image creation

Unlike the method in Chapter 5, which transforms a message M into a character message image, the proposed method in this chapter transforms M into a bit stream B of codes, uses binary code patterns to encode the bits of B, and composes the code patterns, each in the form of a pattern block, to form a pattern image similar in appearance to a pre-selected target image. Specifically, each pattern block T consists of several unit blocks Fi, with each Fi representing a bit of the code pattern C which T represents. A main issue here is how to design the code patterns so that the corresponding pattern blocks are suitable for use not only in message embedding but also in block luminance modulation (see Stage 1-3 above). To solve this issue, two characteristics must be provided in the designed code patterns: 1) the number of bits in each code pattern C must be small enough, so that the pattern block T representative of C can keep the local color characteristic of the corresponding target image area; and 2) the colors of the unit blocks Fi of the pattern block T representative of each code pattern C should not all be the same, since otherwise the original bits represented by the unit blocks of the code patterns would be indistinguishable during message extraction.

The first characteristic mentioned above is necessary for the resulting message-rich code image to become more similar to the pre-selected target image.

As an illustration of the necessity of the second characteristic, Figure 6.3 shows an example of indistinguishable binary code patterns: the unit blocks Fi of the pattern block T representative of a code pattern C with bits “0000” are all of an identical color originally and are modulated to be all of another color; then, in the message extraction stage, the bits represented by the modulated pattern block cannot be extracted, since only one color exists in this modulated pattern block and the bits corresponding to this color cannot be uniquely determined (more details are discussed later).

Figure 6.3. An example of indistinguishable binary code patterns.

Therefore, in this study each pattern block representative of a code pattern is set to be of the smallest size of 2×2 unit blocks. Also, a novel bit expansion scheme is proposed to expand every three bits of the bit stream B into four bits which are not all the same, in order to satisfy the second required characteristic of the code pattern. In detail, let the bit stream B be denoted as

B = b11b12b13b21b22b23b31b32b33 …bn1bn2bn3;

and for every three consecutive bits bi1bi2bi3 in B, we first perform a bit expansion operation to get four bits bi1′bi2′bi3′bi4′ by the following rule:

set bi4′ = ¬(bi1 ∨ bi2 ∨ bi3) and bij′ = bij for j = 1, 2, 3, (28)

where ∨ and ¬ denote the bitwise “OR” and “complement” operations, respectively.

The resulting four bits bi1′bi2′bi3′bi4′ will not be all identical, as can be verified by ORing the four bits bi1′ through bi4′, leading to the following result:

bi1′ ∨ bi2′ ∨ bi3′ ∨ bi4′ = (bi1 ∨ bi2 ∨ bi3) ∨ ¬(bi1 ∨ bi2 ∨ bi3) = 1,

so the four bits cannot all be 0’s; and since bi4′ = 0 whenever bi1′ = bi2′ = bi3′ = 1, they cannot all be 1’s either. The resulting eight four-bit combinations, one for each of the eight possible three-bit inputs, are taken as the code patterns which we mentioned previously.


Figure 6.4. Performing proposed bit expansion scheme on every three message bits to yield eight binary code patterns represented by pattern blocks.

Next, we create a 2×2 pattern block Ti = Fi1Fi2Fi3Fi4 with four unit blocks Fi1 through Fi4 to represent the non-all-identical bits bi1′ through bi4′ of each code pattern Ci, where the color of unit block Fij is set to be black if bij′ = 0 or white if bij′ = 1.

Accordingly, as can be seen from Figure 6.4, the colors of the pattern blocks representative of the eight code patterns are all non-identical as well.

Finally, we create a pattern image IP of the size of the target image IT by arranging all the pattern blocks Ti, say n of them, in a raster-scan order. If the n pattern blocks do not fill up IP, then we fill them into IP repeatedly until they do. For example, with the target image IT shown in Figure 6.5(a) and the bit stream B = “110110110100011111010111001...,” the pattern image IP resulting from such filling operations is shown in Figure 6.5(b).
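The pattern-image construction described above can be sketched as follows; the unit-block pixel size and the 0/255 gray levels for black and white are illustrative assumptions.

```python
import numpy as np

def build_pattern_image(bitstream, n_blocks, unit=4):
    """Sketch of pattern-image creation: expand each 3-bit group into a
    2x2 pattern block (black = 0, white = 255) and tile the blocks in
    raster-scan order, repeating them until the n_blocks x n_blocks grid
    is full.  Block and unit sizes are illustrative choices."""
    groups = [bitstream[i:i + 3] for i in range(0, len(bitstream) - 2, 3)]
    blocks = []
    for g in groups:
        b1, b2, b3 = (int(c) for c in g)
        b4 = 1 - (b1 | b2 | b3)                     # Eq. (28) expansion
        pat = np.array([[b1, b2], [b3, b4]]) * 255  # 2x2 unit-block pattern
        blocks.append(np.kron(pat, np.ones((unit, unit))))  # scale units up
    ip = np.zeros((n_blocks * 2 * unit, n_blocks * 2 * unit))
    for k in range(n_blocks * n_blocks):            # repeat blocks to fill IP
        b = blocks[k % len(blocks)]
        r, c = divmod(k, n_blocks)
        ip[r*2*unit:(r+1)*2*unit, c*2*unit:(c+1)*2*unit] = b
    return ip
```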


Figure 6.5. Message-rich code image generation. (a) Target image. (b) Pattern image IP. (c) Y-channel of (a). (d) Modulated pattern image. (e) Enlarged view of the red square region in (d). (f) Resulting message-rich code image.

6.3.2 Block luminance modulation

After the pattern image IP is created, it is “injected” into the target image IT under the constraint that the resulting image retains the visual appearance of IT. For this, we utilize a characteristic of the YCbCr color model to embed IP into the Y-channel of IT. A block luminance modulation technique is used in the same way as in Section 5.3.2, which modulates the mean of each pattern block Ti to be the same as that of the corresponding target block Bi of IT. The resulting modulated pattern image IP′ thus has roughly the visual appearance of the Y-component of the target image IT. For example, Figure 6.5(d) shows a modulated pattern image IP′ so created, which looks like the Y-component of the target image IT shown in Figure 6.5(c); and Figure 6.5(e) shows an enlarged view of the part of Figure 6.5(d) enclosed by the red rectangle.

The detailed steps of block luminance modulation are omitted here, since they are the same as those in Section 5.3.2. After the pattern image IP is modulated, the overall gray appearance of the modulated pattern image IP′ and that of the Y-component of IT are roughly the same. Accordingly, we replace the Y-component of IT with IP′ to finally generate the desired message-rich code image IC, which has the visual color appearance of IT, as shown by the example in Figure 6.5(f).

Later, when conducting message extraction, the message bit stream can be extracted from the Y-component of a captured version of IC by classifying the pixels of each pattern block into two classes according to their Y values: black and white.

However, the same issue described in Section 5.3.2 may occur here: if the two representative values r1 and r2 are too close, it will be difficult to “separate” them in the classification process. Therefore, an adjustment of the representative values r1 and r2 is conducted, resulting in r1′ and r2′, so that the absolute difference between r1′ and r2′ becomes no smaller than a pre-defined contrast threshold ε ≥ 0. For example, Figure 6.6 shows a pattern block resulting from modulations with different values of ε, from which one can see that the two colors in a modulated pattern block are more easily distinguished when ε is larger. The details of the proposed representative-value adjustment scheme can be found in Section 5.3.2, so they are omitted here.
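A simplified sketch of the block luminance modulation with the contrast adjustment is given below. The exact formulas, Eqs. (14), (20), and (23) of Section 5.3.2, are not reproduced in this chapter, so the two representative values are approximated here by the means of the darker and brighter halves of the target block, widened symmetrically about their midpoint when their gap falls below ε.

```python
import numpy as np

def modulate_block(pattern_block, target_block, eps=40.0):
    """Illustrative stand-in for the block luminance modulation of
    Section 5.3.2; the exact Eqs. (14), (20), (23) are not reproduced."""
    t = np.asarray(target_block, dtype=float)
    m = t.mean()
    lo, hi = t[t <= m], t[t > m]
    r1 = lo.mean() if lo.size else m      # darker representative value
    r2 = hi.mean() if hi.size else m      # brighter representative value
    # Contrast adjustment: widen the gap about the midpoint until it is
    # at least the contrast threshold eps.
    if r2 - r1 < eps:
        mid = (r1 + r2) / 2.0
        r1, r2 = mid - eps / 2.0, mid + eps / 2.0
    r1, r2 = max(r1, 0.0), min(r2, 255.0)
    # Paint black pattern pixels with r1' and white ones with r2'.
    return np.where(np.asarray(pattern_block) == 0, r1, r2)
```

Painting the two pattern colors symmetrically about the target block's mean keeps the modulated block's mean close to that of the target block, which is what preserves the target image's appearance.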

6.3.3 Algorithm for message-rich code image creation

Based on the above discussions, a detailed algorithm for message-rich code image creation is described as follows.

Algorithm 6.1. Message-rich code image creation.

Input: a target image IT, a message M, and a contrast threshold value ε.

Figure 6.6. Modulated pattern blocks resulting from uses of different contrast threshold values ε for the absolute difference between the two adjusted representative values r1′ and r2′. (a) ε = 0. (b) ε = 5. (c) ε = 10. (d) ε = 20. (e) ε = 30. (f) ε = 40.

Output: a message-rich code image IC.

Steps:

Stage 1 - transforming the message into a bit stream.

Step 1. Transform message M into a bit stream B.

Stage 2 - generating the pattern image.

Step 2. Split B into n three-bit segments as b11b12b13b21b22b23 … bn1bn2bn3.
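Steps 1 and 2 can be sketched together as below. The thesis does not spell out the text-to-bits coding at this point, so an 8-bit ASCII encoding with zero padding to a multiple of three bits is assumed.

```python
def message_to_bitstream(message):
    """Sketch of Steps 1-2: encode the message as bits and split the
    stream into 3-bit segments.  The 8-bit ASCII encoding and the zero
    padding are assumptions, not the thesis's exact coding."""
    bits = "".join(format(ord(c), "08b") for c in message)
    bits += "0" * (-len(bits) % 3)            # pad to a multiple of 3
    return [bits[i:i + 3] for i in range(0, len(bits), 3)]
```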

Step 3. Expand every three bits bi1bi2bi3 in B into four bits bi1′bi2′bi3′bi4′ according to (28) and generate the corresponding pattern block Ti according to the rules shown in Figure 6.4.

Step 4. Align all the generated pattern blocks Ti in a raster-scan order to form a pattern image IP of the size of target image IT, with each side having NT pattern blocks; and if the result does not fill up IP, repeat the filling until it does.

Stage 3 - modulating the pattern image.

Step 5. Divide the Y-component of target image IT into target blocks {B1, B2, B3, …, BN} where N = NT×NT.

Step 6. For each pattern block Ti in pattern image IP, generate a modulated pattern block Ti′′ as follows.

(a) Compute two representative values r1 and r2 of the corresponding target block Bi according to (14).

(b) Compute o = |r2 r1|, and use it and the input contrast threshold  to obtain two adjusted representative values r1′ and r2′ from r1 and r2 according to (20) and (23).

(c) For each pixel Pt in Ti, if Pt is black, set the value pt'' of the corresponding pixel Pt'' in Ti'' as pt′′ = r1′; else, set pt′′ = r2′.

Step 7. Compose all the resulting Ti'' to get a modulated pattern image, denoted by IP′.

Stage 4 - injecting the pattern image into the target image.

Step 8. Replace the Y-component of IT with IP′ to generate the desired message-rich code image IC as the output.

6.4 Message Extraction

6.4.1 Localization of message-rich code image and inverse perspective transform

The localization scheme is the same as the one utilized in the previous method; its description can be found in Section 5.4.1. In short, we apply the Hough transform and polygonal approximation to find the largest non-white quadrangle Q in the captured image IC′, as shown by the example in Figure 6.7(a). Next, an inverse perspective transform is performed on Q to correct the distortion. The result of conducting this on Figure 6.7(a) is shown in Figure 6.7(b). Finally, the Y-component of the resulting Q is taken as an intermediate result, which we call the captured modulated pattern image and denote by IP′′.
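The inverse perspective transform applied to the localized quadrangle can be sketched as below; the quadrangle detection itself (Hough transform plus polygonal approximation) is not reproduced. The homography is estimated from the four corner correspondences by direct linear transformation with h33 fixed to 1, a standard formulation rather than the thesis's own notation.

```python
import numpy as np

def homography_from_quad(src, dst):
    """Solve for the 3x3 homography H mapping the four detected
    quadrangle corners `src` to the corners of an upright rectangle
    `dst`, using u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, x, y):
    # apply H to one point in homogeneous coordinates
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w
```

In practice the correction warps every pixel of the rectified image back through the inverse of H and samples the captured image there, which is what removes the perspective distortion.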


Figure 6.7. Localization and correction of perspective distortion in a captured message-rich code image. (a) Localized message-rich code image portion (enclosed by the red rectangle). (b) Result of perspective distortion correction applied to the region enclosed by the red rectangle in (a).

6.4.2 Block number identification and block segmentation
