
4.5.3 Possible extensions for the proposed method using natural language processing

To enable automatic construction of the collaborative writing database and to generalize the proposed method to multiple languages, the four characteristics of collaborative writing mentioned previously were analyzed under the assumption that only word-sequence corrections are made in a revision. However, the real collaborative writing process is much more complicated and language-dependent, so data hiding via collaborative writing is still worth intensive research.

Many existing methods in natural language processing [57]-[59], [69] may be applied to extend the proposed method. For example, some original word sequences in an input cover document may be polysemous, so selecting word sequences from DBcw by the proposed method to replace such polysemous word sequences might produce a meaningless context. One possible remedy is to analyze the distributional similarity of word sequences [69] to find replacement word sequences that do not cause this problem, where distributional similarity means the similarity in meaning of words that appear in the same contexts in documents.
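To illustrate the idea, the following minimal Python sketch scores candidate replacement word sequences by the similarity of their usage contexts; it uses a toy bag-of-context-words model with cosine similarity rather than the exact measure of [69], and all identifiers in it are illustrative only.

import math
from collections import Counter

def context_vector(corpus_sentences, phrase, window=3):
    # Count the words co-occurring with `phrase` within +/- `window`
    # tokens; this approximates the phrase's distributional context.
    vec = Counter()
    for sent in corpus_sentences:
        tokens = sent.split()
        for i, tok in enumerate(tokens):
            if tok == phrase:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                vec.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return vec

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    num = sum(u[w] * v[w] for w in set(u) & set(v))
    den = math.sqrt(sum(c * c for c in u.values())) * \
          math.sqrt(sum(c * c for c in v.values()))
    return num / den if den else 0.0

def best_replacements(corpus_sentences, original, candidates, k=3):
    # Keep the k candidates whose usage contexts resemble the original's,
    # so replacements are less likely to yield a meaningless context.
    ref = context_vector(corpus_sentences, original)
    scored = [(cosine(ref, context_vector(corpus_sentences, c)), c)
              for c in candidates]
    return [c for _, c in sorted(scored, reverse=True)[:k]]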

Moreover, language models [57]-[59], such as the dependency trees used in grammatical analysis, may also be built so that messages can be embedded during revision generation based on the model.

4.6 Summary

A new data hiding method via the creation of fake collaboratively-written documents on collaborative writing platforms has been proposed. An input secret message is embedded in the revision history of the resulting stego-document through a simulated collaborative writing process with multiple virtual authors. With this camouflage, people will take the stego-document as a normal collaborative writing work and are not expected to realize the existence of the hidden message. To generate simulated revisions more realistically, a collaborative writing database was mined from Wikipedia, and the Huffman coding technique was used to encode the mined word sequences in the database according to the statistics of the words. Four characteristics of article revisions were identified, including the author of each revision, the number of corrected word sequences, the content of the corrected word sequences, and the word sequences replacing the corrected ones. Related problems arising in utilizing these characteristics for data hiding have been solved skillfully, resulting in an effective multi-way method for hiding secret messages in the revision history. Moreover, because the word sequences used in the revisions were collected from a great many real people's writings on Wikipedia, and because Huffman coding based on usage frequencies is applied to encode the word sequences, the resulting stego-document is more realistic than those of other text steganography methods, such as word-shift methods [30], methods based on non-displayed characters [31], synonym replacement methods [35]-[37], etc. The experimental results have shown the feasibility of the proposed method. Future work may be directed to analyzing more characteristics of collaborative writing works or establishing appropriate language models [57]-[59] for more effective data hiding and other applications.

Chapter 5

A New Data Hiding Technique via Message-rich Character Image for Automatic Identification and Data Capture Applications

5.1 Introduction

With the advance of technology, machines have long been used to read information in the real world automatically for various applications, such as optical character recognition (OCR), license plate recognition, and supermarket checkout systems.

Recently, many more methods have been developed for this purpose, and they are collectively known as AIDC techniques [40]. The processed information is presented in various forms, some visible (like barcodes) and others invisible (like watermarks hidden in images). Such forms of multimedia, which enable the vision of pervasive communication, are collectively termed message-rich multimedia in this study, as mentioned previously.

One technique that realizes the use of message-rich multimedia for the AIDC purpose is barcode reading. Being attached to objects, barcodes represent machine-readable data by patterns of lines, rectangles, dots, etc. The data encoded into such barcodes can be extracted using barcode reading techniques [24]-[27]. But most types of barcodes, such as Code 39 [20], PDF417 [21], the QR code [22], and the Data Matrix code [23] shown in Figure 5.1, merely encode information, yielding unsightly images with no aesthetic appeal. If a barcode not only contains the encoded information but also has the visual appearance of an art image, its artistic effect will be more attractive than those of conventional barcodes.

Data hiding is an alternative pervasive communication technique for the AIDC purpose that embeds data into cover media for applications like covert communication, copyright protection, authentication, etc. With the advance of computer technology, many data hiding methods have been applied to digital cover media, such as images, videos, audios, and text documents. However, these data hiding methods transfer data via digital files only. Furthermore, they are mostly insufficient to enable the vision of pervasive communication when one wants to interact with the surrounding environment. Such methods may be called "digital" data hiding.


Figure 5.1. Examples of commonly-used barcodes. (a) Code 39. (b) PDF417. (c) QR code. (d) Data Matrix code.

Another type of data hiding, which may be called "hardcopy" data hiding, embeds information into so-called image barcodes using halftone techniques [17]-[19]. These barcodes have the visual appearances of other images, and the encoded information can be decoded from their hardcopy versions acquired by scanners. That is, the encoded information can survive print-and-scan "attacks."

However, if one uses a mobile device to capture images of hardcopy image barcodes, the information might not be decoded successfully, since the captured image suffers from more types of distortion than a scanned one, such as geometric deformation, noise, blurring, etc. Also, message carriers other than printed paper, such as the screens of display devices, cannot be used to carry the information, since the halftoning methods are based on printing techniques.

In contrast, the method proposed in this study can decode a message carried by an image captured from printed paper or a display screen with a mobile-device camera, achieving the effect of pervasive communication.

Specifically, a new data hiding technique via a message-rich character image, which is created from an artistic target image for use as a carrier of a given message, is proposed. The image may be printed as a hardcopy for any purpose, then "re-imaged" by a mobile-phone camera and "understood" by the automatic identification and data capture (AIDC) techniques [40] proposed in this study. Message-rich character images may take the forms of documents, labels, posters, etc. Also, such images may have the visual appearances of artistic-flavored photos, pictures, or paintings, which are more attractive to humans than those produced by conventional AIDC techniques like barcodes, QR codes, etc.

5.2 Idea of Proposed Method

The proposed message-rich character image not only carries the content of a given message but also has the artistic effect of being visually similar to a pre-selected target image. The use of message-rich character images for AIDC purposes is illustrated in Figure 5.2, which includes two phases. In the first phase, the message-rich character image IC is created from a message M and a target image IT by three steps: (1) transform M into a message image IM consisting of the characters of the message content; (2) modulate the gray values of each character-fragment Fi of IM into two values calculated from the Y-channel values of the corresponding target block Bi of IT, resulting in a modulated message image IM′; (3) replace the Y-channel of IT with IM′ to get the desired IC.

In the second phase, the message M is extracted from a captured version Id of the printed message-rich character image IC by three steps: (1) localize and segment out the region of the original part of IC in Id; (2) perform an inverse perspective transform to correct the geometric distortion in the segmented region; (3) identify the blocks in the corrected result, denoted by IM′′, binarize them, and perform OCR to extract the message M from them.

5.3 Generation of Message-rich Character Image

5.3.1 Message image creation

Unlike most barcode systems, which encode message contents by patterns (dots, lines, etc.), the proposed method converts a message M into a set of binary character shapes drawn from a database, as illustrated in Figure 5.3(a). Next, a message image IM of the size of the target image IT is created by aligning the character shapes, plus an ending pattern as shown in Figure 5.3(b), in a raster-scan order. For example, with the target image IT shown in Figure 5.3(c) and the message M = "ABCDEFGH…," the resulting message image IM is as shown in Figure 5.3(d). If the aligned result cannot fill up IM, the character shapes are repeated.
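The following minimal Python sketch illustrates this layout step; it assumes a hypothetical glyph database shapes that maps each character (and an 'END' key for the ending pattern) to a g×g binary array, with the image side length a multiple of g.

import numpy as np

def create_message_image(message, shapes, img_size, glyph_size):
    # shapes: hypothetical database mapping each character (plus 'END')
    # to a glyph_size x glyph_size binary array (0 = black, 1 = white).
    g = glyph_size
    cols, rows = img_size // g, img_size // g
    units = [shapes[c] for c in message] + [shapes['END']]
    IM = np.ones((rows * g, cols * g), dtype=np.uint8)
    for k in range(rows * cols):
        glyph = units[k % len(units)]   # repeat the shapes to fill up IM
        r, c = divmod(k, cols)          # raster-scan placement order
        IM[r * g:(r + 1) * g, c * g:(c + 1) * g] = glyph
    return IM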

5.3.2 Block luminance modulation

After the message image IM is created, it is "injected" into the target image IT under the constraint that the resulting image retains the visual appearance of IT. For this, we utilize a characteristic of the YCbCr color model, namely that the luminance component Y is independent of the other components [70], to embed IM into the Y-channel of IT. This also alleviates the problem of illumination variation encountered in the later stage of message extraction. A block luminance modulation technique is proposed here, which divides the message image IM into character-fragments Fi and modulates the mean of each Fi to be the same as that of the corresponding target block Bi of IT. The resulting modulated message image IM′ looks like the Y-component of the target image IT. For example, Figure 5.3(f) shows the modulated message image IM′ created for the Y-component of the target image shown in Figure 5.3(e), and Figure 5.3(g) shows a magnified view of part of Figure 5.3(f) (the red portion).

More specifically, the message image IM and the Y-component of IT are first divided into character-fragments and target blocks, respectively, where the size of each block is 1/4 that of a character image. The character-fragments Fi are then fitted into the target blocks Bi in a random way controlled by a key K. Secondly, two representative values r1 and r2 are computed from each target block Bi according to (14), and the pixels of Fi are modulated to these two values so that the mean of the modulated fragment equals that of Bi. This means that the overall gray appearances of the modulated message image IM′ and the Y-component of IT are roughly the same, as already mentioned.

Accordingly, we replace the Y-component of IT with IM′ to finally generate the desired message-rich character image IC, which has the visual color appearance of IT, as shown by the example in Figure 5.3(h).


Figure 5.3. Message-rich character image generation. (a) Image of character "T." (b) Ending pattern. (c) Target image. (d) Message image. (e) Y-channel of (c). (f) Modulated message image. (g) Magnified view of red square region in (f). (h) Resulting printed message-rich character image.

Accordingly, in the later message extraction process, the message characters can be extracted from the Y-component of a captured version of IC by classifying the pixels of each block into two groups according to their Y values, with the two groups representing the character and non-character parts, respectively; an OCR technique is then applied to recognize the characters. However, if the two representative values r1 and r2 are too close, they are hard to separate in the message extraction phase. Therefore, the representative values r1 and r2 are adjusted into r1′ and r2′ so that the absolute difference between r1′ and r2′ becomes no smaller than a pre-defined contrast threshold δ ≥ 0. For example, Figure 5.4 shows character-fragments resulting from modulations with different values of δ, from which one can see that the two colors in the modulated character-fragments are more easily distinguished when δ is larger.


Figure 5.4. Modulated character-fragments resulting from the use of different contrast threshold values δ for the difference between the two representative values r1 and r2. (a) δ = 0. (b) δ = 10. (c) δ = 20. (d) δ = 30. (e) δ = 40. (f) δ = 50.

The details of the proposed representative-value adjustment scheme are described in the following. Note that, after the adjustment, the absolute difference between r1′ and r2′ must be no smaller than the contrast threshold δ, and that the mean of the modulated character-fragment Fi′′ based on r1′ and r2′ must be identical to that of the target block Bi. Thus, the values of r1′ and r2′ must satisfy the following two constraints:

|r2′ − r1′| ≥ δ; (18)

mean(Fi′′) = mean(Bi). (19)

Two possible cases can be identified in the adjustment process: 1) the original absolute difference δo = |r2 − r1| of r1 and r2 is already no smaller than δ, i.e., δo ≥ δ; and 2) the reverse, i.e., δo < δ. In the first case, the values of r1 and r2 satisfy (18) and (19) automatically, so they may be used directly as r1′ and r2′, respectively; i.e., we have the rule:

if δo ≥ δ, then set r1′ = r1 and r2′ = r2. (20)

For the second case, with δo < δ, the absolute difference between the two representative values must be increased by at least δ − δo for the resulting r1′ and r2′ to satisfy constraint (18). Specifically, assuming r1 ≤ r2, let the adjustment value of r1 be t, so that r1′ = r1 − t; then the adjustment value of r2 should be at least (δ − δo) − t, so that r2′ = r2 + [(δ − δo) − t]. Such value adjustments are discussed further in Section 5.5. The yet-unknown value t may then be computed as follows. Let n1 and n2 denote the numbers of pixels in the fragment taking the values r1′ and r2′, respectively. Then the means of the fragment before and after the adjustment must be equal:

n1(r1 − t) + n2{r2 + [(δ − δo) − t]} = n1r1 + n2r2, (21)

where the equality is based on the use of (19). Solving (21) for t, we get:

t = n2(δ − δo)/(n1 + n2). (22)

Accordingly, we can get:

r1′ = r1 − t; r2′ = r2 + (δ − δo) − t. (23)
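A minimal Python sketch of this adjustment, following (18)-(23) and assuming r1 ≤ r2, is given below; n1 and n2 are the numbers of fragment pixels assigned to r1 and r2, respectively.

def adjust_representatives(r1, r2, n1, n2, delta):
    # Adjust the two representative values so that their difference is
    # at least delta (18) while the fragment mean is preserved (19);
    # assumes r1 <= r2.
    d_o = abs(r2 - r1)
    if d_o >= delta:                        # case 1: rule (20)
        return r1, r2
    t = n2 * (delta - d_o) / (n1 + n2)      # case 2: t from (21)-(22)
    return r1 - t, r2 + (delta - d_o) - t   # adjusted values (23)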

5.3.3 Algorithm for message-rich character image creation

Based on the above discussions, a detailed algorithm for message-rich character image creation is described as follows.

Algorithm 5.1. Message-rich character image creation.

Input: a target image IT, a message M, and a contrast threshold value δ.

Output: a message-rich character image IC.
Steps:

Stage 1 - transforming the message into a message image.

Step 1. Convert M into a set of binary character shapes drawn from a database.

Step 2. Create the message image IM of the size of the target image IT by aligning the character shapes plus an ending pattern in a raster-scan order.

Step 3. If the alignment result of Step 2 cannot fill up IM, repeat the character shapes to fill it.

Stage 2 - modulating the message image.

Step 4. Divide the Y-component of target image IT into target blocks {B1, B2, B3, …, BN} where N = NT×NT.

Step 5. For each fragment block Fi in message image IM, generate a modulated fragment block Fi′′ as follows.

(a) Compute two representative values r1 and r2 of the corresponding target block Bi according to (14).

(b) Compute δo = |r2 − r1|, and use it together with the input contrast threshold δ to obtain two adjusted representative values r1′ and r2′ from r1 and r2 according to (20) and (23).

(c) For each pixel Pt in Fi, if Pt is black, set the value pt'' of the corresponding pixel Pt'' in Fi'' as pt′′ = r1′; else, set pt′′ = r2′.

Step 6. Compose all the resulting Fi'' to get a modulated message image, denoted by IM′.

Stage 3 - injecting the message image into the target image.

Step 7. Replace the Y-component of IT with IM′ to generate the desired message-rich character image IC as the output.
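For concreteness, Stage 3 can be sketched in a few lines of Python with OpenCV, assuming IT is a BGR image and IM′ is a uint8 array of the same height and width (OpenCV stores the model as YCrCb, with Y as channel 0):

import cv2

def inject_message(IT_bgr, IM_mod):
    # Replace the luminance channel of the target image with the
    # modulated message image and convert back to a color image.
    ycrcb = cv2.cvtColor(IT_bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[:, :, 0] = IM_mod                 # Y-channel replacement (Step 7)
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)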

5.4 Message Extraction

The various techniques proposed for extracting the message embedded in a message-rich character image are described first, and an algorithm combining them is given at the end.

5.4.1 Message-rich character image localization and inverse perspective transform

Assume that the message-rich character image IC is printed and posted or displayed against a white background, and that the captured image Id contains only the original image of IC and the background. The first assumption may be removed simply by adding a white surrounding zone to IC. To extract the message from Id, we must localize the region of IC in Id. For this, we apply the Hough transform and polygonal approximation to find the largest non-white quadrangle Q in Id, as shown by the example in Figure 5.5(a). Image Id will also suffer from perspective distortion if the axis of the camera is not directed perpendicularly toward the plane of the message-rich character image IC [27] during image acquisition, as seen in Figure 5.5(a) as well. As a remedy, an inverse perspective transform is performed on Q to correct the distortion. The result of conducting this on Figure 5.5(a) is shown in Figure 5.5(b).

Finally, the Y-component of the resulting Q is taken as an intermediate result, which we call the captured modulated message image and denote it by IM′′.
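A simplified Python/OpenCV sketch of this localization and rectification step is given below; it stands in for the Hough-transform-based localization described above by approximating the largest non-white contour with a quadrangle before applying the inverse perspective transform.

import cv2
import numpy as np

def localize_and_rectify(Id_bgr, out_size=512):
    # Find the largest non-white region and approximate it by a quadrangle.
    gray = cv2.cvtColor(Id_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)
    quad = cv2.approxPolyDP(largest, 0.02 * cv2.arcLength(largest, True), True)
    if len(quad) != 4:
        raise ValueError("no quadrangle found")
    pts = quad.reshape(4, 2).astype(np.float32)
    # Order the corners: top-left, top-right, bottom-right, bottom-left.
    s, d = pts.sum(axis=1), np.diff(pts, axis=1).ravel()
    src = np.float32([pts[s.argmin()], pts[d.argmin()],
                      pts[s.argmax()], pts[d.argmax()]])
    dst = np.float32([[0, 0], [out_size - 1, 0],
                      [out_size - 1, out_size - 1], [0, out_size - 1]])
    # Inverse perspective transform to correct the geometric distortion.
    H = cv2.getPerspectiveTransform(src, dst)
    rect = cv2.warpPerspective(Id_bgr, H, (out_size, out_size))
    return cv2.cvtColor(rect, cv2.COLOR_BGR2YCrCb)[:, :, 0]  # Y-channel: IM''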


Figure 5.5. Localization and correction of perspective distortion in a captured message-rich character image. (a) Localized message-rich character image portion (enclosed by red rectangle). (b) Result of perspective distortion correction applied to the enclosed region in (a).

5.4.2 Block number identification and block segmentation

To identify the blocks in IM′′ so that they can be binarized and OCR applied to their contents, an idea similar to the Hough transform [71] is adopted: the statistics of the pixels' gradient values are used to estimate the number NS of blocks in the horizontal or vertical direction in IM′′, because the pixels on the splitting lines between blocks usually have larger gradient values.

In more detail, at first the gradient value gxy of each pixel Rxy with value rxy at coordinates (x, y) in IM′′ is computed by a Sobel operator [72]:

gxy = |(rx+1,y−1 + 2rx+1,y + rx+1,y+1) − (rx−1,y−1 + 2rx−1,y + rx−1,y+1)| + |(rx−1,y+1 + 2rx,y+1 + rx+1,y+1) − (rx−1,y−1 + 2rx,y−1 + rx+1,y−1)|. (24)

Next, for each possible value nj of NS, the distance dj between the splitting lines of every two possible adjacent blocks is computed as dj = L/nj, where L is the side length of the square-shaped IM′′. Then, the horizontal or vertical lines separated by the distance dj are taken as candidate splitting lines, where the positions of these candidate splitting lines, in image coordinates, are computed by:

x = u×dj and y = v×dj, (25)

where u = 1 ~ L/dj and v = 1 ~ L/dj, respectively. Also, the average gradient value AGnj of the pixels on each candidate spitting line is computed as:

AGnj = (Σ(x,y)∈Sj gxy)/|Sj|, (26)

where Sj denotes the set of all pixels lying on the candidate splitting lines specified by (25).

Finally, the value nj of NS with the largest average gradient value is taken as the desired number of blocks of IM′′ in the horizontal or vertical direction, and the division of IM′′ into blocks is conducted accordingly. For example, Figure 5.6(a) shows a captured modulated message image IM′′, Figure 5.6(b) is the image of the computed gradient values, and Figure 5.6(c) illustrates the average gradient values for different NS, where the nj with the largest average gradient value is seen to be 16 (indicated by the orange arrow). Therefore, the found value for NS is 16. The corresponding image division result is shown in Figure 5.6(d).
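The following Python/OpenCV sketch mirrors the estimation procedure of (24)-(26); the candidate range for NS is an assumption for illustration (even values, since each character spans 2×2 blocks), and the image borders are not counted as splitting lines.

import cv2
import numpy as np

def estimate_block_count(IM2, candidates=range(4, 33, 2)):
    # Gradient magnitude as in (24), via horizontal and vertical Sobel.
    gx = cv2.Sobel(IM2, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(IM2, cv2.CV_64F, 0, 1, ksize=3)
    g = np.abs(gx) + np.abs(gy)
    L = IM2.shape[0]                        # IM'' is square with side L
    best_n, best_avg = None, -1.0
    for n in candidates:
        d = L / n                           # splitting-line spacing dj = L/nj
        lines = [int(round(u * d)) for u in range(1, n)]    # positions (25)
        avg = np.concatenate([g[lines, :].ravel(),
                              g[:, lines].ravel()]).mean()  # AGnj as in (26)
        if avg > best_avg:
            best_n, best_avg = n, avg
    return best_n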


Figure 5.6. Message extraction. (a) Captured modulated message image IM′′. (b) Gradient values of (a). (c) Average gradient values of pixels on candidate splitting lines for different NS. (d) Image division result according to the determined number of blocks NS = 16. (e) Fragment reordering result of (d). (f) Binarization result of (e). (g) OCR result of (f). (h) Extracted message.

5.4.3 Binarization and optical character recognition

After the blocks of IM′′ are segmented out, the character-fragments of the message image IM may be recovered from the blocks by using the key K mentioned previously. Denote the resulting image by IM′′′. Then, moment-preserving thresholding [73] is applied to each recovered fragment Fi′ to binarize it, and every four mutually-connected binarized blocks are grouped to form a character image Ici′, since a character image was divided into four blocks in the message image generation phase.
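A minimal Python sketch of moment-preserving (Tsai) thresholding [73], the binarization used here, is given below; it picks two representative gray levels and a below-threshold population fraction that preserve the first three gray-level moments of the block.

import numpy as np

def moment_preserving_threshold(block):
    # Tsai's moment-preserving thresholding: find z0 < z1 and fraction p0
    # preserving the first three moments, then threshold at the p0-tile.
    v = block.astype(np.float64).ravel()
    m1, m2, m3 = v.mean(), (v ** 2).mean(), (v ** 3).mean()
    cd = m2 - m1 * m1
    if cd <= 1e-12:                         # flat block: nothing to separate
        return block >= m1
    c0 = (m1 * m3 - m2 * m2) / cd
    c1 = (m1 * m2 - m3) / cd
    root = np.sqrt(max(c1 * c1 - 4.0 * c0, 0.0))
    z0, z1 = (-c1 - root) / 2.0, (-c1 + root) / 2.0
    p0 = (z1 - m1) / (z1 - z0)              # fraction of pixels mapped to z0
    thresh = np.percentile(v, 100.0 * float(np.clip(p0, 0.0, 1.0)))
    return block > thresh                   # True = brighter, non-character part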

Next, a similarity degree sdij is computed between each Ici′ and each reference character image in the character-shape database. Finally, each Ici′ is recognized using an OCR scheme according to the computed similarity degrees: the character whose reference image yields the largest similarity is taken as the message character delivered by Ici′. For example, the recovered original message image IM with its character-fragments reordered using the key is shown in Figure 5.6(e), which, after being binarized, results in Figure 5.6(f). The OCR result of Figure 5.6(f) is shown in Figure 5.6(g), and the finally extracted message characters are shown in Figure 5.6(h).
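The following sketch illustrates such similarity-based recognition, with a simple pixel-agreement ratio standing in for sdij, whose exact definition is not reproduced in this excerpt; reference shapes are assumed to be binary arrays of the same size as the character images.

import numpy as np

def recognize_character(char_img, references):
    # char_img: binary character image; references: dict mapping each
    # character to a binary reference shape of the same size.
    best_char, best_sd = None, -1.0
    for ch, ref in references.items():
        sd = float(np.mean(char_img == ref))   # pixel-agreement ratio
        if sd > best_sd:
            best_char, best_sd = ch, sd
    return best_char                        # character with the largest sdij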

5.4.4 Message extraction algorithm

A detailed message extraction algorithm is as follows.

Algorithm 5.2. Message extraction.

Input: a captured version Id of a message-rich character image.

Output: the message M embedded originally in Id.
Steps:

Stage 1 - localizing the message-rich character image.

Step 1. Find the largest non-white quadrangle Q in Id by the Hough transform and polygonal approximation.

Stage 2 - correcting geometric distortion.

Step 2. Perform an inverse perspective transform on Q to correct the perspective distortion and take the Y-component of Q as the captured modulated message image IM′′.

Stage 3 - identifying blocks in the message image.

Step 3. Compute the gradient value gxy of each pixel Rxy in IM′′ according to (24).

Step 4. For each possible value nj of NS, compute the average gradient value AGnj of the pixels on the candidate splitting lines according to (26).

Step 5. Select the value nj yielding the largest AGnj for use as the desired number NS of blocks of IM′′ in the horizontal or vertical direction, and divide IM′′ accordingly into blocks.

Stage 4 - binarizing the blocks to extract the message.

Step 6. Binarize each block Fi' by moment-preserving thresholding [73].

Step 7. Group every four mutually-connected binarized blocks to form character images Ici′.

Step 8. Extract the corresponding character from each character image Ici′ by the following steps.

(a) Compute the similarity degree sdij between each Ici′ and each reference character image in the database.

(b) Take the character whose reference image yields the largest sdij as the message character delivered by Ici′.