Chapter 2 Integrity Authentication of Grayscale Document Images Surviving
2.1.2 Properties of Document Images Attacked by Print-And-Scan
If an image undergoes print-and-scan operations, two categories of distortion arise: geometric transformations and pixel-value distortions. Geometric transformations include translation, rotation, cropping, and scaling. Pixel-value distortions are caused by (1) luminance, contrast, gamma correction, and chrominance variations, and (2) blurring of neighboring pixels. These are typical effects of printers and scanners; they are perceptible to human eyes and affect the visual quality of a rescanned image [9]. Geometric transformations do not significantly affect the visual quality, but pixel-value distortions do. Figure 2.1 shows an original image and a rescanned version of it.
Embedding watermark signals is one way to authenticate an image. The embedded authentication signal must have a certain degree of robustness against pixel-value distortions and geometric operations. To embed authentication signals that survive print-and-scan operations in a grayscale document image, features that are invariant to geometric transformations should be adopted. Therefore, it is preferable to use semi-fragile watermarks to embed the authentication signals.
2.1.3 Properties of Grayscale Document Images
Document images are obtained by scanning printed or typewritten documents. One feature of document images is that the background contains many large white blocks, so modifications to the gray values of background pixels are easily perceived. Another feature is that document images are usually text-dominated and show clear contrast between the background and the foreground. Because the gray values are concentrated in a few levels, image processing applied to document images is easily noticed.
A grayscale image has only one channel, the gray channel. Each pixel value of this channel is an integer between 0 and 255. The proposed method focuses on grayscale document images, so the main issue in this chapter is how to embed data in the single channel of a grayscale document image.
Figure 2.1 A grayscale document image and a reproduced image. (a) A grayscale Chinese document image. (b) A grayscale English document image. (c) Reproduced image of (a) with quality of 100dpi. (d) Reproduced image of (b).
2.2 Idea of Proposed Authentication Method
In the procedure of embedding authentication signals, a document image is first divided into non-overlapping blocks. Instead of processing the image in blocks of a constant size, we treat a character or a word as a basic block, obtained by a connected component merging technique. Second, we assign a number index to each word block and then embed a line in it as a semi-fragile watermark by decreasing the gray values of the pixels on the line. The coefficients of the line equation are created from (1) a secret key, (2) an RHG value [10], which is used to assign a gray value G to a binary image block, and (3) the block number index, in order to enhance the security of authentication.
For the extraction of authentication signals, the pre-processing procedure for acquiring blocks is the same as in embedding. We then extract the pixels with the least gray value in each block and apply a line fitting technique to obtain the equation of a line. In addition, we calculate another line equation for each block from the key, the RHG value, and the number index. By comparing the embedded line with the calculated one in each block, we can verify the integrity of a grayscale document image.
2.3 Authentication Signal Generation and Embedding
To generate authentication signals for a grayscale document image, our method first needs some pre-processing to reduce distortion. The method therefore has two stages: the pre-processing stage and the authentication signal embedding stage.
2.3.1 Pre-processing Stage
A. Bi-level thresholding:
The first step in the pre-processing stage is to remove noise and distortion. We apply bi-level thresholding to increase the sensitivity of the region growing technique applied later. A threshold value maps the 256 possible pixel values to two values, 0 and 255, and the corresponding pixels are called black and white pixels, respectively.
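The thresholding step can be illustrated with a minimal sketch. It assumes the image is stored as a list of rows of integers; the function name `bilevel_threshold` and the default threshold of 128 are hypothetical choices for illustration, not values taken from the text.

```python
def bilevel_threshold(image, threshold=128):
    """Map every gray value to 0 (black) or 255 (white).

    `image` is a list of rows of integers in 0..255. Pixels darker than
    `threshold` become black; the rest become white. The default of 128
    is an assumption, not a value given in the text.
    """
    return [[0 if value < threshold else 255 for value in row] for row in image]


page = [
    [250, 240, 30],
    [245, 20, 10],
    [255, 200, 128],
]
binary = bilevel_threshold(page)
# binary == [[255, 255, 0], [255, 0, 0], [255, 255, 255]]
```

A pixel exactly equal to the threshold is treated as white here; the opposite convention would work equally well as long as embedding and extraction agree.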
B. Division of image into blocks by connected component merging:
If we process an image in blocks of a fixed size, the content of each block changes after the image is enlarged or shrunk, so fixed-size blocks are not suitable for processing an image that must survive scaling. Instead, we use connected component merging, also called region growing, in both the data embedding and data extraction processes to determine the extent of each block, so that the blocks defined in the two processes cover identical ranges.
Region growing is a procedure that groups pixels or subregions into large regions based on predefined criteria. The basic concept is to start with a set of “seed” points and from them grow regions by appending to each seed those neighboring pixels that have properties similar to the seed [5]. Figure 2.2 shows an example of a character segmented as a basic block by the region growing method.
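One possible way to realize region growing on a thresholded image is sketched below. It assumes black pixels have value 0, uses 8-connectivity as the similarity criterion, and represents each block by its inclusive bounding box; these choices and the name `find_blocks` are assumptions made for illustration.

```python
from collections import deque


def find_blocks(binary):
    """Return bounding boxes (left, top, right, bottom) of black regions.

    Each region is grown from a black seed pixel (value 0) by repeatedly
    appending 8-connected black neighbours. Box coordinates are inclusive.
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for sy in range(height):
        for sx in range(width):
            if binary[sy][sx] != 0 or seen[sy][sx]:
                continue
            # (sx, sy) is an unvisited black pixel: use it as a seed.
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            left, top, right, bottom = sx, sy, sx, sy
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] == 0 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


W, B = 255, 0
binary = [
    [B, B, W, W, W],
    [B, W, W, W, B],
    [W, W, W, B, B],
]
blocks = find_blocks(binary)
# blocks == [(0, 0, 1, 1), (3, 1, 4, 2)]
```

Because the seeds are visited row by row, the boxes come out ordered by the first black pixel encountered in each region.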
Figure 2.2 An example of using the region growing technique to get basic blocks. (a) Several Chinese characters. (b) Each character becomes a block. (c) Several English characters. (d) Each character becomes a block.
C. Merging blocks:
The objective of merging blocks is to merge overlapping or neighboring smaller blocks into a larger one. If an image is enlarged, gaps within characters are enlarged as well. A block may then be divided into several parts, and the total number of blocks after region growing will differ from the original one. We use a block merging technique to solve this problem. Two cases need to be treated here.
Case1: Several blocks are overlapping:
If a block b1 and another block b2 are overlapping, then we merge the two blocks to establish a new one.
Case2: Several blocks are neighboring:
If the distance between the center of a block b1 and the center of another block b2 is smaller than a threshold Ti, then we merge the two blocks into a new one. Figure 2.3 shows an example of this case. Figure 2.3(a) shows an image and the blocks acquired after region growing, and Figure 2.3(b) shows the enlarged image and the blocks acquired from it. As we can see, if an image is scaled, the total number of blocks obtained will differ from that of the original image.
In Figure 2.3(c), c1, c2, c3 and c4 are the centers of blocks 1, 2, 3 and 4, respectively; d2, d3 and d4 are the distances from the center c1 of the first block to c2, c3 and c4, respectively. If d2, d3 or d4 is smaller than Ti, then we merge the corresponding block with the first one. After merging, the total number of blocks is identical to the original one. Figure 2.3(d) shows the result after merging blocks. Figure 2.3(e) and (f) show another example of merging blocks.
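The two merging cases can be sketched as follows. The sketch assumes blocks are inclusive bounding boxes `(left, top, right, bottom)` and that merging replaces two boxes by their common bounding box; the function names and the repeat-until-stable loop are illustrative assumptions rather than the procedure given in the text.

```python
import math


def overlaps(a, b):
    """True if two inclusive boxes (left, top, right, bottom) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def center(box):
    """Center point of a box as (x, y)."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def merge_blocks(boxes, distance_threshold):
    """Merge boxes that overlap (Case 1) or whose centers are closer than
    `distance_threshold` (Case 2), repeating until nothing changes."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                near = math.dist(center(a), center(b)) < distance_threshold
                if overlaps(a, b) or near:
                    # Replace the pair by the box that encloses both.
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes


parts = [(0, 0, 4, 4), (6, 0, 9, 4), (40, 0, 44, 4)]
result = merge_blocks(parts, distance_threshold=8)
# result == [(0, 0, 9, 4), (40, 0, 44, 4)]
```

In this example the first two boxes have centers 5.5 pixels apart and are merged, while the third box stays separate.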
Figure 2.3 An example of block merging. (a) A Chinese document image and the total number of blocks is 2. (b) An enlarged image of (a) and the total number of blocks is 4. (c) Block distances of (b). (d) The result image of (b) after block merging. (e) An English document image and the total number of blocks is 12. (f) The result image of (e) after block merging.
D. Assignment of a number index to each block:
Each block is assigned a number index, which serves as a parameter of the line equation to be embedded, in order to increase the security of authentication. In our method, after collecting all image blocks, we give a number index to each block, proceeding from the top-left to the bottom-right of the image.
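A minimal sketch of the index assignment is given below. It assumes blocks are bounding boxes `(left, top, right, bottom)` and approximates the top-left to bottom-right order by sorting on the top edge and then on the left edge; numbering from 1 is also an assumption. Text lines whose blocks have unequal top edges would need an extra row-grouping step that is not shown here.

```python
def assign_indices(boxes):
    """Number blocks from top-left to bottom-right.

    Blocks are sorted by their top edge, then by their left edge, and
    numbered from 1. Returns a dict mapping index -> box.
    """
    ordered = sorted(boxes, key=lambda box: (box[1], box[0]))
    return {index: box for index, box in enumerate(ordered, start=1)}


boxes = [(30, 0, 40, 10), (0, 20, 10, 30), (0, 0, 10, 10)]
indexed = assign_indices(boxes)
# indexed == {1: (0, 0, 10, 10), 2: (30, 0, 40, 10), 3: (0, 20, 10, 30)}
```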
E. Increasing the gray values of all black pixels:
Embedding a line in a block sets to 0 the gray values of the black pixels through which the line passes. To distinguish these embedded pixels from the original black pixels, we first increase the gray values of all black pixels to a threshold Ts.
2.3.2 Creation and Embedding of Semi-Fragile Authentication Signals by Line Embedding
The objective of the pre-processing stage is to reduce the distortions created. In this section, we describe how to create authentication signals and embed them into each block. The technique described below is the core of the method. The main idea is to embed a value, namely the slope of a line, into each block as a semi-fragile watermark.
To increase robustness, we choose the best position in each block at which to embed the authentication signal. Embedding data in the black pixels of each block appears to be the better choice. After choosing the best position, we embed the slope of a line into each block by modifying the gray values of the black pixels through which the embedded line passes.
The details of semi-fragile watermark embedding are described below.
A. Acquiring the equation of the embedded line:
In order to enhance the security of authentication, we use a key, the RHG value and the block number index as parameters to build up the equation of the embedded line. An equation of a line is as follows:
y = mx + b (2.1)
where x and y denote the x- and y-coordinates of a point on the line, m is the slope of the line, and b is a constant that specifies the shift of the line along the y-axis.
In our method, the slope m carries the authentication signal. We create the value of m from three elements: a key, the RHG value, and the block number index. After m is determined, b is used to shift the embedded line so that it is less noticeable to human eyes.
(1) A key:
A key held by the sender and the receiver is used to enhance the security of authentication, as mentioned previously. Even if the algorithm is known to an attacker, he or she cannot produce authentication signals that pass the authentication process without the correct key.
(2) The RHG value:
The RHG value aims to assign a gray value G by the following reduced halftone gray function:
G = ((T - B) × level) / T (2.2)
where level is the number of parts into which the total number of pixels is divided, T is the total number of pixels, and B is the number of black pixels. Equation (2.2) was proposed by Huang and Tsai [10]. Because the RHG is based on blocks of a fixed size and cannot be applied to our method directly, we revise it to allow the use of arbitrary-sized blocks.
(3) The block number index
We assign each block a number index to represent it. The numbers are assigned in a raster scan order. The block number index is also a key parameter for building the equation of a line, so that the method can resist malicious attacks that alter the positions of blocks.
After computing the three main elements, we compute the slope of the embedded line by the following equation:
m = f(key, RHG value, block number index) (2.3)
Because the size of a block is finite and the capacity for embedding the slope m into each block is restricted, the slope m of the embedded line cannot be too large. As a result, we need to limit the range of m. In our method, the function f is defined as follows:
m = (key + RHG value + block number index) % range (2.4)
where range is used to control the range of m. By modifying the slope m of the embedded line, we can embed an authentication signal into each block.
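The computation of the slope can be sketched as below, reading Equation (2.2) as G = (T - B) × level / T. The sketch assumes that the key is an integer and that G is truncated to an integer before it enters Equation (2.4); neither detail is stated in the text, and the function and parameter names are hypothetical.

```python
def rhg_value(total_pixels, black_pixels, level):
    """Equation (2.2): G = (T - B) * level / T.

    The result is truncated to an integer (an assumption) so that it can
    be used in the modulo operation of Equation (2.4).
    """
    return (total_pixels - black_pixels) * level // total_pixels


def slope(key, rhg, block_index, slope_range):
    """Equation (2.4): m = (key + RHG value + block number index) % range."""
    return (key + rhg + block_index) % slope_range


g = rhg_value(total_pixels=100, black_pixels=37, level=16)       # 10
m = slope(key=12345, rhg=g, block_index=3, slope_range=5)        # 3
```

With these example values, (100 - 37) × 16 = 1008, which truncates to G = 10, and (12345 + 10 + 3) % 5 gives m = 3. Changing only the block index to 4 gives m = 4, so swapping two blocks changes the slopes that authentication expects.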
B. Finding the best position to embed a line
After calculating the slope of the embedded line, we need to find the best position at which to embed it. We adjust the constant b of the line equation (2.1) to seek the position that arouses the least awareness; modifying b shifts the position of the embedded line.
In our method, the black pixels with gray value Ts in each block carry the authentication signal: their gray values are modified to 0. A black pixel is selected for embedding if the embedded line passes through it. Because the authentication signal should have a certain degree of robustness, we embed it in as many pixels as possible. We therefore choose the position at which the embedded line passes through the largest number of lined-up black pixels. This position is selected by modifying b and is regarded as the best position, with the best robustness.
Figure 2.4 shows an example of selecting the best position to embed a line. In this example we set m = 1 in Equation (2.1). Figure 2.4(a) shows a block after applying a region growing technique, and (b) is an example of shifting b to seek the best position to embed the line. After trying all possible values of b, we can find the best position to embed the line, as shown in Figure 2.4(c). Figure 2.4(d) shows that after selecting the value of b, we modify the gray values of the black pixels through which the embedded line passes in the block.
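The search for b can be sketched as follows. The sketch assumes an integer slope m and integer pixel coordinates, so each black pixel lies on exactly one line y = mx + b, and it counts every black pixel on the line rather than only the longest consecutive run; the tie-breaking rule and the name `best_offset` are also assumptions.

```python
def best_offset(black_pixels, m):
    """Return (b, count): the intercept b for which the line y = m*x + b
    passes through the most black pixels, and that pixel count.

    `black_pixels` is a non-empty iterable of integer (x, y) coordinates
    and `m` is an integer slope.
    """
    counts = {}
    for x, y in black_pixels:
        b = y - m * x              # the only intercept whose line hits (x, y)
        counts[b] = counts.get(b, 0) + 1
    # Ties are broken toward the smallest b; this rule is an assumption.
    b = min(counts, key=lambda k: (-counts[k], k))
    return b, counts[b]


pixels = [(0, 0), (1, 1), (2, 2), (0, 2), (1, 3), (3, 0)]
b, count = best_offset(pixels, m=1)
# b == 0 and count == 3
```

Grouping pixels by their intercept avoids testing every candidate b against every pixel, but it gives the same answer as shifting the line across the block.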
Figure 2.4 An example of finding the best position to embed a line. (a) A character. (b) and (c) Shifting b to seek the best position to embed the authentication signal. (d) Modifying the gray values of the black pixels in the character.
C. Embedding a line into each block
To increase robustness, we select the best position at which to embed a semi-fragile watermark in each block. Because the gray values of all black pixels in the document image are raised to Ts during the pre-processing stage, the pixels that carry the authentication signal can be distinguished from the original black pixels once their gray values are modified. By decreasing to 0 the gray values of the black pixels through which the line passes, we embed a line into each block. During the authentication process, we therefore only need to extract the pixels with the least gray values in each block to recover the equation of the embedded line. Figure 2.5 shows an example of this step.
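The embedding step itself can be sketched as below. The sketch assumes a block stored as a list of rows whose black pixels already hold the value Ts, an integer slope, and image coordinates with y increasing downward; the value 40 used for Ts and the name `embed_line` are hypothetical.

```python
def embed_line(block, m, b, ts):
    """Return a copy of `block` with the line y = m*x + b embedded.

    `block` is a list of rows of gray values in which every black pixel
    has already been raised to `ts`. Pixels on the line that hold `ts`
    are set to 0; all other pixels are left unchanged.
    """
    stego = [row[:] for row in block]
    for y, row in enumerate(stego):
        for x, value in enumerate(row):
            if value == ts and y == m * x + b:
                row[x] = 0
    return stego


TS, WHITE = 40, 255   # hypothetical values for Ts and the background
block = [
    [TS, WHITE, TS],
    [TS, TS, WHITE],
    [WHITE, TS, TS],
]
stego = embed_line(block, m=1, b=0, ts=TS)
# stego == [[0, 255, 40], [40, 0, 255], [255, 40, 0]]
```

Only pixels that were black are changed, so the white background is left untouched.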
Figure 2.5 An example of embedding a line into a block. (a) A block. (b) The result after embedding a line in (a).
2.3.3 Detailed Algorithm
The inputs to the proposed method for embedding authentication signals include a grayscale document image I and a key K. The output is a stego-image S. The algorithm for the process can be briefly expressed as follows. Figure 2.6 shows a flowchart of the process.
Algorithm 1: Authentication signal embedding process.
Input: A given grayscale document image I and a key K used in the authentication signal embedding process.
Output: A stego image S.
Steps:
1 Pre-Processing of a document image I:
1.1 Apply bi-level thresholding to I using a threshold Ti.
1.2 Divide I into character blocks by a connected component merging technique.
1.3 For each block Di, merge overlapping or neighboring smaller blocks into a larger one.
1.4 Assign each block Di a number index Si in a raster scan order from top left to bottom right.
1.5 Increase the gray values of all the black pixels in I to be Ts.
2 For each block Di, build up the equation of a line to be embedded.
2.1 Assign the RHG value according to (2.2).
2.2 Use K, Si and the RHG value to calculate a slope m of a line according to (2.3).
2.3 Find the best position to embed the line by shifting a constant of b.
2.4 Embed the line into Di by modifying the gray values of the black pixels through which the line passes.
3 Take the final result as the desired stego-image S.
2.4 Image Authentication Process
In the embedding process, the embedded authentication signal is the line created from the key, the RHG value, and the number index. Therefore, we can judge whether a suspect image has been tampered with by comparing the generated slope m with the extracted slope m’.
2.4.1 Extraction of Authentication Signals Using A Line Fitting Technique
The proposed method for image authentication is essentially similar to the embedding one but in a reverse order. A suspect image is first divided into non-overlapping blocks by a connected component merging technique, and neighboring or overlapping smaller blocks are then merged into larger ones, as in the pre-processing of the embedding process. For each block, we collect the pixels with the least gray values and apply a line fitting technique to extract the authentication signal.
Figure 2.6 Flowchart of proposed method for authentication signal embedding in grayscale document images.
A. Applying a line fitting technique to extract the embedded line.
During the embedding process, setting to 0 the gray values of the black pixels through which the line passes is the core technique for embedding a semi-fragile watermark into each block. So, during the image authentication procedure, all we need to do is collect the pixels with the least gray values in each block and extract the embedded line by a line fitting technique (Equation (2.5)), which yields the slope m’ of the fitted line. The constant b of the fitted line is used only for decreasing the awareness by human eyes and is irrelevant to the embedded authentication signals, so there is no need to calculate this constant.
To get m according to (2.4), we still have to collect three values: the key, the RHG value, and the number index. By comparing m with m’, we can judge for each block whether a suspect image has been tampered with.
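A sketch of the slope extraction is given below. Since the line fitting formula is not reproduced here, the sketch assumes ordinary least squares, which may differ from the formula used in the proposed method. The `tolerance` parameter, which admits pixels slightly brighter than the block minimum, and the name `extract_slope` are also assumptions.

```python
def extract_slope(block, tolerance=0):
    """Fit y = m'*x + b' to the darkest pixels of `block` and return m'.

    Pixels whose gray value is within `tolerance` of the block minimum
    are treated as belonging to the embedded line. Returns None when the
    selected pixels do not determine a slope (all share one x value).
    """
    least = min(min(row) for row in block)
    points = [(x, y) for y, row in enumerate(block)
              for x, value in enumerate(row) if value <= least + tolerance]
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        return None
    # Least-squares slope; the intercept is not needed for authentication.
    return (n * sum_xy - sum_x * sum_y) / denominator


stego = [
    [0, 255, 40],
    [40, 0, 255],
    [255, 40, 0],
]
m_fit = extract_slope(stego)
# m_fit == 1.0
```

After printing and scanning, the embedded pixels are unlikely to share exactly one gray value, so a nonzero `tolerance` would probably be needed in practice.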
Figure 2.7 is a flowchart of the proposed method for image authentication.
Algorithm 2: Image authentication process.
Input: A given stego-image S and the key K identical to that used in the embedding process.
Output: An authentication image A.
Steps:
1 Pre-Processing of a document image S
1.1 Divide S into character blocks by a connected component merging technique.
1.2 For each block Di, merge overlapping or neighboring smaller blocks into a larger one.
1.3 Assign each block Di a number index Si.
2 For each block Di, perform the following operations.
2.1 Collect the pixels with the least gray values, and apply line fitting to these pixels to get m’ according to (2.5).
2.2 Assign the RHG value according to (2.2).
2.3 Use K, Si, and RHG to calculate the slope m of the line according to (2.4).
2.4 If m≠m’, then regard Di as being tampered with and mark the block at the same location in A with red color.
3 Take the final result as the desired authentication image A.
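The comparison in step 2.4 can be sketched as follows. Algorithm 2 states the test as m≠m’; because a fitted slope is rarely an exact integer after printing and scanning, the sketch compares with a tolerance, which is an assumption, as are the function name and the dictionary layout.

```python
def find_tampered(expected, extracted, tolerance=0.5):
    """Return the sorted indices of blocks whose slopes do not match.

    `expected` maps block index -> m computed from the key, the RHG value
    and the index. `extracted` maps block index -> m' obtained by line
    fitting, or None when no line could be fitted in that block.
    """
    tampered = []
    for index, m in expected.items():
        m_fit = extracted.get(index)
        if m_fit is None or abs(m - m_fit) > tolerance:
            tampered.append(index)
    return sorted(tampered)


expected = {1: 1, 2: 3, 3: 0}
extracted = {1: 1.1, 2: 0.2, 3: None}
flagged = find_tampered(expected, extracted)
# flagged == [2, 3]
```

The blocks listed in the result are the ones that would be marked in the authentication image A.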
Figure 2.7 Flowchart of proposed method for image authentication.
2.5 Experimental Results
Some experimental results of applying the proposed method are shown here.
Figure 2.8(a) and (b) are two grayscale Chinese and English document images, respectively, both with size of 400 × 500. And the stego-images resulting from embedding authentication signals are shown in Figure 2.8(c) and (d), respectively.
Figure 2.8(e) and (f) are two stego-images suffering from print-and-scan operations,