A Secure Data Hiding Scheme for Binary Images

Yu-Chee Tseng, Member, IEEE, Yu-Yuan Chen, and Hsiang-Kuang Pan

Abstract—This letter presents a novel steganography scheme capable of concealing a piece of critical information in a host message which is a binary image (e.g., a facsimile). A binary matrix and a weight matrix are used as secret keys to protect the hidden information. Given a host image of size $m \times n$, the proposed scheme can conceal as many as $\lfloor \log_2(mn + 1) \rfloor$ bits of data in the image by changing, at most, two bits in the host image. This scheme can provide higher security, embed more information, and maintain a higher quality of the host image than available schemes.

Index Terms—Cryptography, data hiding, prisoners’ problem, security, steganography.

I. INTRODUCTION

The increasing popularity of digital media has ushered in concern over security-related issues. The confidentiality of a document is typically achieved by encryption. However, as an encrypted message normally reveals the importance of its content, the ciphertext also attracts the interest of cryptanalysts.

Steganography differs from encryption, in that it embeds critical information in a noncritical host message (e.g., webpages and advertisements) to distract opponents’ attention [8], [13]. Therefore, steganography is also known as data/information hiding.

The study of steganography can be traced to [10], in which the Prisoners’ Problem was proposed. In this scenario, Alice and Bob are in jail and are planning an escape. However, all their communications must go through the warden, Willie. If Willie detects any encrypted message, he will frustrate their plan by putting them into solitary confinement. Therefore, Alice and Bob must find a way to conceal their secret in an innocuous-looking covertext. The history and bandwidth concerns of the subliminal channel are discussed in [11], [12].

Data hiding is typically achieved by altering some nonessential information in the host message. For example, given a color image, the least-significant bit (LSB) of each pixel can be changed to embed the hidden secret [14]. A hiding scheme based on the conventional keystream generator is proposed in [5]. Information hiding for security documents (e.g., currency) is discussed in [6]. References [1] and [2] consider how to apply public-key cryptography to steganography. Reviews of steganography are in [2], [3], and [7].

Paper approved by K. Rose, the Editor for Source Channel Coding of the IEEE Communications Society. Manuscript received May 9, 2000; revised May 10, 2001, November 30, 2001. The work of Y.-C. Tseng was supported by Lee and MTI Center for Networking Research at NCTU, the Ministry of Education under Contract 90-H-FA07-1-4, and the National Science Council, Taiwan, R.O.C., under Contract NSC89-2218-E-009-095. This paper was presented in part at the IEEE Symposium on Computers and Communications, France, 2000. Y.-C. Tseng is with the Department of Computer Science and Information Engineering, National Chiao Tung University, Hsin-Chu 30050, Taiwan, R.O.C. (e-mail: yctseng@csie.nctu.edu.tw).

Y.-Y. Chen and H.-K. Pan are with the Department of Computer Science and Information Engineering, National Central University, Chung-Li 32054, Taiwan, R.O.C. (e-mail: yychen@itri.org.tw; jerry@formosoft.com).

Publisher Item Identifier 10.1109/TCOMM.2002.801488.

Hiding data in a binary image is a more challenging task, since changing any pixel can be easily detected. References [16] and [17] address this subject. The quality of the image, once data are concealed in it, is further considered in [15]. Watermarking on binary images is discussed in [9]. To improve the image hiding quality and hiding capacity, this letter presents a novel scheme capable of hiding a large amount of data by changing a small number of bits in the original binary image. Specifically, given an $m \times n$ image block, the proposed scheme can conceal as many as $\lfloor \log_2(mn + 1) \rfloor$ bits of data in the block by changing, at most, two bits in the block. This approach is much more efficient than available schemes [15]–[17], which can hide, at most, one bit in each block by changing, at most, one bit in the block.

The advantages of the proposed scheme can be viewed from three aspects. First, assume that the maximum number of pixels that can be modified in the binary image is fixed, and consider an image block of size $m \times n$. The proposed scheme can hide $\lfloor \log_2(mn + 1) \rfloor$ bits by changing, at most, two bits in the image. In contrast, available schemes can partition the image into two blocks, each of size $(m \times n)/2$, and then modify, at most, two bits in the image to conceal two bits of data. Thus, the proposed scheme offers a data-hiding ratio that is $\lfloor \log_2(mn + 1) \rfloor / 2$ times that of available schemes. Second, if the amount of embedded data is equalized instead, the image quality after modification will be significantly improved, since fewer pixels are altered. Third, for the above reasons, the proposed scheme is more secure than available ones because the existence of hidden data is less detectable.

The rest of this letter is organized as follows. Section II presents our data-hiding scheme. Section III discusses our experimental results. Finally, conclusions are drawn in Section IV.

II. PROPOSED DATA-HIDING SCHEME FOR BINARY IMAGES

The proposed scheme uses a binary matrix and an integer weight matrix as secret keys. The XOR operator is adopted so that the keys cannot be compromised easily. Another important function of the weight matrix is to increase the data-hiding capacity. The inputs to our scheme are as follows.

1) $F$ is a host binary image (i.e., bitmap), which is to be modified to embed data. Here, $F$ is partitioned into blocks of size $m \times n$. For simplicity, we assume that the size of $F$ is a multiple of $m \times n$.

2) $K$ is a secret key shared by the sender and the receiver. It is a randomly selected bitmap of size $m \times n$.

3) $W$ is a secret weight matrix shared by the sender and the receiver. It is an integer matrix of size $m \times n$ whose content satisfies some requirements (to be stated later).

4) $r$ is the number of bits to be embedded in each $m \times n$ block of $F$. The value of $r$ satisfies $2^r - 1 \le mn$.

5) $B$ is critical information consisting of $kr$ bits to be embedded in $F$, where $k$ is the number of $m \times n$ blocks in $F$.
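The inputs listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is held as a list of rows of 0/1 integers, blocks are taken in row-major order (the letter does not fix a scanning order), and the helper names `partition_blocks` and `max_bits_per_block` are hypothetical.

```python
def partition_blocks(image, m, n):
    """Split a binary image (list of rows of 0/1) into m x n blocks.

    Blocks are returned in row-major order, which is an assumption of
    this sketch; sender and receiver only need to agree on one order.
    """
    rows, cols = len(image), len(image[0])
    if rows % m or cols % n:
        raise ValueError("image size must be a multiple of the block size")
    blocks = []
    for top in range(0, rows, m):
        for left in range(0, cols, n):
            blocks.append([row[left:left + n] for row in image[top:top + m]])
    return blocks


def max_bits_per_block(m, n):
    """Largest r with 2**r - 1 <= m*n, i.e. floor(log2(m*n + 1))."""
    return (m * n + 1).bit_length() - 1


if __name__ == "__main__":
    image = [[(i + j) % 2 for j in range(8)] for i in range(8)]
    blocks = partition_blocks(image, 4, 4)
    print(len(blocks), "blocks, up to", max_bits_per_block(4, 4), "bits each")
```

Using integer bit lengths instead of floating-point logarithms avoids rounding problems when $mn + 1$ is an exact power of two.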

A. Weight Management

The proposed scheme uses the weight matrix to represent the embedded data. This section presents an illustrative example to demonstrate how to manage weights. Section II-B presents the complete scheme.

Assume that the size of $K$ and $W$ is $3 \times 3$. Below, we consider a $3 \times 3$ image block $F_i$, which is a part of the host image $F$. The purpose is to show how to embed $r = 2$ bits of data in $F_i$. Let us assume the following inputs:

First, a bitwise exclusive-OR on $F_i$ and $K$ is performed.

Next, let $\otimes$ be the componentwise multiplication operator on two equal-size integer matrices. The following computation is conducted:

Summing all elements in the rightmost matrix yields .

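Next, two data bits, denoted as $b_1 b_2$, are to be embedded into $F_i$. Assume that $F_i$ is transformed into $F_i'$. Regarding $b_1 b_2$ as a binary number, the proposed scheme will ensure the validity of the following invariant:

$$\mathrm{SUM}((F_i' \oplus K) \otimes W) \equiv b_1 b_2 \pmod{4}$$

With this invariant, the receiver can derive $b_1 b_2$ by computing $\mathrm{SUM}((F_i' \oplus K) \otimes W) \bmod 4$.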

Next, modifying to ensure is demonstrated. The goal is to change as few bits in as possible. Since

, if, fortunately, , then does not need to be changed. Otherwise, some bit(s) must be modified. Observe that if we complement bit , then will be complemented. If is swapped from 0 to 1, then the modular sum will be increased by ; otherwise, the sum will be decreased by . For instance, if we swap , the sum will be decreased by , and if we swap , the sum will be increased by . In this example, it is not hard to verify that we only need to complement one bit in to increase or decrease the sum by 1, 2, or 3. How to ensure the success of the swapping process will be discussed later.
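The weight-management idea can be tried out with a small Python sketch. The $3 \times 3$ matrices below are hypothetical values chosen for illustration and are not the ones used in the letter's example; the function names are likewise assumptions. The sketch computes the weighted sum of the XORed block, reads the hidden value as the sum modulo $2^r$, and shows how complementing a single pixel shifts the sum by the corresponding weight.

```python
def weighted_sum(block, key, weight):
    """Sum of the componentwise product of (block XOR key) and weight."""
    return sum(
        (block[j][k] ^ key[j][k]) * weight[j][k]
        for j in range(len(block))
        for k in range(len(block[0]))
    )


def extract_bits(block, key, weight, r):
    """Receiver side: the hidden r-bit value is the weighted sum mod 2**r."""
    return weighted_sum(block, key, weight) % (2 ** r)


def flip(block, j, k):
    """Return a copy of block with pixel (j, k) complemented."""
    out = [row[:] for row in block]
    out[j][k] ^= 1
    return out


# Hypothetical inputs (not the matrices from the letter).
F1 = [[1, 1, 0], [0, 1, 0], [1, 0, 1]]
K = [[0, 1, 0], [1, 1, 0], [0, 0, 1]]
W = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]

if __name__ == "__main__":
    print(weighted_sum(F1, K, W))                  # 6
    print(extract_bits(F1, K, W, 2))               # 2, i.e. bits "10"
    # Pixel (0, 1): XORed bit goes 0 -> 1, so the sum rises by W[0][1] = 2.
    print(weighted_sum(flip(F1, 0, 1), K, W))      # 8
    # Pixel (0, 0): XORed bit goes 1 -> 0, so the sum drops by W[0][0] = 1.
    print(weighted_sum(flip(F1, 0, 0), K, W))      # 5
```

Indices in the sketch are zero-based, so pixel `(0, 1)` corresponds to row 1, column 2 in the letter's notation.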

B. Hiding Steps

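Definition 1: An $m \times n$ matrix $W$ can serve as a weight matrix if each element of $\{1, 2, \ldots, 2^r - 1\}$ appears at least once in $W$, i.e.,

$$\{[W]_{j,k} \mid j = 1, \ldots, m,\ k = 1, \ldots, n\} = \{1, 2, \ldots, 2^r - 1\}.$$

The rationale behind this definition will become clear later. Note that it is trivial to find a legal $W$ because we have already imposed the condition that $2^r - 1 \le mn$. In fact, many choices are available for choosing $W$. Specifically, we can first pick $2^r - 1$ elements in matrix $W$ and randomly assign $1, 2, \ldots, 2^r - 1$ to them. The remaining $mn - (2^r - 1)$ elements in $W$ can then be assigned arbitrary values. Based on such an assignment, the number of choices for $W$ is

$$\binom{mn}{2^r - 1} \, (2^r - 1)! \, (2^r - 1)^{mn - (2^r - 1)}.$$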

For instance, if and , there are

possible s. This number should be sufficiently large to prevent a brute-force attack.
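The construction of a legal weight matrix described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the "arbitrary" remaining entries are assumed to be drawn from $\{1, \ldots, 2^r - 1\}$, and `count_assignments` simply evaluates the number of outcomes of the three-stage procedure (pick positions, permute the required values, fill the rest).

```python
import math
import random


def random_weight_matrix(m, n, r, rng=random):
    """Build an m x n matrix in which every value 1..2**r - 1 occurs."""
    required = list(range(1, 2 ** r))
    if len(required) > m * n:
        raise ValueError("need 2**r - 1 <= m*n")
    # Assumption: the remaining cells are also drawn from 1..2**r - 1.
    filler = [rng.randint(1, 2 ** r - 1) for _ in range(m * n - len(required))]
    cells = required + filler
    rng.shuffle(cells)  # random placement of the required values
    return [cells[i * n:(i + 1) * n] for i in range(m)]


def is_legal_weight_matrix(weight, r):
    """Definition 1: each of 1..2**r - 1 appears at least once."""
    present = {w for row in weight for w in row}
    return set(range(1, 2 ** r)) <= present


def count_assignments(m, n, r):
    """Outcomes of: choose cells, permute 1..q over them, fill the rest."""
    q = 2 ** r - 1
    return math.comb(m * n, q) * math.factorial(q) * q ** (m * n - q)


if __name__ == "__main__":
    W = random_weight_matrix(4, 4, 3, random.Random(0))
    print(is_legal_weight_matrix(W, 3))
    print(count_assignments(4, 4, 3))
```

For real use the matrix would have to come from a cryptographically secure source shared by sender and receiver; the `random` module is used here only to keep the sketch short.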

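Let $W$ be a legal weight matrix and $F_i$ be an $m \times n$ image block, which is a part of $F$. Below, we show how to embed $r$ bits of data, say $b_1 b_2 \cdots b_r$, into $F_i$ by changing, at most, 2 bits in $F_i$. The goal is to modify $F_i$ into $F_i'$ to ensure the following invariant:

$$\mathrm{SUM}((F_i' \oplus K) \otimes W) \equiv b_1 b_2 \cdots b_r \pmod{2^r}$$

Below, the embedding scheme is derived in four steps.

1) Compute $F_i \oplus K$.

2) Compute $\mathrm{SUM}((F_i \oplus K) \otimes W)$.

3) From the matrix $F_i \oplus K$, compute for each $w = 1, \ldots, 2^r - 1$ the following set:

$$S_w = \{(j,k) \mid ([W]_{j,k} = w \wedge [F_i \oplus K]_{j,k} = 0) \vee ([W]_{j,k} = 2^r - w \wedge [F_i \oplus K]_{j,k} = 1)\}$$

Intuitively, $S_w$ represents the set containing each matrix index $(j,k)$ such that complementing $[F_i]_{j,k}$ would increase the sum in step 2 by $w$. There are two ways to achieve this: if $[W]_{j,k} = w$ and $[F_i \oplus K]_{j,k} = 0$, then complementing $[F_i]_{j,k}$ will increase the weight by $w$; and if $[W]_{j,k} = 2^r - w$ and $[F_i \oplus K]_{j,k} = 1$, then complementing $[F_i]_{j,k}$ will decrease the weight by $2^r - w$, or, equivalently, increase the sum by $w$ (under mod $2^r$).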

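The following lemmas indicate some important properties of these sets. (Detailed proofs can be found in [4].)

Lemma 1: For each $w = 1, \ldots, 2^r - 1$ such that $w \ne 2^{r-1}$, the following statement is true:

$$(S_w = \emptyset) \Rightarrow (S_{2^r - w} \ne \emptyset).$$

Lemma 2: The set $S_{2^{r-1}} \ne \emptyset$.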

Fig. 1. Example of host image $F$, secret key $K$, and weight matrix $W$.

Fig. 2. (a) $F \oplus K$. (b) Modified host image.

4) Define a weight difference as

$$d \equiv (b_1 b_2 \cdots b_r) - \mathrm{SUM}((F_i \oplus K) \otimes W) \pmod{2^r}$$

The sum in step 2 must be increased by $d$ to satisfy the invariant. If $d = 0$, $F_i$ does not need to be changed. Otherwise, the following steps are executed to transform $F_i$ into $F_i'$. For ease of presentation, let us define $S_w = S_{w'}$ for any $w \equiv w' \pmod{2^r}$.

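a) Randomly select an $h \in \{0, 1, \ldots, 2^r - 1\}$, such that $S_{hd} \ne \emptyset$ and $S_{-(h-1)d} \ne \emptyset$.

b) Randomly select a $(j,k) \in S_{hd}$ and complement the bit $[F_i]_{j,k}$.

c) Randomly select a $(j,k) \in S_{-(h-1)d}$ and complement the bit $[F_i]_{j,k}$.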

Intuitively, to increase the sum by $d$, two nonempty sets $S_{hd}$ and $S_{-(h-1)d}$ can be selected. This is possible since these sets indicate the locations where $F_i$ can be complemented to increase the weights by $hd$ and $-(h-1)d$, respectively. Consequently, a total increase of the weight by $d$ is obtained.

However, the above steps contain a logical gap, which was intentionally left unmentioned for ease of presentation. In fact, the set $S_0$ (and similarly $S_{2^r}$, etc.) is not yet defined. Similar to the other $S_w$s, $S_0$ can be regarded as the set of indices such that complementing these locations in $F_i$ will result in an increase of the weight by 0. Since this can be achieved by changing nothing in $F_i$, $S_0$ can be regarded as a nonempty set. Whenever the statement “complement the bit $[F_i]_{j,k}$” is encountered for $S_0$, this step is simply omitted. This amendment makes step 4 logically correct.

Finally, whether step 4 is successful depends on the success of step a) in identifying a qualified $h$. This is proved below.

Lemma 3: Step 4 always succeeds, and, at most, two bits of $F_i$ are modified to embed $r$ bits of data.
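The four embedding steps can be put together in a short Python sketch. It is a minimal illustration rather than the authors' code: it assumes that every entry of the weight matrix lies in $\{1, \ldots, 2^r - 1\}$, represents the data bits as an integer in $[0, 2^r)$, uses zero-based indices, and treats $S_0$ as "change nothing". The names `embed_block` and `extract_block` are hypothetical.

```python
import random


def extract_block(block, key, weight, r):
    """Receiver: weighted sum of (block XOR key), reduced mod 2**r."""
    total = sum(
        (block[j][k] ^ key[j][k]) * weight[j][k]
        for j in range(len(block))
        for k in range(len(block[0]))
    )
    return total % (2 ** r)


def embed_block(block, key, weight, bits, r, rng=random):
    """Return a copy of block that carries `bits`, flipping at most 2 pixels.

    Assumes a legal weight matrix with entries in 1..2**r - 1.
    """
    mod = 2 ** r
    m, n = len(block), len(block[0])
    # Step 1: XOR the block with the key.
    x = [[block[j][k] ^ key[j][k] for k in range(n)] for j in range(m)]
    # Step 3: S[w] lists the pixels whose flip raises the sum by w (mod 2**r).
    S = {w: [] for w in range(1, mod)}
    for j in range(m):
        for k in range(n):
            w = weight[j][k]
            gain = w if x[j][k] == 0 else mod - w
            S[gain].append((j, k))
    # Steps 2 and 4: weight difference between target and current sum.
    d = (bits - extract_block(block, key, weight, r)) % mod
    out = [row[:] for row in block]
    if d == 0:
        return out
    # Step 4a: collect every h whose two sets are nonempty (S_0 counts).
    choices = []
    for h in range(mod):
        a, b = (h * d) % mod, (-(h - 1) * d) % mod
        if (a == 0 or S[a]) and (b == 0 or S[b]):
            choices.append((a, b))
    a, b = rng.choice(choices)
    # Steps 4b and 4c: flip one pixel from each set; skip S_0.
    for w in (a, b):
        if w != 0:
            j, k = rng.choice(S[w])
            out[j][k] ^= 1
    return out


if __name__ == "__main__":
    rng = random.Random(7)
    K = [[rng.randint(0, 1) for _ in range(4)] for _ in range(4)]
    W = [[1, 2, 3, 4], [5, 6, 7, 1], [2, 3, 4, 5], [6, 7, 1, 2]]
    F = [[rng.randint(0, 1) for _ in range(4)] for _ in range(4)]
    G = embed_block(F, K, W, 0b101, 3, rng)
    print(extract_block(G, K, W, 3) == 0b101)
```

The two increments $hd$ and $-(h-1)d$ can never coincide modulo $2^r$ when $d \neq 0$, because $2h - 1$ is odd; since each pixel belongs to exactly one set, the sketch never flips the same pixel twice.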

The following example demonstrates how our scheme works. Let the host image be $F$, the secret key be $K$, and the weight matrix be $W$, as shown in Fig. 1. First, $F$ is partitioned into four $4 \times 4$ blocks $F_1, \ldots, F_4$. Let $r = 3$, so we can embed 12 bits, say 001010000001, into $F$.

The XOR result of each $F_i$ with $K$ is in Fig. 2(a).

For . Since the embedded

data is 001, the weight must be increased by 1. Since and , we can complement . For

. Since the embedded data is 010, does not need to be modified. For

. Since the embedded data is 000, the weight must be increased by 6, which can be done by complementing .

For . Since the embedded data is

001, the weight must be increased by 5. There is no single point in which can accomplish this task. Therefore, two bits in must be changed. One possibility is and

. In this example, we choose to complement and . Fig. 2(b) displays the final modified image.

III. EXPERIMENTAL RESULTS

Herein, Wu and Lee’s (WL) scheme [16] and the proposed scheme are implemented to visualize the data-hiding effect.

Fig. 3. Embedding effect on Chinese characters. (a) Original host image. (b) After embedding 1686 bytes by the proposed scheme with block size $8 \times 8$. (c) After embedding 357 bytes by WL scheme with block size $8 \times 8$. (d) After embedding 297 bytes by the proposed scheme with block size $32 \times 32$. (e) After embedding 29 bytes by WL scheme with block size $16 \times 16$. (f) After embedding 357 bytes by the proposed scheme with block size $28 \times 28$. (g) After embedding 357 bytes by WL scheme with block size $8 \times 8$.

Fig. 4. Embedding effect on English characters. (a) Original host image. (b) After embedding 1650 bytes by the proposed scheme with block size $8 \times 8$. (c) After embedding 348 bytes by WL scheme with block size $8 \times 8$. (d) After embedding 340 bytes by the proposed scheme with block size $32 \times 32$. (e) After embedding 122 bytes by WL scheme with block size $16 \times 16$. (f) After embedding 344 bytes by the proposed scheme with block size $32 \times 32$. (g) After embedding 344 bytes by WL scheme with block size $8 \times 8$.

Two host images were tested, as shown in Figs. 3(a) and 4(a). Also, the image quality after data embedding is taken into account by making two slight enhancements. First, pixels around black-and-white margins are altered with a higher priority. Second, if a block is completely black or white, no data will be concealed in it because changing any bit in it is easily visible. To avoid confusion resulting from this enhancement, a block which is not completely black or white, but which would become completely black or white once data are hidden in it, is not used for concealing data. However, this block is still converted into a completely black or white block to be transmitted (based on our scheme, 2 bits, at most, of the block will be modified). Consequently, when the receiver receives a completely black or white block, this block is regarded as containing no hidden data. Our simulation experience indicates that the data-hiding ratio is only slightly affected by these enhancements, because many choices are typically available to modify a block.
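The rule for completely black or white blocks can be sketched as a small decision helper. This is one reading of the paragraph above, not the authors' code: `sender_decision` is a hypothetical name, and the sketch assumes the sender retries the same data bits in the next usable block whenever the current block carries no data.

```python
def is_uniform(block):
    """True if every pixel in the block has the same value."""
    first = block[0][0]
    return all(pixel == first for row in block for pixel in row)


def sender_decision(original, embedded):
    """Return (block_to_send, data_consumed) for one block.

    `embedded` is the block produced by the embedding step.
    """
    if is_uniform(original):
        # Completely black or white: sent unchanged, carries no data.
        return original, False
    if is_uniform(embedded):
        # Embedding made it uniform: still sent, but counted as empty,
        # so the receiver will skip it and the bits must be re-sent.
        return embedded, False
    return embedded, True


def receiver_skips(block):
    """The receiver treats any uniform block as containing no data."""
    return is_uniform(block)
```

Because both sides apply the same uniformity test to the transmitted block, they stay synchronized without any extra signaling.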

We conclude our comparisons and observations in the following.

1) Equal Block Size: We use the same block size and compare the images’ quality after data hiding. The results are in parts (b) and (c) of Figs. 3 and 4, where the block size is $8 \times 8$. Our results are noisier, since as many as 2 bits in each block are modified, compared to 1 bit in the WL scheme. In this case, image quality is traded for a higher data-hiding ratio. In general, our scheme can conceal about four to ten times more data than the WL scheme.

2) Equal Image Quality: This experiment attempts to equalize the image quality by adjusting the block size. The WL scheme will modify, on average, 0.5 bit in each block hidden with data. The same image quality can be maintained by using a block size that is four times larger than that used in the WL scheme. Thus, in the worst case, 2 bits are modified in each of our blocks, or equivalently 0.5 bit in each of the WL blocks. Based on this assumption, the experimental results are summarized in parts (d) and (e) of Figs. 3 and 4, where the block size is $16 \times 16$ for the WL scheme and $32 \times 32$ for our scheme. In this case, the WL scheme can embed, at most, 1 bit in each $16 \times 16$ block, and ours $\lfloor \log_2(32 \times 32 + 1) \rfloor = 10$ bits in each $32 \times 32$ block. Our data-hiding ratio is at least 2.5 times that of the WL scheme.

3) Equal Amount of Embedded Data: Here, the amount of embedded data is further equalized by adjusting the block sizes to compare the image quality after data hiding in the WL scheme and the proposed scheme. These results are summarized in parts (f) and (g) of Figs. 3 and 4. Notably, since the hiding ratio of the WL scheme depends on the nature of the host image, we have to adjust the block sizes in order to embed approximately the same amount of hidden data for a fair comparison. Specifically, the block sizes used in Figs. 3(f), 3(g), 4(f), and 4(g) are $28 \times 28$, $8 \times 8$, $32 \times 32$, and $8 \times 8$, respectively. In this case, our scheme delivers a better image quality than the WL scheme.
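The capacity figures behind the equal-image-quality comparison can be checked with a few lines of arithmetic. The sketch below assumes the upper bounds stated in the text (at most $\lfloor \log_2(mn + 1) \rfloor$ bits per block for the proposed scheme, at most 1 bit per block for the WL scheme); the helper names are hypothetical.

```python
def proposed_bits(m, n):
    """Upper bound for the proposed scheme: floor(log2(m*n + 1))."""
    return (m * n + 1).bit_length() - 1


def wl_bits(block_side, region_side):
    """Upper bound for WL over a square region: 1 bit per WL block."""
    return (region_side // block_side) ** 2


# Equal image quality: one 32 x 32 block versus four 16 x 16 WL blocks.
ours = proposed_bits(32, 32)
wl = wl_bits(16, 32)
print(ours, wl, ours / wl)  # 10 4 2.5

# Equal block size: one 8 x 8 block in either scheme.
print(proposed_bits(8, 8), wl_bits(8, 8))  # 6 1
```

These are per-block ceilings only; the byte counts reported in Figs. 3 and 4 also depend on how many blocks of each host image are usable.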

IV. CONCLUSION

This letter has presented a novel steganography scheme capable of concealing a large amount of data in a binary image. The proposed scheme has the following features: it uses a secret key and a weight matrix to protect the hidden data, it uses a weight matrix to increase the data-hiding ratio, and it uses an XOR operator to increase the security. One future research direction is to account for human visual effects during the data embedding process.

REFERENCES

[1] R. J. Anderson, “Stretching the limits of steganography,” in Information Hiding, Springer Lecture Notes in Computer Science, vol. 1174, 1996, pp. 39–48.

[2] R. J. Anderson and F. A. P. Petitcolas, “On the limits of steganography,” IEEE J. Select. Areas Commun., vol. 16, pp. 474–481, May 1998.

[3] W. Bender, D. Gruhl, N. Morimoto, and A. Lu, “Techniques for data hiding,” IBM Syst. J., vol. 35, no. 3–4, Feb. 1996.

[4] Y.-Y. Chen, H.-K. Pan, and Y.-C. Tseng. (2000) A secure data hiding scheme for binary images. CSIE Dept., Nat. Chiao-Tung Univ. [Online]. Available: http://www.csie.nctu.edu.tw/~yctseng

[5] E. Franz et al., “Computer-based steganography,” in Information Hiding, Springer Lecture Notes in Computer Science, vol. 1174, 1996, pp. 7–21.

[6] D. Gruhl and W. Bender, “Information hiding to foil the casual counterfeiter,” in Proc. Workshop Information Hiding, IH’98, Portland, OR, Apr. 1998.

[7] S. Katzenbeisser and F. A. P. Petitcolas, Information Hiding Techniques for Steganography and Digital Watermarking. Norwood, MA: Artech House, 2000.

[8] D. Kahn, The Codebreakers: The Story of Secret Writing. New York: Scribner, 1996.

[9] M. Pierrot-Deseilligny and H. Le-Men, “An algorithm for digital watermarking of binary images, application to map and text,” presented at the Int. Workshop Comput. Vision, Hong Kong, China, Sept. 1998.

[10] G. J. Simmons, “The prisoners’ problem and the subliminal channel,” in Proc. CRYPTO’83, 1983, pp. 51–67.

[11] G. J. Simmons, “Results concerning the bandwidth of subliminal channels,” IEEE J. Select. Areas Commun., vol. 16, pp. 463–473, May 1998.

[12] G. J. Simmons, “The history of subliminal channels,” IEEE J. Select. Areas Commun., vol. 16, pp. 452–462, May 1998.

[13] W. Stallings, Cryptography and Network Security. Englewood Cliffs, NJ: Prentice-Hall, 1999.

[14] R. G. van Schyndel, A. Z. Tirkel, and C. F. Osborne, “A digital watermark,” in Proc. IEEE Int. Conf. Image Processing, vol. 2, 1994, pp. 86–90.

[15] M. Wu, E. Tang, and B. Liu, “Data hiding in digital binary image,” presented at the IEEE Int. Conf. Multimedia and Expo, ICME’00, New York, 2000.

[16] M. Y. Wu and J. H. Lee, “A novel data embedding method for two-color facsimile images,” in Proc. Int. Symp. Multimedia Inform. Processing, Chung-Li, Taiwan, R.O.C, Dec. 1998.

[17] J. Zhao and E. Koch, “Embedding robust labels into images for copyright protection,” in Proc. Int. Conf. Intellectual Property Rights for Inform.,