Chapter 1 Introduction
1.2 Related studies
1.2.1 Visual Cryptography
Introduced by Naor and Shamir [1], visual cryptography (VC) is an approach in which secret images are decrypted by the human visual system. With VC, a secret image is revealed by stacking the transparencies generated in the encryption process. Since decoding relies only on inspecting the stacked transparencies with the naked eye, VC can be used in critical environments where no computing resources are available. The simple example in Fig. 1.1 illustrates VC. Fig. 1.1(a) shows the binary secret image. The encoding process proposed by Naor and Shamir [1] generates two transparencies that look like random noise, as in Fig. 1.1(b) and (c). Fig. 1.1(d) shows the result of stacking the two transparencies (b) and (c) together.
Fig. 1.1. An example of VC. (a): a secret image; (b-c): the two transparencies generated for (a) using the VC scheme of Naor and Shamir [1]; (d) the result of stacking (b) and (c).
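The encoding and stacking steps described above can be sketched in a few lines of Python. This is a minimal sketch of a 2-out-of-2 scheme in the spirit of Naor and Shamir [1], not their exact construction: the two diagonal 2×2 subpixel patterns, the function names encode and stack, and the tiny test image are assumptions made for illustration.

```python
import random

# Two complementary 2x2 subpixel patterns (1 = black, 0 = transparent).
# Assumption: diagonal patterns are used; other complementary pairs work too.
PATTERNS = [((1, 0), (0, 1)), ((0, 1), (1, 0))]


def encode(secret, rng=random):
    """Split a binary image (1 = black) into two transparencies.

    Every secret pixel becomes a 2x2 block on each transparency.
    """
    h, w = len(secret), len(secret[0])
    share1 = [[0] * (2 * w) for _ in range(2 * h)]
    share2 = [[0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for c in range(w):
            k = rng.randrange(2)  # random pattern choice hides the pixel value
            p1 = PATTERNS[k]
            # White pixel: identical blocks; black pixel: complementary blocks.
            p2 = PATTERNS[k] if secret[r][c] == 0 else PATTERNS[1 - k]
            for dr in range(2):
                for dc in range(2):
                    share1[2 * r + dr][2 * c + dc] = p1[dr][dc]
                    share2[2 * r + dr][2 * c + dc] = p2[dr][dc]
    return share1, share2


def stack(a, b):
    """Stacking: a subpixel is black if it is black on either transparency."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


secret = [[1, 0, 1],
          [0, 1, 0]]
s1, s2 = encode(secret)
stacked = stack(s1, s2)
```

In this sketch every block of a single transparency contains exactly two black subpixels regardless of the secret pixel, which is why one transparency alone looks like noise. After stacking, a black secret pixel yields a fully black block and a white one yields a half-black block, and this contrast is what the eye perceives.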
Many VC-related schemes have been proposed. For example, [18-21] introduced multi-secret VC; [20, 22-25] proposed non-expanded VC so that the created transparencies could be compact; and other VC schemes [26-29] broadened the range of VC applications.
The aforementioned Wu and Chang [3] proposed a method to generate two circular transparencies for sharing two secret images. Rotating one transparency by a pre-specified angle and then stacking it on the other transparency revealed the second secret image. With their method, each transparency was four times the size of each secret image. Fang and Lin [18] used two rectangular transparencies to share two secret images. In their method, stacking the two transparencies revealed one secret image, and shifting one of the transparencies before stacking them again revealed the other. The size of each transparency was also four times that of each secret image. Shyu et al. [19] extended the multi-secret VC scheme of Wu and Chang [3] from a single rotation to several rotations so that x secret images could be encoded in two transparencies. Nevertheless, each transparency was still 2x times the size of each secret image.
To make better use of the transparencies, reducing their size is also a topic of study, and several non-expanded VC schemes exist. For example, Yang [20] introduced a probability-based method and Shyu [22] presented a random-grid-based method. In both methods, each transparency is the same size as the secret image, which makes them particularly suitable when storage is limited. However, both methods hide only one secret image in the transparencies they create.
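To make the idea of a non-expanded scheme concrete, the following is a minimal sketch of a basic two-transparency random-grid encoding. It is an illustrative assumption about how such a scheme can work and is not claimed to be the exact algorithm of Shyu [22]; the function names are hypothetical.

```python
import random


def random_grid_shares(secret, rng=random):
    """Encode a binary image (1 = black) into two same-size random grids."""
    # First grid: every pixel is black or transparent with probability 1/2.
    g1 = [[rng.randrange(2) for _ in row] for row in secret]
    # Second grid: copy the first grid at white secret pixels and
    # take its complement at black secret pixels.
    g2 = [[a if s == 0 else 1 - a for a, s in zip(ra, rs)]
          for ra, rs in zip(g1, secret)]
    return g1, g2


def stack(a, b):
    """A stacked pixel is black if it is black on either grid."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


secret = [[1, 0, 0, 1],
          [0, 1, 1, 0]]
g1, g2 = random_grid_shares(secret)
stacked = stack(g1, g2)
```

Here the transparencies have exactly the dimensions of the secret image. Black secret pixels are always black after stacking, while white secret pixels are black only about half of the time, so the secret appears as a contrast difference rather than as an exact copy.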
1.2.2 Polynomial secret sharing, secret image sharing, information dispersal algorithm, and Reed-Solomon code
In 1979, Blakley [5] and Shamir [4] independently proposed secret sharing schemes. In their (t, n) threshold schemes, a dealer distributes a secret number into n shadows, with each of the n participants holding one shadow. The generated shadows have two properties: (1) no information about the secret message (except for its size) can be extracted from any t−1 or fewer shadows; (2) the size of each shadow is the same as the size of the secret message. A secret sharing scheme with property 1 is said to be perfectly secure, and a scheme with both properties 1 and 2 is called ideal. Shamir’s polynomial secret sharing [4] is an ideal secret sharing scheme. In the same seminal work, Shamir [4] also introduced the concept of weighted secret sharing. In Shamir’s weighted secret sharing with a (t, n) threshold, each of the n participants is assigned a positive integer weight $w_i$, where $i = 1, 2, \ldots, n$ and $1 \le w_i \le t-1$. The dealer then distributes the secret number into $\sum_{i=1}^{n} w_i$ shadows, and each participant holds a number of shadows equal to his or her weight. The secret can be reconstructed if the sum of the weights of the participants who pool their shadows is no less than the threshold t.
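The polynomial scheme and its weighted variant can be sketched as follows. This is a minimal illustration, assuming arithmetic modulo the prime 2^31 − 1 and evaluation points 1, 2, ...; the prime, the function names, and the participant labels are choices made here and are not taken from [4].

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1 (an assumed field size)


def make_shadows(secret, t, n, rng=random):
    """Create n shadows (x, f(x)) of a random degree-(t-1) polynomial
    whose constant term is the secret."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(t - 1)]

    def f(x):
        y = 0
        for a in reversed(coeffs):  # Horner evaluation modulo P
            y = (y * x + a) % P
        return y

    return [(x, f(x)) for x in range(1, n + 1)]


def recover(shadows):
    """Lagrange interpolation at x = 0 from t (or more) distinct shadows."""
    secret = 0
    for j, (xj, yj) in enumerate(shadows):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shadows):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        # pow(den, P - 2, P) is the modular inverse of den (Fermat).
        secret = (secret + yj * num * pow(den, P - 2, P)) % P
    return secret


def deal_weighted(secret, t, weights, rng=random):
    """Weighted variant: a participant of weight w receives w shadows."""
    shadows = make_shadows(secret, t, sum(weights.values()), rng)
    held, start = {}, 0
    for name, w in weights.items():
        held[name] = shadows[start:start + w]
        start += w
    return held


secret = 123456789
shadows = make_shadows(secret, t=3, n=5)
held = deal_weighted(secret, t=3, weights={"A": 2, "B": 1, "C": 1})
```

With threshold 3, participants A (weight 2) and B (weight 1) together hold three shadows and can call recover on them, whereas B and C together hold only two shadows and cannot.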
When the secret data is a secret image rather than a secret number, using Blakley’s or Shamir’s (t, n) threshold scheme [4, 5] to share the secret image will waste much memory space because the size of the secret image is usually very large. To reduce memory space, Thien and Lin [7] proposed a secret image sharing method derived from Shamir’s scheme, and Tso [30] proposed a secret image sharing method based on Blakley’s scheme. In both methods, the size of each shadow is smaller than that of the secret image. In addition, based on Thien and Lin’s secret image sharing method, the progressive secret image sharing schemes [26, 27, 31] were proposed in succession.
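The reason the shadows shrink can be seen in a small sketch in the spirit of Thien and Lin’s method [7]: every coefficient of the polynomial carries a pixel, so each group of t pixels contributes only one value to each shadow. The sketch assumes arithmetic modulo 251 with pixel values already below 251 and omits any key-based permutation of the pixels; the function names are hypothetical.

```python
P = 251  # largest prime below 256 (assumed modulus)


def share_image(pixels, t, n):
    """Share a flat pixel list; each shadow has len(pixels) / t values."""
    assert len(pixels) % t == 0 and all(0 <= v < P for v in pixels)
    shadows = [[] for _ in range(n)]
    for i in range(0, len(pixels), t):
        coeffs = pixels[i:i + t]  # t pixels become the t coefficients
        for x in range(1, n + 1):
            value = sum(a * pow(x, k, P) for k, a in enumerate(coeffs)) % P
            shadows[x - 1].append(value)
    return shadows


def poly_mul_linear(poly, root):
    """Multiply poly (lowest order first) by (X - root) modulo P."""
    out = [0] * (len(poly) + 1)
    for k, a in enumerate(poly):
        out[k] = (out[k] - root * a) % P
        out[k + 1] = (out[k + 1] + a) % P
    return out


def recover_image(shares):
    """shares maps an evaluation point x to its shadow; exactly t entries."""
    xs = list(shares)
    pixels = []
    for s in range(len(shares[xs[0]])):
        coeffs = [0] * len(xs)
        for xj in xs:
            basis, denom = [1], 1
            for xm in xs:
                if xm != xj:
                    basis = poly_mul_linear(basis, xm)
                    denom = denom * (xj - xm) % P
            scale = shares[xj][s] * pow(denom, P - 2, P) % P
            for k, b in enumerate(basis):
                coeffs[k] = (coeffs[k] + scale * b) % P
        pixels.extend(coeffs)  # all t coefficients are recovered pixels
    return pixels


pixels = [10, 200, 33, 250, 0, 7]
shadows = share_image(pixels, t=3, n=5)
subset = {x: shadows[x - 1] for x in (2, 4, 5)}
restored = recover_image(subset)
```

In this example six pixels are shared with t = 3, so each of the five shadows holds only two values, and any three shadows restore all six pixels.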
An Information Dispersal Algorithm (IDA) [6] was proposed by Rabin. Under this scheme, a file is divided into n shadows, and any t of the n shadows can reconstruct the file. IDA does not address the secrecy of the shadows; its major advantage is that the size of each shadow is 1/t of the file size, which is smaller than in Shamir’s method, where each shadow is as large as the secret (a ratio of 1). P. Béguin and A. Cresti [32] proved that a shadow size of 1/t is minimal if the entropy of the file is maximal. Preparata [33] proposed a fast sharing method based on the Fast Fourier Transform (FFT) over a finite field.
The Reed-Solomon code (RS code) [34] is an error-control code proposed by Reed and Solomon. The t information digits are transformed into n digits to form a codeword, and even if up to $\lfloor (n-t)/2 \rfloor$ of the n digits are modified, the t information digits can still be recovered exactly. Preparata [33] later extended this method, proposing the concept of sharing based on RS codes.
1.2.3 Data hiding methods
Data hiding is a technique for embedding data in images. The LSB (least significant bit) substitution method is probably the simplest embedding method. For example, if two secret bits (11)2=3 are to be embedded in an 8-bit pixel value (10010100)2=148, the two least significant bits of 148 are replaced, giving the stego-pixel value (10010111)2=151, from which (11)2 can easily be extracted. To improve the LSB substitution method, Thien and Lin [8] use (10010011)2=147 as the stego-pixel instead, since 147 is closer to 148 than 151 is, and (11)2 can still be extracted from the last two bits of 147. Other papers have built on this or similar observations. For example, Lin et al. [35] introduced an embedding algorithm which extended the modified LSB substitution method by using a distortion tolerance. Yang [36] embedded data based on an inverted pattern approach to improve the stego-image’s quality in the LSB method, and Wang [37] used a threshold to decide the modulus base of the embedding function.
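The 148 → 151 and 148 → 147 example above can be reproduced with a short sketch. The adjustment rule used here (also trying the substituted value plus or minus 2^k and keeping the candidate closest to the original pixel) is one simple way to realize the observation; it is an assumption for illustration, and the function names are hypothetical rather than those of [8].

```python
def lsb_substitute(pixel, bits, k):
    """Replace the k least significant bits of pixel with bits."""
    return ((pixel >> k) << k) | bits


def lsb_substitute_adjusted(pixel, bits, k):
    """Same k LSBs as plain substitution, but choose the 8-bit value
    closest to the original pixel among base - 2^k, base, base + 2^k."""
    base = lsb_substitute(pixel, bits, k)
    step = 1 << k
    candidates = [c for c in (base - step, base, base + step) if 0 <= c <= 255]
    return min(candidates, key=lambda c: abs(c - pixel))


def extract(stego, k):
    """Read back the k least significant bits."""
    return stego & ((1 << k) - 1)


plain = lsb_substitute(148, 0b11, 2)              # 151
adjusted = lsb_substitute_adjusted(148, 0b11, 2)  # 147
```

Both 151 and 147 yield the bits (11)2 on extraction, but the adjusted value differs from 148 by 1 rather than by 3.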
Rather than using a pixel as the embedding unit, the LSB-matching method [9, 10] considers a block of several pixels simultaneously. Mielikainen [10] proposed an embedding method which embeds 2 bits in a block of 2 pixels. Li et al. [9] defined a generalized LSB matching (G-LSB-M) scheme to further reduce distortion. Zhang and Wang [38] embedded a digit which has (2z+1) possible values in each z-pixel block. When the secret data contains images, another line of research [39-41] exploits the redundancy of the secret images to improve stego-image quality. For various media formats, Tseng et al. [42] embedded data in binary images, and Wu et al. [43] in palette-based images, whereas Liu and Liao [44] used JPEG images. Lee et al.’s method [45] was for binary images, and the embedding was based on Hamming codes to reduce the frequency of flipping pixels. Wang and Lu [46] used a Vector Quantization (VQ) index file as the host media. Tseng and Hsieh [47] even proposed a reversible method, so that the host images could be recovered from the stego-images without loss, but the price of being reversible was a smaller embedding rate (for example, it embedded only 0.22 bits per pixel to get a Lena stego-image of 47.31 dB PSNR). Some other kinds of
data embedding consider the content of the host image, i.e., they embed more bits in the coarse areas of the host image. For example, Wu and Tsai [48] introduced a method based on pixel value differencing, whereby each block of the host image embeds a variable number of bits by altering the pixel value difference. Likewise, Wang et al. [49] used pixel value differencing together with a modulus function to effectively reduce distortion. Zhang and Wang [50] dynamically changed the base of the secret data to control the embedding rate of the stego-pixels. Yang et al. [51] also proposed adaptive data embedding in edge areas, followed by a modified LSB substitution method to reduce distortion. Yang et al. [52] estimated the amount of data to embed by exploiting the brightness, edges, and texture of the host image.
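As an illustration of block-based embedding, the (2z+1)-ary scheme attributed above to Zhang and Wang [38] can be sketched as follows. The weighted-sum extraction function and the names emd_f and emd_embed are assumptions made here (the text does not state them), and pixels at the boundary values 0 and 255 are not handled.

```python
def emd_f(block):
    """Extraction function: weighted sum of the z pixels modulo 2z + 1."""
    z = len(block)
    return sum((i + 1) * g for i, g in enumerate(block)) % (2 * z + 1)


def emd_embed(block, digit):
    """Embed one digit in 0..2z by changing at most one pixel by +1 or -1."""
    z = len(block)
    base = 2 * z + 1
    s = (digit - emd_f(block)) % base
    out = list(block)
    if s == 0:
        pass                    # the block already encodes the digit
    elif s <= z:
        out[s - 1] += 1         # raising pixel s adds s to the weighted sum
    else:
        out[base - s - 1] -= 1  # lowering pixel (base - s) also adds s mod base
    return out


block = [100, 101, 102, 103]    # z = 4, so digits are base 9
stego = emd_embed(block, 7)
digit = emd_f(stego)
```

Because at most one pixel in the block moves by one gray level, the distortion per embedded digit stays small while each block still carries one of 2z+1 possible values.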
1.2.4 Fragile watermarking and Semi-fragile watermarking
An image authentication method generates data that will later be used to verify the integrity of the digital media; the authentication data can be stored in a separate file (the so-called digital signature approach [53, 54]) or embedded in the digital media itself (the watermarking approach [55-66]). In recent years, some watermarking studies have focused on detecting and recovering tampered image regions [58-66].
Lin et al. [59] proposed a watermarking technique for tamper detection and recovery, based on a three-level hierarchical structure and a block-mapping sequence. In the three-level hierarchical structure, a block is judged as “applicable” if the block passes three inspections. If a block is judged as “non-applicable”, then the recovery data is embedded in LSBs of another block whose address is determined by a block-mapping sequence. Lee and Lin [63] proposed a watermarking technique which embeds dual watermarks in an image. The detection algorithm is similar to Lin et al.’s method [59], but the block size is 2×2, rather than the 4×4 used in Lin et al.’s method [59]. If a block is judged as “non-applicable”, then the copies of recovery data are embedded in LSBs of another two blocks, which are addressed by a block-mapping sequence. The two copies of recovery data (dual watermarks) are used to increase the chances of block recovery. In Wang and Tsai’s method [64], the image is divided into two regions; for the Region-of-Interest (ROI), the recovery data are encoded by fractal encoding and embedded in other blocks which are selected by permutation. For the remaining regions, no recovery data are embedded. If a damaged block is located in an ROI, then the fractal code is extracted for recovery; otherwise, the block is recovered by an image-inpainting technique. Chan and Chang [66] proposed an image authentication method based on Hamming code consisting of three components: the Hamming code, Torus
automorphism and bit rotation. The parity check bits for each pixel were generated by the Hamming code, the embedding locations for the parity check bits were decided by Torus automorphism, and the bit rotation was used to improve security. Zhang and Wang [67] proposed an elegant watermarking method that can restore a tampered region without error. This method is based on reversible data hiding, which allows the whole host image to be recovered from the stego-image without error.
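To illustrate how a torus automorphism can scatter embedding locations, the following sketch uses the commonly cited one-parameter form of the map on an n×n grid. The matrix form, the parameter k, and the grid size are assumptions for illustration and are not taken from Chan and Chang [66].

```python
def torus_map(x, y, k, n):
    """Map (x, y) to a new location on an n-by-n grid.

    Uses the matrix [[1, 1], [k, k + 1]], whose determinant is 1, so the
    map is a bijection on the grid for any integer k.
    """
    return (x + y) % n, (k * x + (k + 1) * y) % n


n, k = 8, 3
mapping = {(x, y): torus_map(x, y, k, n) for x in range(n) for y in range(n)}
```

Because the map is a bijection, every location receives the data of exactly one other location, and a verifier who knows k and n can recompute where each pixel’s check bits were placed.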
In general, if a watermarked image is processed by a content-preserving operation (e.g., JPEG compression), the image should still pass verification as long as the operation stays within a certain strength. This type of watermarking method is said to be semi-fragile. In the semi-fragile watermarking method proposed by Ho and Li [11], users choose the lowest JPEG quality factor they can tolerate, and the verification data is generated and embedded in the quantized DCT domain. Their experiments demonstrated that their method could resist JPEG compression (down to the chosen quality factor QF). In the method of Lin et al. [12], users also choose the lowest JPEG quality factor they can tolerate, and the verification data is generated from the low and middle frequencies of the DCT domain and then embedded in the high-frequency coefficients. Their experiments showed that their method can also resist JPEG compression (down to the chosen QF). However, these two methods [11, 12] only embed verification data, and there is no recovery data.
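Why data embedded relative to the lowest tolerated quality factor can survive milder compression can be seen from a small numerical sketch on a single coefficient. It uses a generic parity-of-quantization-index embedding, assumed here purely for illustration, and is not the specific procedure of [11] or [12]; q_m stands for the quantization step of the lowest tolerated quality factor, and all function names are hypothetical.

```python
def embed_bit(coeff, bit, q_m):
    """Quantize coeff with step q_m and force the index parity to bit."""
    idx = round(coeff / q_m)
    if idx % 2 != bit:
        idx += 1
    return idx * q_m


def requantize(value, q):
    """Model a later JPEG pass that quantizes with step q."""
    return round(value / q) * q


def extract_bit(value, q_m):
    """Re-quantize with q_m and read the parity of the index."""
    return round(value / q_m) % 2


marked = embed_bit(37, 1, q_m=16)      # a multiple of 16 with an odd index
attacked = requantize(marked, q=10)    # milder compression: step 10 < 16
bit = extract_bit(attacked, q_m=16)
```

A later pass with a step q smaller than q_m moves the coefficient by at most q/2, which is less than q_m/2, so re-quantizing with q_m returns the same index and the embedded bit is unchanged. A step larger than q_m can move the coefficient further, which is why resistance is only expected down to the chosen quality factor.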
Some semi-fragile watermarking techniques also embed recovery data in the watermarked image, so that tampered regions can be recovered from the image itself. Lin and Chang [13] proposed two approaches to semi-fragile watermarking, one of which has verification ability only, while the other has both verification and recovery ability. The verification data is generated from the DCT coefficients, and the recovery data is generated from a quarter-size shrunken sub-image of the host image (if recovery ability is required).
Then all of the generated data is embedded in the DCT domain. Their method can resist both JPEG compression and brightness adjustment within a reasonable range. Hsieh et al. [14] also proposed a watermarking scheme with damage-recovery ability. The recovery data is calculated from the host image, and then three copies of the recovery data are embedded in the DCT domain of the host image. Their experiments showed that their method could resist JPEG compression, brightness adjustment and contrast adjustment. Jiang and Liu [15]
proposed an authentication-recovery scheme. Their verification data is a random number sequence generated by a key, and their recovery data is generated from DCT coefficients of the host image. The two sets of data are embedded in the two-LSBs (the two least significant
bits) of the image. Their experiments showed that their method could resist JPEG compression and small-area replacement of the watermarked image.