Chapter 3 Properties of Various Watermarking Methods
3.4 Choice of Watermarking Techniques
Since our scalable protection scheme is based on the JPEG2000 standard, this section discusses the compatibility of the watermarking techniques with that standard. Although each watermarking technique has its own merits, not all of them suit JPEG2000, and some of their advantages may disappear because of the characteristics of the standard. Some other implementation considerations are also discussed.
Figure 3-7 Choosing embedding positions before R-D Optimization
In the JPEG2000 standard, there are three coding passes for each bit-plane coding and the coded bit-stream can be rearranged to achieve different types of scalability. That is, bits in the same bitplane may belong to different coding layers.
Furthermore, since the tier-2 coding truncates the coded bit-stream to achieve the demanded compression rate or quality, the bits to be truncated in a bitplane are unknown until the rate-distortion optimization is done. Thus, at the tier-1 coding stage it is not known which bits will be preserved and which will be truncated. For this reason, choosing the embedding positions for bitplane-based watermarking before entropy coding, without perceptually affecting the image quality, is a difficult problem. If we choose the embedding positions before the tier-2 coding, as shown in Figure 3-7, the embedded information may be truncated by the rate-distortion optimization process. If we choose the embedding positions after the tier-2 coding, as shown in Figure 3-8, the truncation information of the coded bit-stream is sent to the watermark embedder, which then decides the embedding positions. After the watermark is embedded, the entropy coding is redone. The rate-distortion optimization is also redone, so the truncation points may change. Thus, some embedded bits may be truncated again, and the correctness of the watermarking positions is not assured.
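The truncation problem can be illustrated with a toy model. The sketch below is not JPEG2000 itself: it assumes a single non-negative coefficient magnitude and models truncation as simply zeroing the least significant bitplanes, whereas a real encoder truncates at coding-pass boundaries chosen per code-block. The function names are hypothetical.

```python
def embed_bit(magnitude: int, plane: int, bit: int) -> int:
    """Force bitplane `plane` (0 = least significant) of a magnitude to `bit`."""
    return (magnitude & ~(1 << plane)) | (bit << plane)


def truncate(magnitude: int, dropped_planes: int) -> int:
    """Toy truncation: zero the `dropped_planes` least significant bitplanes."""
    return (magnitude >> dropped_planes) << dropped_planes


def extract_bit(magnitude: int, plane: int) -> int:
    """Read back the bit stored in bitplane `plane`."""
    return (magnitude >> plane) & 1


magnitude = 0b10110100                      # 180
marked = embed_bit(magnitude, plane=1, bit=1)

# Bitplane 1 is kept when only one plane is dropped, so the bit survives.
survives = extract_bit(truncate(marked, dropped_planes=1), plane=1)

# Bitplane 1 is discarded when three planes are dropped, so the bit is lost.
lost = extract_bit(truncate(marked, dropped_planes=3), plane=1)
```

Because the number of dropped planes is only known after rate-distortion optimization, the embedder cannot tell in advance which of the two cases applies.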
[Block-diagram labels for Figures 3-7 and 3-8: Host Image, Tier-1 Coding, Watermark Embedder, Tier-2 Coding (R-D Optimization, Bit-stream Truncation), Truncation Information, Deciding the Embedding Positions, Watermark Embedding, Watermarked Image]
Figure 3-8 Choosing embedding positions after R-D Optimization
Based on the above discussion, bitplane-based watermarking has the following disadvantages: high implementation complexity, low robustness (correctness), and a large amount of side information. The high implementation complexity is caused by redoing the tier-1 and tier-2 coding. The low robustness is due to the uncertainty of data truncation, which may lead to wrong extraction results even without any attack. The large amount of side information is due to the embedding positions, which must be recorded. Because of the low correct detection rate, some capacity is sacrificed to enhance the robustness through error correction coding (ECC). Thus, the only remaining advantage is the simplicity of the watermark extraction, which the technique already had before being combined with the JPEG2000 standard.
For correlation-based watermarking, the major parameter is the watermark strength, which determines the robustness of the watermark. The embedding process is very simple, while the extraction process is more complex. Compared with bitplane-based watermarking, correlation-based watermarking has the following advantages: relatively low implementation complexity, higher robustness, and a small amount of side information. Its complexity comes mainly from calculating the correlation in the watermark extraction stage, which is much simpler than redoing the tier-1 and tier-2 coding. Because the truncation of the coded bit-stream in the tier-2 coding does not seriously affect the extraction results, a watermark of sufficient intensity should provide adequate robustness against attacks. The side information is small since there are no embedding positions to be recorded.
Since the correlation-based watermarking seems to have more advantages than the bitplane-based watermarking for our purpose, it is thus chosen to be the watermarking method used in the protection scheme of scalable media.
Chapter 4
Design of Watermark
Since correlation-based watermarking is chosen as the transmission method in the layered protection of scalable media, the remaining task is to improve its embedding efficiency, that is, to increase its data capacity. There are two types of scalability: resolution scalability and quality scalability (PSNR scalability). A watermarking technique is designed separately for each of them, based on its own requirements.
In this chapter, the algorithm design flow is described. First, the original watermarking technique that is to be improved is introduced. Second, various methods are proposed to improve the embedding efficiency, and the perceptually removable watermark is then presented. Finally, the layered protection scheme is built as a combination of the JPEG2000 standard and the perceptually removable watermark. This scheme allows unauthorized users to preview only the coarse image by decoding the base layer.
4.1 The Original Watermarking
The perceptual watermark was proposed by M. Saito et al. [14] and is an extension of the scheme proposed by Barni et al. [24]. Since the watermarking technique proposed by Barni has been briefly introduced in section 3.1.1, here we only sketch the entire watermarking scheme and point out the differences between Barni and Saito. As in JPEG2000, a color image with RGB components is first transformed to the YCbCr domain, and the watermark is embedded in the Y coefficients. For ease of understanding, some previous formulas and figures are repeated here.
Watermark Embedding
As shown in Figure 4-1, the image to be watermarked is first decomposed through the DWT into four levels: $I_l^\theta$ is the subband at resolution level $l = 0,1,2,3$ with orientation $\theta \in \{0,1,2,3\}$. The watermark is made of a pseudorandom binary sequence ($\pm 1$) and is embedded into the coefficients of the highest-resolution subbands $I_0^\theta$.
Figure 4-1 An image with four decomposition levels using the DWT
For data embedding, formula (4.1-1) maps a 1-D pseudorandom binary sequence $m_h$ to a 2-D sequence $x^\theta(i,j)$, where $M$ and $N$ are the dimensions of the largest subband. The watermark is scaled by the weighting function based on the HVS model described in section 3.1.1.
[Equations (4.1-1) to (4.1-5)]
The embedded data is extracted without referring to the original image. The only required information is the seed used to produce the PN sequence $m_h$. With the correctly regenerated $m_h$, the correlation values are calculated by (4.1-6). Since the watermarking technique plays a critical role in the gradual data recovery process of our layered protection scheme, an extraction result stating that the image is un-watermarked is useless. Thus, the threshold used in [14] is not used, and the data extraction scheme is simply expressed by (4.1-7).
[Equations (4.1-6) and (4.1-7)]
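The blind, threshold-free extraction described above can be sketched as follows. This is a minimal illustration and not the scheme of [14]: it assumes that each payload bit occupies its own contiguous block of coefficients, that the embedding strength `alpha` is a constant rather than an HVS weight, and that the bit is decided by the sign of the correlation. The host coefficients are Gaussian stand-ins for DWT coefficients, and all function names are hypothetical.

```python
import random


def pn_sequence(seed, length):
    """Regenerate the +/-1 pseudorandom sequence from its seed."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(length)]


def embed(coeffs, bits, seed, alpha):
    """Additively embed each bit into its own block of coefficients."""
    n = len(coeffs) // len(bits)            # coefficients per payload bit
    pn = pn_sequence(seed, len(coeffs))
    out = list(coeffs)
    for k, bit in enumerate(bits):
        sign = 1 if bit else -1
        for i in range(k * n, (k + 1) * n):
            out[i] += alpha * sign * pn[i]
    return out


def extract(coeffs, n_bits, seed):
    """Decide each bit from the sign of its correlation; no threshold is used."""
    n = len(coeffs) // n_bits
    pn = pn_sequence(seed, len(coeffs))
    bits = []
    for k in range(n_bits):
        rho = sum(coeffs[i] * pn[i] for i in range(k * n, (k + 1) * n))
        bits.append(1 if rho > 0 else 0)
    return bits


rng = random.Random(0)
host = [rng.gauss(0.0, 7.0) for _ in range(96 * 256)]   # stand-in coefficients
payload = [rng.getrandbits(1) for _ in range(96)]
marked = embed(host, payload, seed=1234, alpha=3.0)
recovered = extract(marked, n_bits=96, seed=1234)
```

Only the seed is needed at the receiver. Extracting with a different seed yields a sequence unrelated to the payload, which is why the seed itself is part of the recovery information.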
4.2 Evolution of the Watermarking Technique
In this section, the architecture of the layered protection of scalable media is described. The watermarking technique serves as a tool for sequential data transmission. It is therefore necessary to know what information the watermark carries and how the data in different layers are related. After the basic watermark is designed, we improve its embedding efficiency, and eventually a perceptually removable watermarking technique is proposed. The proposed technique has several features that differ from the traditional requirements for a watermark, but it is very useful for our specific purpose, that is, data transmission.
4.2.1 Layered protection structure
Consider the architecture of the layered protection of scalable media, which is introduced in section 2.3 and reproduced as Figure 4-2. The watermark carries the data recovery information for the next layer, such as the decryption key and the extraction parameter.
[Figure 4-2 labels: blocks Decrypte(), Compose(), Extract(), Decryptf(), Delay, param(), key(); signals B_0 or X_i, B_i, B_{i-1}, {G_i}, W_i, K_i, F_i, P_i, P_{i-1}]
Figure 4-2 Decryption and decoding of layer-protected content
In our application, the Data Encryption Standard (DES) is chosen to protect the content of the first enhancement layer, the Advanced Encryption Standard (AES) is used to protect the second enhancement layer, and the base layer serves as a preview layer without any protection. Hence, as shown in Figure 4-3, 96 bits of information must be embedded in the base layer: a 32-bit seed used to produce the pseudorandom sequence $m_h$ and a 64-bit decryption key for the first enhancement layer. Since the first enhancement layer is the last layer used for data embedding, it contains only the decryption key used for AES, which is a 128-bit key.
Figure 4-3 Relationship between layers
In the following sections, the resolution-based watermark used for protection of the data with resolution scalability is first discussed. After the resolution-based watermark is completely developed, it is slightly modified to form the PSNR-based watermark which is used for protecting the data with PSNR scalability.
4.2.2 The resolution-based watermarking technique
For data with resolution scalability, as shown in Figure 4-4, the base layer contains the subbands of the lowest resolutions while the enhancement layers are composed of the subbands of the higher resolutions.
Figure 4-4 Embedding positions of the watermark
The resolution-based watermarking embeds the data of each layer into the corresponding colored area in Figure 4-4. The embedding scheme is almost the same as (4.1-1) to (4.1-5); the only difference is that $M$ and $N$ are no longer the dimensions of the largest subband but the dimensions of the subband in the embedding layer. Figure 4-5 shows an example of embedding the first bit of 6-bit data into layer $k$, where the colored pixels are the positions used for the first bit.
[Figure 4-5 labels: layer $k$ with subbands $I_k^0$, $I_k^1$, $I_k^2$ and subband dimensions $M_k$, $N_k$]
Figure 4-5 Embedding positions of the first bit of a 6-bit data
Table 3-1 shows that the data capacity of the correlation-based watermark for a 512x512 image is low. Even when only 64 bits are embedded into the base layer, the extracted data may not be entirely correct. Therefore, our goal of embedding 96 bits into the base layer cannot be met with the original technique. For the same reason, it is very difficult to embed 128 bits into the first enhancement layer. The following subsections describe our efforts to improve the data capacity.
4.2.2.1 Selection of the seed
Since the correctness of the extracted data depends on whether the watermark pattern is correlated to the host image, the seed used to produce the pseudorandom sequence is an important factor for improving embedding capacity of watermarking.
Choosing a seed that minimizes the correlation between the produced pseudorandom sequence and the host image is a difficult problem. Many methods exist for finding the optimum parameter that minimizes a function, such as traditional calculus-based methods (for example, gradient descent) and genetic algorithms (GA), which use directed random searches to locate optimal solutions in complex landscapes [27]. However, none of them seems able to solve this problem, because the cost function associated with the pseudorandom number function is irregular.
Because of the specific property of the pseudorandom function that the pseudorandom sequences produced by adjacent seeds are nearly unrelated to each other, a full search among a certain range of seeds may be the only way to identify a good seed for producing the desired watermark pattern. Thus, we set the beginning and the end seeds and examine all seeds in between to find out the best seed for producing the desired watermark pattern.
Figure 4-6 Seed search procedure
It is important to note that using the pseudorandom function to produce a watermark that is completely uncorrelated with the host image is impossible [28]. Thus, to further improve the embedding efficiency, we should take advantage of the characteristics of the host image itself. That is, the watermark pattern does not need to be uncorrelated with the host image, but it should make the extracted data as correct as possible. For this reason, we look for the seed that leads to the best extraction performance rather than the seed that produces the watermark pattern least correlated with the host image. This implies that the watermark embedding and extraction stages are carried out in advance for each candidate seed to examine the correctness of the extracted data, as shown in Figure 4-6.
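The full search over a seed range can be sketched as below. This is a simplified illustration under stated assumptions: embedding uses a constant strength `alpha`, each bit occupies a contiguous block of coefficients, extraction uses the sign of the correlation, and the compression step that a real trial would place between embedding and extraction is omitted. The host data are Gaussian stand-ins, and the function names are hypothetical.

```python
import random


def pn_sequence(seed, length):
    """Generate the +/-1 pseudorandom sequence for a candidate seed."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(length)]


def bit_errors(host, bits, seed, alpha):
    """Embed with this seed, extract blindly, and count the wrong bits."""
    n = len(host) // len(bits)              # coefficients per payload bit
    pn = pn_sequence(seed, len(host))
    errors = 0
    for k, bit in enumerate(bits):
        sign = 1 if bit else -1
        rho = sum((host[i] + alpha * sign * pn[i]) * pn[i]
                  for i in range(k * n, (k + 1) * n))
        if (1 if rho > 0 else 0) != bit:
            errors += 1
    return errors


def search_seed(host, bits, first_seed, last_seed, alpha):
    """Examine every seed in [first_seed, last_seed]; keep the one with fewest errors."""
    best_seed, best_errors = None, None
    for seed in range(first_seed, last_seed + 1):
        errors = bit_errors(host, bits, seed, alpha)
        if best_errors is None or errors < best_errors:
            best_seed, best_errors = seed, errors
    return best_seed, best_errors


rng = random.Random(0)
host = [rng.gauss(0.0, 10.0) for _ in range(32 * 64)]   # stand-in coefficients
payload = [rng.getrandbits(1) for _ in range(32)]
best_seed, best_errors = search_seed(host, payload, 1, 50, alpha=2.0)
```

The selection criterion is the extraction error count itself, not the correlation between the pattern and the host, which mirrors the argument made in the text.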
Table 4-1 compares the embedding efficiency of the watermark with and without the seed search procedure. The compression ratio is 20. The seed search procedure clearly improves the embedding efficiency, but the performance is still not good enough. Therefore, other methods need to be employed to improve the watermark extraction performance.
                                  baboon     fruit      Lena       peppers
Without seed search procedure
  Before compression   BER_1      0.044062   0.007734   0.000000   0.000000
                       BER_2      0.109479   0.093750   0.042500   0.021042
  After compression    BER_1      0.083828   0.028672   0.000000   0.000156
                       BER_2      0.135000   0.106250   0.042500   0.022187
  PSNR (dB)                       25.3177    35.149     36.727     36.044
After Seed Search Procedure is applied
  Before compression   BER_1      0.000000   0.000000   0.000000   0.000000
                       BER_2      0.018125   0.008229   0.000000   0.000000
  After compression    BER_1      0.059766   0.009687   0.000000   0.003672
                       BER_2      0.037500   0.018333   0.002917   0.002917
  PSNR (dB)                       25.324     35.1643    36.6943    36.0163
Table 4-1 Results of correlation-based watermarking
4.2.2.2 The Error Correction Code (ECC)
In 1948, Shannon showed that errors induced in the received data by a noisy channel can be reduced to any desired level by a proper channel code [29]. Shannon's channel coding theorem states that every channel has a capacity C and that there exist codes of rate R (R < C) that, with maximum likelihood decoding, have an arbitrarily small decoding error probability.
Figure 4-7 Error Correction Codes
There are two categories of error correction codes (ECC): block (memoryless) codes and convolutional (memory) codes [30]. As shown in Figure 4-7, a block code contains K message symbols and (N-K) parity symbols that are produced from the message symbols for data correction, whereas a convolutional code contains K message symbols and (N-K) parity symbols that are produced from the current and the previous message symbols. Using ECC to improve the data capacity of an image is inappropriate, because the basic idea of the channel coding theorem is to sacrifice code rate in order to reduce the error probability. Since the data capacity of an image is also the maximum rate at which it can carry data, there is little room for the extra bits introduced by channel coding. The only use of ECC here is to correct a small number of random errors, at the cost of some data capacity.
Figure 4-8 Parity-check matrix of the (39, 32) SEC-DED code
Figure 4-8 shows the parity-check matrix of a shortened Hamming code used to produce the parity symbols of the (39, 32) single-error-correction, double-error-detection (SEC-DED) code [26], which can correct one erroneous bit and detect two erroneous bits in a 39-bit codeword that carries 32 bits of information. Since the shortened Hamming code is adopted, the 96 bits for the base layer are encoded as three codewords, 117 bits in total, and the 128-bit key for the first enhancement layer is encoded as four codewords, 156 bits in total.
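The behaviour of a (39, 32) SEC-DED code can be sketched with a textbook construction: a Hamming code shortened to 38 positions plus one overall parity bit. This is an assumption made for illustration; the parity-check matrix of Figure 4-8 from [26] may differ in its bit ordering and parity equations. All names below are hypothetical.

```python
CHECK_POSITIONS = (1, 2, 4, 8, 16, 32)
DATA_POSITIONS = [p for p in range(1, 39) if p not in CHECK_POSITIONS]  # 32 slots


def encode(data_bits):
    """Encode 32 data bits into 39 bits: word[0] is the overall parity bit."""
    assert len(data_bits) == 32
    word = [0] * 39
    for pos, bit in zip(DATA_POSITIONS, data_bits):
        word[pos] = bit
    syndrome = 0
    for pos in DATA_POSITIONS:
        if word[pos]:
            syndrome ^= pos
    for c in CHECK_POSITIONS:               # make the Hamming syndrome zero
        word[c] = 1 if syndrome & c else 0
    word[0] = sum(word[1:]) % 2             # make the overall parity even
    return word


def decode(word):
    """Return (data_bits, status) with status 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 39):
        if word[pos]:
            syndrome ^= pos
    if sum(word) % 2 == 1:                  # odd parity: assume a single error
        if syndrome > 38:
            return None, "uncorrectable"
        word[syndrome] ^= 1                 # syndrome 0 means word[0] itself flipped
        status = "corrected"
    elif syndrome:                          # even parity, nonzero syndrome: double error
        return None, "uncorrectable"
    else:
        status = "ok"
    return [word[p] for p in DATA_POSITIONS], status


def encoded_length(payload_bits, k=32, n=39):
    """Number of embedded bits after splitting the payload into k-bit blocks."""
    blocks = -(-payload_bits // k)          # ceiling division
    return blocks * n


data = [1, 0, 1, 1, 0, 0, 1, 0] * 4
codeword = encode(data)
base_layer_bits = encoded_length(96)        # 3 codewords
enhancement_layer_bits = encoded_length(128)  # 4 codewords
```

The two lengths reproduce the 117-bit and 156-bit payloads quoted above, which shows the capacity cost: 7 extra bits are embedded for every 32 bits of information.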
4.2.2.3 The Perceptually Removable Watermarking (PRW)
Since none of the aforementioned methods noticeably improves the data capacity of an image, we try a different approach. Recall that the purpose of watermarking in our application is to transmit the decryption key together with the encrypted data. The quality of the watermarked image is therefore not important, as long as the image can be recovered at the end of the entire process. If we can perceptually remove the embedded watermark after it has been correctly extracted, we achieve the goal of producing good-quality images at the end.
Based on the idea that the quality of the watermarked image is not important, a perceptually removable watermark is proposed. In this scheme, the watermark is strengthened to increase the data capacity, so the quality of the watermarked image is lower. As shown in Figure 4-9, the data to be embedded is first sent to the ECC encoder, which produces the shortened Hamming code to reduce decoding errors. The seed search scheme is still used to improve the embedding capacity. The watermark intensity also remains proportional to the JND values of the host image, so that the visual quality of the recovered image is as high as possible.
The key assumption of the proposed perceptually removable watermarking method is that the JND values calculated before and after the watermark embedding are very similar. That is, if we can derive the same watermark intensity (the JND values) from either the host image or the watermarked image, we can remove the watermark simply by subtracting a watermark of the same intensity from the watermarked image. Since the watermarked image differs from the original one, this assumption is not entirely valid. A recursive method is thus proposed. As shown in Figure 4-9, the JND values are first calculated from the watermarked image and used to remove the watermark. The reconstructed image after the first iteration is then fed back to re-calculate the JND values for the second iteration of watermark removal. This process repeats until the JND values converge, that is, until the JND values calculated in the previous and current iterations are the same. In our experience, the JND values converge to within a certain range in only two iterations. Thus, to reduce the computational complexity, only two iterations are applied to derive the desired JND values.
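The recursive removal can be sketched as a fixed-point iteration. The sketch is illustrative only: `jnd` is a hypothetical pointwise stand-in that grows with coefficient magnitude, not the HVS model of section 3.1.1, and quantization by the JPEG2000 encoder is ignored. The signed pattern `pattern` is assumed to be known at the receiver once the data has been extracted correctly.

```python
import random


def jnd(coeffs):
    """Hypothetical stand-in for the JND weighting (not the model of section 3.1.1)."""
    return [1.0 + 0.2 * abs(c) for c in coeffs]


def embed(host, pattern, alpha):
    """Add the watermark with an intensity proportional to the JND of the host."""
    return [h + alpha * j * w for h, j, w in zip(host, jnd(host), pattern)]


def remove(marked, pattern, alpha, iterations=2):
    """Subtract the watermark, re-estimating the JND from the latest reconstruction."""
    estimate = list(marked)                 # first JND estimate uses the marked image
    for _ in range(iterations):
        weights = jnd(estimate)
        estimate = [m - alpha * j * w
                    for m, j, w in zip(marked, weights, pattern)]
    return estimate


def max_error(a, b):
    """Largest absolute difference between two coefficient lists."""
    return max(abs(x - y) for x, y in zip(a, b))


rng = random.Random(0)
host = [rng.gauss(0.0, 10.0) for _ in range(1024)]      # stand-in coefficients
pattern = [1 if rng.getrandbits(1) else -1 for _ in range(1024)]
marked = embed(host, pattern, alpha=0.5)

err_marked = max_error(marked, host)
err_one = max_error(remove(marked, pattern, 0.5, iterations=1), host)
err_two = max_error(remove(marked, pattern, 0.5, iterations=2), host)
```

With this stand-in, each iteration shrinks the residual because the JND estimate is taken from an image closer to the host. Whether two iterations suffice for the real HVS model is the empirical observation reported in the text, not something this toy model establishes.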
Figure 4-9 Architecture of the Perceptually Removable Watermarking: Data Hiding, Data Extraction and Image Reconstruction
Because the watermark removal process cannot be perfect, the recovered image still contains a small amount of residual watermark. If the intensity of the residual watermark is proportional to the JND values, its effect on the perceptual quality will be very small. Thus, to reduce the visibility of the residual watermark in the image, the JND values are needed to decide the intensity of the watermark. One may ask why we do not use a fixed value as the watermark intensity and then remove the watermark using that exact intensity rather than the estimated JND values. The reason is that the image passes through the JPEG2000 encoder, where the coefficients are quantized to achieve the target compression rate. The watermarked image is therefore changed to some extent, and the fixed watermark intensity is no longer meaningful. The reconstructed image thus still contains some residual watermark, and in this case the perceptual quality is no longer protected by the HVS model. Table 4-2 compares the fixed-value watermark intensity with the watermark intensity proportional to the JND values.
                           baboon   fruit    Lena     peppers
Fixed intensity
  BER_1                    0.00     0.00     0.00     0.00
  BER_2                    0.00     0.00     0.00     0.00
  PSNR (dB)                24.97    35.45    37.85    37.18
Proportioned intensity
  BER_1                    0.00     0.00     0.00     0.00
  BER_2                    0.00     0.00     0.00     0.00
  PSNR (dB)                25.42    35.99    38.34    37.49
Table 4-2 Watermarking with fixed and proportioned intensities
One drawback of the watermark with fixed intensity is the very poor perceptual quality of the base layer, which contains the watermark and serves as the preview layer without watermark removal. This is also the main reason that we do not use the watermarking scheme with fixed watermark intensity.
4.2.3 The PSNR-based watermarking technique
Protecting scalable media with PSNR scalability is much more difficult than protecting scalable media with resolution scalability. In resolution-based scalable media, the coefficients of one layer are independent of the coefficients of the other layers. Thus, a change to the coefficients of one layer has no effect on the coefficients of any other layer.
For PSNR-based scalable media, as shown in Figure 4-10, each layer is a collection of bits derived from the coefficients of the whole image.
That is, one or several bits of a coefficient are collected to form a portion of a layer in