Layered Protection of Scalable Media - JPEG2000, Watermarking, and Layered Protection

Chapter 2 JPEG2000, Watermarking, and Layered Protection

2.3 Layered Protection of Scalable Media

With the widespread use of the Internet, digital data are easy to obtain, so a protection scheme is needed for intellectual property. Moreover, scalable media coding is becoming increasingly important because transmission bandwidth varies widely, for example in mobile communication. Scalable coding divides a multimedia stream into one base layer and several enhancement layers, and each enhancement layer can independently improve a certain aspect of the quality of the reconstructed image.

There are three categories of scalability: spatial scalability, temporal scalability, and signal-to-noise ratio (SNR) scalability. In spatial scalability, each layer has a different resolution, and the reconstructed image gains resolution as more enhancement layers are decoded. In temporal scalability, the enhancement layers increase the overall frame rate of a video. In SNR scalability, each layer contributes a different amount of quality improvement.

Traditionally, each layer of a scalable code stream is protected by cryptographic techniques, but the synchronization problem [21] between the encryption key and the encrypted content remains an issue. If the encryption key can be transmitted together with the encrypted content, the synchronization problem is resolved. A scheme that combines encryption and robust watermarking has therefore been proposed [22]. Before introducing this scheme, we describe a typical MPEG-2 conditional access receiver [23].

As shown in Figure 2-19, the digital content is protected by a “Control Word” (CW), which is obtained through a two-step decryption flow. To retrieve the CW, the Service Key (SK) must be decoded first, and decoding the SK requires both the User Key (UK) and the Entitlement Management Message (EMM). The User Key may be stored in a smart card, and the EMM can be used to change a user's access rights; for example, if a subscription is overdue, the broadcaster can disable the user's access by changing the EMM. After the SK is decoded, an Entitlement Control Message (ECM) is required to decode the CW. Because the derivation of the Control Word is complicated, the synchronization problem of cryptographic techniques is hard to handle. A new architecture was therefore proposed in [22] and is shown in Figure 2-20. In this scheme, the EMM and ECM are embedded into the content and transmitted together with it, which resolves the synchronization problem. Note that the ECM and EMM in Figure 2-19 correspond to the key function key() and the watermark W_i in Figure 2-20, and that UK, SK, and CW correspond to G_i, F_i, and K_i, respectively.

[Figure: a descrambler converts the scrambled bitstream into video/audio/data using the Control Word (CW); the decipherment of CW takes the Service Key (SK) and the ECM; the decipherment of SK takes the User Key (UK) and the EMM.]

Figure 2-19 A typical MPEG-2 conditional access receiver
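The two-step key recovery of Figure 2-19 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the XOR "cipher", the key values, and the modelling of the EMM as the SK protected by the UK and of the ECM as the CW protected by the SK are all hypothetical choices made for the example, not details taken from [23].

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher (XOR with a repeating key); a placeholder, not secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Broadcaster side (all values are made-up examples).
user_key = b"UK-smartcard"      # UK, e.g. stored in the smart card
service_key = b"SK-2024-05"     # SK
control_word = b"CW-000123"     # CW, used to scramble the content
content = b"video/audio/data payload"

emm = xor_cipher(service_key, user_key)        # assumed: EMM carries SK protected by UK
ecm = xor_cipher(control_word, service_key)    # assumed: ECM carries CW protected by SK
scrambled = xor_cipher(content, control_word)

# Receiver side: the two-step decryption flow.
recovered_sk = xor_cipher(emm, user_key)       # step 1: UK + EMM -> SK
recovered_cw = xor_cipher(ecm, recovered_sk)   # step 2: SK + ECM -> CW
descrambled = xor_cipher(scrambled, recovered_cw)

# A receiver holding an outdated service key cannot recover the control word.
stale_cw = xor_cipher(ecm, b"SK-stale-00")
```

In this sketch the receiver needs the EMM, the ECM and the scrambled content to arrive consistently with one another, which is the synchronization issue that the watermark-based architecture is meant to avoid.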

In Figure 2-20, X_i is the encrypted enhancement layer, which can be decrypted using the decryption key K_i, and E_i is the resulting enhancement layer used to form the i-th constructed base layer B_i. W_i is the watermark extracted from the constructed base layer B_{i-1} with the extraction parameter P_{i-1}, and F_i is the secret information, decrypted with the specific key G_i, from which the decryption key K_i and the extraction parameter P_i are obtained. In short, the quantities in Figure 2-20 are related as follows: W_i is extracted from B_{i-1} using P_{i-1}; F_i is recovered from W_i using G_i; K_i and P_i are obtained from F_i; X_i is decrypted with K_i to give E_i; and E_i is used to form B_i.

Figure 2-20 Decryption and decoding of layer-protected content


Chapter 3 Properties of Various Watermarking Methods

In this chapter, a few watermarking techniques are introduced in detail. We describe and analyze the characteristics of each technique and simulate them to examine their features. Finally, we choose the watermarking technique best suited to the layered protection of scalable media introduced in section 2.3.

3.1 Introduction to Watermarking Techniques

As described in section 2.2.5, watermarking techniques can be divided into two main groups, the correlation-based and the noncorrelation-based watermarking. Here we will introduce two examples in detail and describe their characteristics.

3.1.1 The correlation-based watermarking

Correlation-based watermarking is a simple way to embed a watermark. As shown in Figure 3-1, the watermark, scaled by a chosen intensity, is added to the host image.

Watermark detection computes the correlation between the watermark and the watermarked image. If the correlation value is below a threshold T, the image is declared un-watermarked; if the correlation value is higher than T, the image is declared watermarked.

To embed information into the host image, for example one bit of data, we can embed “0” by adding a negative watermark pattern to the host image and embed “1” by adding a positive watermark pattern. Then, if the correlation value is below –T, the embedded bit is “0”, and if the correlation value is above T, the embedded bit is “1”. To embed several bits into an image, we can divide the image into blocks and embed one bit in each block. Figure 3-2 shows an example of embedding “0110” into the host image. Note that the intensities (k0, k3) must be negative and (k1, k2) must be positive.

Figure 3-1 Embedding procedure of the correlation-based watermark

Figure 3-2 Embedding “0110” into the host image
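The block-wise embedding and detection described above can be sketched as follows. This is a minimal illustration in Python rather than the implementation used in this work: the one-dimensional toy host, the block size of 64, the strength of 8, the threshold T = 2 and all function names are assumptions chosen so that the example is easy to check.

```python
import random

def make_pattern(n, seed):
    """Balanced pseudorandom +/-1 pattern, reproducible from the seed."""
    rng = random.Random(seed)
    pattern = [1] * (n // 2) + [-1] * (n - n // 2)
    rng.shuffle(pattern)
    return pattern

def embed_bits(host, bits, pattern, strength):
    """One bit per block: add +strength*pattern for '1', -strength*pattern for '0'."""
    n = len(pattern)
    marked = list(host)
    for b, bit in enumerate(bits):
        sign = 1 if bit == 1 else -1
        for j in range(n):
            marked[b * n + j] += sign * strength * pattern[j]
    return marked

def correlate(block, pattern):
    """Normalized correlation between a block and the watermark pattern."""
    return sum(x * w for x, w in zip(block, pattern)) / len(pattern)

def detect_bits(marked, num_bits, pattern, threshold):
    """Decide each bit from the sign of the block correlation (None if undecided)."""
    n = len(pattern)
    decoded = []
    for b in range(num_bits):
        rho = correlate(marked[b * n:(b + 1) * n], pattern)
        if rho > threshold:
            decoded.append(1)
        elif rho < -threshold:
            decoded.append(0)
        else:
            decoded.append(None)
    return decoded

block_size = 64
bits = [0, 1, 1, 0]                       # the "0110" example of Figure 3-2
rng = random.Random(0)
# Toy host: a smooth region whose samples stay within [100, 110].
host = [rng.randint(100, 110) for _ in range(block_size * len(bits))]
pattern = make_pattern(block_size, seed=42)
marked = embed_bits(host, bits, pattern, strength=8)
decoded = detect_bits(marked, len(bits), pattern, threshold=2)
print(decoded)
```

Because the pattern is balanced and the toy host varies by at most 5 around its mid-value, the host contributes at most 5 to each block correlation while the watermark contributes exactly ±8, so every block correlation clears the threshold with the correct sign. Real images are far less smooth, which is why the strength and the block size have to be tuned in practice.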

Having introduced the basic concept of correlation-based watermarking, we now describe a watermarking scheme that incorporates a model of the human visual system (HVS) to improve the embedding efficiency [24]. In this scheme, the embedding position of each bit is decided first, and the HVS model is then used to set the watermark intensity. The details are discussed below.

Watermark Embedding

As shown in Figure 3-3, the image to be watermarked is first decomposed through the DWT in four levels: I_l^θ is the subband at resolution level l = 0, 1, 2, 3 with orientation θ ∈ {0, 1, 2, 3}. The watermark consists of a pseudorandom binary sequence (±1) and is embedded into the coefficients of the largest resolution, I_0^θ. The watermark can also be embedded into the coefficients of the other resolutions to achieve higher robustness, at the cost of a more visibly distorted watermarked image.

Figure 3-3 An image with four decomposition levels using DWT

In more detail, a 1-D pseudorandom binary sequence m_h is rearranged into a 2-D sequence x^θ(i, j) as in (3.1.1-1), where M and N are the dimensions of the largest-resolution subbands. The embedding is scaled by a weighting function that allows the watermark to match the masking characteristics of the HVS.

The HVS model was proposed by Lewis and Knowles [25] to improve the quality of image compression and was modified by Barni, Bartolini and Piva [24] to make it more effective in watermarking applications. The model is built on the following observed characteristics [24].

1. The eye is less sensitive to the noise in high resolution bands and the bands having orientation of 45° (i.e. θ= 1); that is, the eye is less sensitive to the noise with higher frequency.

The three terms used to decide the quantization step are described as follows.

The first term Θ(l,θ) is associated with the noise sensitivity depending on the band, that is, on the resolution level l and the orientation θ.

Since the eye is less sensitive to noise in areas of high or low brightness, and the coefficients of the lowest resolution in the DWT domain represent the brightness of a given area of the image, the second term Λ(l,i, j) takes into account the local brightness based on the gray-level values of the low-pass version of the image. In (3.1.1-5), L(l,i, j) denotes these gray-level values. This term reflects the consideration that the human eye is less sensitive to changes in very dark regions.

The third term accounts for the texture activity in the neighborhood of the pixel. In particular, it is the product of two contributions: the first is the local mean square value of the DWT coefficients in all higher subbands, and the second is the local variance of the low-pass subband. The first contribution relates to the distance from the edges, whereas the second represents the texture. The two are multiplied together to reflect the consideration that the eye is less sensitive to noise in highly textured areas and near the edges.

Watermark Detection

Watermark detection is accomplished without referring to the original image and is therefore called blind detection. The correlation ρ between the watermarked DWT coefficients and the watermarking sequence to be tested is computed by (3.1.1-8) and compared with a threshold T_ρ, chosen to guarantee a given probability of false positive detection. If ρ is below T_ρ, the image is declared un-watermarked; if ρ is above T_ρ, the image is declared watermarked.

3.1.2 The bitplane-based watermarking

Bitplane-based watermarking directly replaces the original bits at specific positions with the data to be embedded. As shown in Figure 3-4, there are eight coefficients with eight bits each, and in each coefficient one specific position (the colored area) is chosen for data embedding. Since the embedding mechanism is simple replacement, the embedded data can be extracted without any mathematical operation once the embedding positions are known.

Figure 3-4 Embedding data “11100111” in specific positions
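The replacement mechanism can be illustrated with a short Python sketch. The coefficient values and the embedding positions below are made up for the example (the actual positions of Figure 3-4 are not reproduced here), and the helper names are hypothetical.

```python
def embed_bit(coeff, position, bit):
    """Replace the bit at `position` (0 = LSB) of a non-negative coefficient."""
    return (coeff & ~(1 << position)) | (bit << position)

def extract_bit(coeff, position):
    """Read back the bit stored at `position`."""
    return (coeff >> position) & 1

coeffs = [200, 13, 97, 54, 255, 0, 128, 77]   # eight 8-bit coefficients (example values)
positions = [0, 3, 1, 4, 2, 0, 5, 1]          # one embedding position per coefficient
data = [1, 1, 1, 0, 0, 1, 1, 1]               # the data "11100111"

marked = [embed_bit(c, p, b) for c, p, b in zip(coeffs, positions, data)]
recovered = [extract_bit(c, p) for c, p in zip(marked, positions)]

# Each coefficient changes by either 0 or 2**position.
changes = [abs(m - c) for m, c in zip(marked, coeffs)]
```

Extraction is a direct read of the stored bits, but it requires the full list of positions, and the distortion per coefficient grows with the significance of the chosen bit.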

The embedding efficiency of this method depends on the choice of the embedding positions, so choosing them is a key issue. The most common choice is the LSB modification described in section 2.2.5, but this method is too fragile and the watermark can be destroyed easily. Another method was proposed by T.S. Chen, J. Chen and J.G. Chen, who embedded the data in the fifth bit of the coefficients and adjusted the other bits to reduce the quality loss [15]. Their method strikes a good balance between robustness and imperceptibility.

3.2 Analysis of Watermarking Techniques

In this section, we will discuss advantages and disadvantages of the watermarking techniques described above.

Side information

Since the extraction procedure of correlation-based watermarking only computes the correlation between the watermarked image and the watermark, the only side information needed for watermark extraction is the seed used to generate the watermark pattern.

For bitplane-based watermarking, the embedding positions are needed for watermark extraction. Since recording the embedding positions takes many extra bits, bitplane-based watermarking requires much more side information than correlation-based watermarking.

Data Capacity

Data capacity is the amount of data that can be hidden imperceptibly in an image. The correctness of correlation-based watermarking relies on the randomness of the pixels used for embedding: the more random the pixels are, the more accurate the extracted watermark will be. We therefore use more pixels to embed each bit so that the pixels are random enough, at the cost of a lower data capacity.

Since bitplane-based watermarking directly replaces the bits at the embedding positions with the data, its data capacity is theoretically equal to the number of pixels, which is much larger than that of the correlation-based method.

Robustness

Correlation-based watermarking is generally regarded as a very robust scheme because the watermark is tightly combined with the coefficients of the host image. Bitplane-based watermarking is more fragile: it can be destroyed simply by re-quantizing the coefficients with a different quantization step. If a coefficient is re-quantized, the embedded data can be found neither in the original positions nor in any other positions.
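The contrast in robustness can be reproduced on a toy example. The sketch below is an illustration under stated assumptions, not the experiment of this work: the host is a smooth block with samples in [100, 104], the re-quantization step is 4, the watermark strength is 8, and the data bits are stored in the LSB.

```python
import random

def requantize(x, step):
    """Round x to the nearest multiple of `step` (ties round up)."""
    return ((x + step // 2) // step) * step

n = 64
rng = random.Random(1)
host = [rng.randint(100, 104) for _ in range(n)]   # smooth toy block

# Correlation-based: add a balanced +/-1 pattern with strength 8 (bit "1").
pattern = [1] * (n // 2) + [-1] * (n // 2)
rng.shuffle(pattern)
corr_marked = [h + 8 * w for h, w in zip(host, pattern)]

# Bitplane-based: replace the LSB of every coefficient with a data bit.
data = [1, 0] * (n // 2)
bit_marked = [(h & ~1) | b for h, b in zip(host, data)]

# Attack: re-quantize both watermarked blocks with step 4.
step = 4
corr_attacked = [requantize(x, step) for x in corr_marked]
bit_attacked = [requantize(x, step) for x in bit_marked]

rho = sum(x * w for x, w in zip(corr_attacked, pattern)) / n
lsb_after = [x & 1 for x in bit_attacked]
bit_errors = sum(a != b for a, b in zip(data, lsb_after))
print(rho, bit_errors)
```

Re-quantization moves each sample by at most 2, so the correlation stays at 4 or more and the bit is still detected with a threshold of 2. Every re-quantized coefficient is a multiple of 4, so all LSBs become 0 and half of the 64 bitplane-embedded bits are wrong, which corresponds to a bit error rate of 0.5.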

Security

Security refers to how difficult it is for an unauthorized person to extract the embedded data. For correlation-based watermarking, once the seed used to generate the watermark pattern is known, the embedded data are no longer secret. For bitplane-based watermarking, the security depends on the embedding positions: if they are known to an unauthorized user, the embedded data can be extracted directly.

Since the embedding-position information is usually much larger than the seed used to generate the watermark pattern, bitplane-based watermarking is typically more secure. However, if the embedding positions are fixed in a specific bitplane to reduce the side information, as is done in some applications, bitplane-based watermarking becomes less secure.

Computational Complexity

Because the extraction procedure of correlation-based watermarking computes the correlation between the watermark pattern and the watermarked image, its computational complexity is much higher than that of bitplane-based watermarking, whose extraction procedure only reads the embedded data directly from the embedding positions. This drawback also makes correlation-based watermarking hard to implement in a real-time system.

Removability

Although an imperceptible watermark is acceptable in most applications, some applications, such as the storage of medical images, cannot tolerate any distortion. In these applications, a removable watermarking scheme is necessary.

As shown in Figure 3-5, a watermark embedded by correlation-based watermarking is removed simply by subtracting the watermark pattern from the watermarked image. In contrast, because the embedding procedure of bitplane-based watermarking is replacement, the watermark cannot be removed without additional side information.

Figure 3-5 Procedure of removing the watermark from the watermarked image
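A small Python example makes the difference concrete. The values are made up, and the sketch assumes that no clipping or rounding is applied after the additive embedding.

```python
host = [120, 64, 200, 33]
pattern = [1, -1, -1, 1]      # watermark pattern (regenerated from the seed in practice)
strength = 3

# Additive (correlation-based) embedding is undone by subtraction.
marked = [h + strength * w for h, w in zip(host, pattern)]
restored = [m - strength * w for m, w in zip(marked, pattern)]

def replace_lsb(coeff, bit):
    """Bitplane-style embedding: overwrite the least significant bit."""
    return (coeff & ~1) | bit

# Two different host values give the same watermarked value, so the
# original bit cannot be recovered from the watermarked coefficient alone.
from_12 = replace_lsb(12, 1)
from_13 = replace_lsb(13, 1)
```

Here `restored` equals `host` exactly, whereas `from_12` and `from_13` are both 13: the replaced bit would have to be stored as side information to undo the bitplane embedding.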

3.3 Simulations

In this section, we run a simple experiment to verify some features of the watermarking techniques discussed above. The watermarks are embedded in commonly used test images, namely baboon, fruit, lena and peppers, which are shown in Figure 3-6.

Figure 3-6 Test images

Before listing the simulation results, we introduce the embedding algorithm.

First, the host image is decomposed into four resolutions as shown in Figure 3-3. Areas of the three higher resolutions are then chosen for embedding watermarks using the correlation-based and the bitplane-based watermarking techniques.

In our simulations, 160 bits are embedded in the I_0 area, 128 bits in the I_1 area and 64 bits in the I_2 area. The watermark strengths are tuned to make the embedding imperceptible. The effect of image compression is also considered: watermark extraction is also performed on the watermarked image after it has passed through a JPEG2000 encoder with a compression rate of 20. The simulation is run 100 times with different watermarks, and the mean values of the statistics are shown in Tables 3-1 and 3-2. The bit error rate (BER) is the ratio of the number of extracted bits that differ from the embedded bits to the total number of embedded bits. A BER close to 0.5 means that the watermark is almost completely destroyed.
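The two statistics reported in the tables can be computed as sketched below. The BER follows the definition given above; the PSNR is the usual peak signal-to-noise ratio, and the peak value of 255 assumes 8-bit images, which is an assumption of this sketch. The function names and sample values are illustrative only.

```python
import math

def bit_error_rate(embedded, extracted):
    """Fraction of extracted bits that differ from the embedded bits."""
    if len(embedded) != len(extracted):
        raise ValueError("bit sequences must have the same length")
    errors = sum(a != b for a, b in zip(embedded, extracted))
    return errors / len(embedded)

def psnr(original, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)

ber = bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0])        # 2 of 4 bits differ
psnr_value = psnr([10, 20, 30, 40], [11, 19, 31, 39])   # every pixel off by 1
print(ber, round(psnr_value, 2))
```

For a random guess, about half of the bits agree by chance, which is why a BER near 0.5 rather than 1.0 indicates a destroyed watermark.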

In Tables 3-1 and 3-2, bitplane-based watermarking performs much better before the image passes through the JPEG2000 encoder, but its performance drops substantially after JPEG2000 coding. Thus, bitplane-based watermarking has a much larger data capacity but is much more fragile under image compression than correlation-based watermarking. The simulation results match our predictions in section 3.2.

Metric          baboon      fruit       lena        peppers

Before JPEG2000 compression
BER_0           0.000000    0.000000    0.000000    0.000000
BER_1           0.021250    0.000000    0.000000    0.000000
BER_2           0.059531    0.040156    0.017656    0.003594
PSNR (dB)       40.056029   41.379366   41.076705   40.336315

After JPEG2000 compression
BER_0           0.018250    0.000000    0.000000    0.021562
BER_1           0.102188    0.041016    0.003281    0.016328
BER_2           0.090938    0.058906    0.024531    0.021250
PSNR (dB)       25.3193     35.1496     36.6823     36.0448

Table 3-1 Results of correlation-based watermarking

Metric          baboon      fruit       lena        peppers

Before JPEG2000 compression
BER_0           0.000000    0.000000    0.000000    0.000000
BER_1           0.000000    0.000000    0.000000    0.000000
BER_2           0.000000    0.000000    0.000000    0.000000
PSNR (dB)       47.121854   47.430821   47.490265   47.477924

After JPEG2000 compression
BER_0           0.487031    0.507734    0.487031    0.499531
BER_1           0.507734    0.499531    0.487031    0.487031
BER_2           0.073750    0.062812    0.002188    0.000625
PSNR (dB)       25.3914     35.9127     38.0789     37.5366

Table 3-2 Results of bitplane-based watermarking

3.4 Choice of Watermarking Techniques

Since our scalable protection scheme is based on the JPEG2000 standard, this section discusses the compatibility of the watermarking techniques with the standard. Although each watermarking technique has its own merits, not all of them suit JPEG2000, and some of their advantages may disappear because of the characteristics of the standard. Some other implementation considerations are also discussed in this section.

Figure 3-7 Choosing embedding positions before R-D Optimization

In the JPEG2000 standard, each bitplane is coded in three coding passes, and the coded bit-stream can be rearranged to achieve different types of scalability. That is, bits in the same bitplane may belong to different coding layers.

Furthermore, since the tier-2 coding truncates the coded bit-stream to reach the demanded compression rate or compressed quality, the bits to be truncated in a bitplane are unknown until the rate-distortion optimization is done. Thus, which bits will be preserved and which will be truncated is unknown at the tier-1 coding stage. For this reason, choosing the embedding positions for bitplane-based watermarking before entropy coding, without perceptually affecting the image quality, is a difficult problem. If the embedding positions are chosen before the tier-2 coding, as shown in Figure 3-7, the embedded information may be truncated by the rate-distortion optimization process. If they are chosen after the tier-2 coding, as shown in Figure 3-8, the truncation information of the coded bit-stream is sent to the watermark embedder and the embedding positions are then decided. After the watermark is embedded, the entropy coding is redone. The rate-distortion optimization is also redone, so the truncation points may change. Some embedded bits may therefore be truncated again, and the correctness of the watermarking positions is not assured.

[Figure: block diagram with the blocks Host Image, Tier-1 Coding, Tier-2 Coding (R-D Optimization and Bit-stream Truncation), Truncation Information, Deciding the Embedding Positions, Watermark Embedding, and Watermarked Image.]

Figure 3-8 Choosing embedding positions after R-D Optimization
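The difficulty of fixing embedding positions before truncation is known can be illustrated with a toy model. In the Python sketch below, truncation is simplified to discarding a number of the lowest bitplanes per code-block, whereas actual JPEG2000 truncation points fall at coding-pass boundaries; the block names, coefficient values and truncation depths are all hypothetical.

```python
def embed_bit(coeff, position, bit):
    """Replace the bit at `position` (0 = LSB) of a non-negative coefficient."""
    return (coeff & ~(1 << position)) | (bit << position)

def truncate(coeff, dropped_planes):
    """Toy truncation: discard the lowest `dropped_planes` bitplanes."""
    return (coeff >> dropped_planes) << dropped_planes

position = 2                                  # embedding position fixed in advance
blocks = {"A": [45, 102, 77, 200], "B": [33, 90, 150, 61]}
data = {"A": [1, 0, 0, 1], "B": [1, 0, 1, 1]}

marked = {
    name: [embed_bit(c, position, b) for c, b in zip(blocks[name], data[name])]
    for name in blocks
}

# Truncation depths become known only after rate-distortion optimization.
dropped = {"A": 1, "B": 3}
received = {name: [truncate(c, dropped[name]) for c in marked[name]] for name in marked}
recovered = {name: [(c >> position) & 1 for c in received[name]] for name in received}
print(recovered)
```

Block A loses only its lowest bitplane, so the data at position 2 survive. Block B loses three bitplanes, so every extracted bit is 0 regardless of what was embedded. The embedder cannot tell the two cases apart at the tier-1 stage.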

Based on the above discussion, bitplane-based watermarking has the following disadvantages: high implementation complexity, low robustness (correctness), and a large amount of side information. The high implementation complexity is caused by redoing the tier-1 and tier-2 coding. The low robustness is due to the uncertainty of data truncation, which may lead to wrong extraction results even without any attack. The large amount of side information is due to the embedding positions. Because of the low correct detection rate, some capacity must be sacrificed to enhance the robustness through error correction coding (ECC). Thus, the only remaining advantage is the simplicity of watermark extraction, which the method already had before being combined with the JPEG2000 standard.

For correlation-based watermarking, the major parameter is the watermark strength, which determines the robustness of the watermark. The embedding process is very simple, while the extraction process is more complex. Compared with bitplane-based watermarking, correlation-based watermarking has the following advantages: relatively low implementation complexity, higher robustness, and a small amount of side information. Its complexity comes mainly from computing the correlation in the watermark extraction stage, which is much simpler than the tier-1 and tier-2 coding, so its implementation complexity is relatively low. Because the truncation of the coded bit-stream in the tier-2 coding does not seriously affect the extraction results, a watermark of sufficient intensity should provide adequate robustness against attacks. The side information is small since no embedding positions need to be recorded.

Since correlation-based watermarking appears to offer more advantages than bitplane-based watermarking for our purpose, it is chosen as the watermarking method in the protection scheme of scalable media.
