Thesis Organization - A New Study on Information Hiding in JPEG XR Images

Chapter 1 Introduction

1.5 Thesis Organization

In the remainder of this thesis, Chapter 2 reviews related work on data hiding via images, image authentication, and visible watermarking of images, and gives a brief introduction to the JPEG XR standard. Chapter 3 describes the proposed method for data hiding for covert communication via JPEG XR images. Chapter 4 describes the proposed method for JPEG XR image authentication. Chapter 5 describes the proposed method for removable visible watermarking of JPEG XR images. Finally, Chapter 6 gives conclusions and some suggestions for future work.

Chapter 2 Review of Related Works and JPEG XR Standard

Many types of multimedia are transmitted over networks, and many data hiding techniques have been proposed for multimedia copyright protection or covert communication based on the properties of the individual multimedia formats. These techniques are reviewed in this chapter. Because the data hiding techniques and applications proposed in this study are implemented on the JPEG XR format, a brief introduction to the JPEG XR standard is also given in this chapter.

2.1 Review of Techniques for Image Data Hiding

Many data hiding methods have been proposed [2-13] in the last decade. Wu and Tsai [2] proposed a steganographic method for images based on pixel-value differencing, in which the difference between two consecutive pixel values is changed to embed secret data. A small difference between two pixels indicates a smooth area, and a large difference indicates a sharp one. Because human vision is less sensitive to changes in sharp areas, they embedded more secret data in sharp areas than in smooth ones to yield less perceptible distortion and higher image quality. Lee and Tsai [3] proposed a lossless data hiding method by histogram shifting based on an adaptive block division scheme. Their experimental data showed that the method embeds more secret data and yields higher PSNR values than other histogram shifting methods. Huang and Tsai [4] proposed a new data hiding method via H.264/AVC videos based on the use of optional intra-prediction modes in the H.264/AVC format.

Chang et al. [5] proposed a steganographic method via the JPEG image based upon quantization table modification. Jain and Gupta [6] proposed a JPEG compression resistant steganography scheme for raster graphics images. Barton [7] compressed the secret message before embedding it into the bit stream of digital data. Celik et al. [8] proposed a lossless data hiding method which quantizes each image pixel into a number of scales, compresses the quantization residues, and embeds the secret bits as well as the compressed data into the quantized image by the least-significant-bit (LSB) substitution technique. Tian [9] proposed a technique of pixel-value difference expansion by performing fundamental arithmetic operations on pairs of pixels to discover hidable space. Ni et al. [10] proposed a reversible data hiding method which shifts the part of the histogram between the maximum point (also called the peak point) and the minimum point to the right by one pixel value to create an empty bin beside the maximum point for hiding an input message. Fallahpour and Sedaaghi [11] proposed the idea of decomposing the entire cover image into blocks and using the peak point of the histogram of each block to hide data.
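The histogram shifting idea attributed to Ni et al. [10] above can be illustrated with a minimal Python sketch. It is a simplified illustration and not the implementation of [10]: the function names `hs_embed` and `hs_extract` are hypothetical, the image is assumed to be a flat list of 8-bit values, and the minimum point is assumed to be a truly empty bin to the right of the peak, so that no overhead bookkeeping is needed.

```python
from collections import Counter

def hs_embed(pixels, bits):
    """Embed bits into a flat list of 8-bit pixel values by histogram shifting."""
    hist = Counter(pixels)
    # Peak point: the most frequent pixel value.
    peak = max(range(256), key=lambda v: hist[v])
    # Assumption: use the first empty bin to the right of the peak.
    empty = [v for v in range(peak + 1, 256) if hist[v] == 0]
    if not empty:
        raise ValueError("no empty bin to the right of the peak")
    zero = empty[0]
    if len(bits) > hist[peak]:
        raise ValueError("message longer than the embedding capacity")
    payload = iter(bits)
    stego = []
    for p in pixels:
        if peak < p < zero:
            stego.append(p + 1)                 # shift right, freeing bin peak + 1
        elif p == peak:
            stego.append(p + next(payload, 0))  # bit 1 -> peak + 1, bit 0 -> peak
        else:
            stego.append(p)
    return stego, peak, zero

def hs_extract(stego, peak, zero):
    """Recover the embedded bits and the original pixel values."""
    bits, cover = [], []
    for p in stego:
        if p == peak:
            bits.append(0)
            cover.append(peak)
        elif p == peak + 1:
            bits.append(1)
            cover.append(peak)
        elif peak + 1 < p <= zero:
            cover.append(p - 1)                 # undo the shift
        else:
            cover.append(p)
    return bits, cover
```

The extractor returns one bit for every pixel at the peak value, so a receiver that knows the message length keeps only the leading bits. The peak and empty-bin values must be shared with the receiver as side information.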

2.2 Review of Techniques for Image Authentication

Authentication is a data hiding application for verifying the integrity and fidelity of an image. Many authentication methods have been proposed [14-16] in recent years. Yang and Tsai [14] proposed a block-based authentication method via PNG images by adjusting selected values in the spatial domain. They adjusted the sum of the coefficients of each 3×3 block to a multiple of a pre-selected value as an authentication signal, and checked the authentication signal by computing the remainder of the sum divided by the pre-selected value. If the remainder is not zero, the image is judged to have been tampered with. Huang and Tsai [15] proposed a block-based authentication method for grayscale images by embedding invisible authentication signals in them according to the human visual model. The standard deviation of each 3×3 block is used to assign the block one of four quantization values, ranging from smooth areas to edge ones. The grayscale range is then partitioned into multiple levels according to the quantization value of each block. Let L be the level of the grayscale range which includes the value of the central pixel, and let g_min be its lower bound. Finally, the authentication signal is embedded by replacing the value of the central pixel of each 3×3 block with g_min + γ, where γ is a pre-selected constant. Lee and Tsai [16] proposed a new authentication method for software programs based on the use of invisible ASCII control codes as authentication signals.

2.3 Review of Techniques for Visible Watermarking in Images

Watermarking is widely used for copyright protection, and a large number of watermarking methods have been proposed [17-19]. Chiu and Tsai [17] proposed a method for copyright protection by watermarking of color images against print-and-scan operations using coding and synchronization of peak locations in the discrete Fourier transform domain. First, the cover image is scaled to a square one, and then a new coordinate system based on radii and angles is set up in the frequency domain. The positions P(R_i, θ_j) and their symmetric positions Q(R_i, θ_j) are used to embed the watermark, where R_1 ≤ R_i ≤ R_2 and 0° ≤ θ_j ≤ 180° (R_1 and R_2 are pre-selected radii). Because these positions are located in the middle frequency band, a threshold for watermark extraction can be selected. In other words, to embed the value 1, the values at a position P(R_i, θ_j) and at its symmetric position are replaced with a number larger than the threshold. In addition, they selected a synchronization peak P_sync(R_sync, θ_sync) for protection against rotation and scaling attacks, where R_2 ≤ R_sync and θ_sync is a pre-selected angle value. Liu and Tsai [18] proposed a new method for generic lossless visible color watermarking based on reversible one-to-one compound mapping. Chen and Tsai [19] proposed a method for copyright protection of palette images by robust lossless visible watermarking based on the use of color palette tables.

2.4 Review of JPEG XR Standard

In this study, all the data hiding techniques are implemented on JPEG XR images.

The detailed JPEG XR specification is described in the ISO/IEC 29199-2 document [1]. We will give a brief review of the JPEG XR standard in this section. In Section 2.4.1, the structure of the JPEG XR standard will be described. The encoding and decoding process will be described later in Section 2.4.2 and Section 2.4.3, respectively.

2.4.1 Structure of JPEG XR standard

An image is composed of a primary image plane and an optional alpha image plane according to the JPEG XR standard, as seen in Figure 2.1. The primary image plane may have multiple image channels. The first channel is defined to be a luma component. Other channels are defined to be the chroma components. The alpha image plane contains exactly one channel which controls the weight of primary image

Figure 2.1 The structure of image planes. (a) Primary image plane. (b) Alpha image plane.

Every channel is composed of four bands in the frequency domain: the DC, lowpass, highpass, and flexbits bands. The DC and lowpass bands carry low-frequency information, and the highpass band carries high-frequency information. The flexbits band carries information regarding the low-order bits of the highpass coefficients.

The JPEG XR standard defines a hierarchical structure which includes the image, tile, macroblock, and block levels. The basic unit in the hierarchy is a block, which is an area of 4×4 pixels. A macroblock, the most important unit in the JPEG XR standard, is a 16×16 area consisting of 16 blocks. All coefficient conversion operations, such as the transformation of coefficients from the spatial domain into the frequency domain, are carried out in macroblocks. A tile is one of the rectangular arrays of macroblocks into which the image is partitioned. Every tile is an independent part in the JPEG XR standard; tiles do not influence each other in the coding process. The hierarchy structure of the JPEG XR image is shown in Figure 2.2.

[Figure 2.2 labels: Tiles, Macroblock, Blocks, Internal image, Output image]

Figure 2.2 The hierarchy structure of JPEG XR images. The size of internal image is equal to that of the original image. The width and height are multiples of 16.

Because the macroblock is so important in the JPEG XR standard, we give a brief review of it here. In the frequency domain, a single macroblock contains 256 transform coefficients. The top-left coefficient of a macroblock is called the DC coefficient. The top-left coefficient of each block, if it is not the DC coefficient, is called a lowpass coefficient. The remaining coefficients are called highpass coefficients. A macroblock thus contains one DC coefficient, 15 lowpass coefficients, and 240 highpass coefficients. The components of a macroblock are shown in Figure 2.3.
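The coefficient counts stated above can be checked with a short Python sketch. It assumes only the positional rule given in the text (the top-left coefficient of the macroblock is DC, and the top-left coefficient of every other 4×4 block is lowpass); the function name `classify_coefficient` is hypothetical and is not part of the standard.

```python
from collections import Counter

MB_SIZE = 16     # a macroblock covers 16x16 coefficients
BLOCK_SIZE = 4   # a block covers 4x4 coefficients

def classify_coefficient(row, col):
    """Return the band name of the coefficient at (row, col) in a macroblock."""
    if row == 0 and col == 0:
        return "DC"                      # top-left coefficient of the macroblock
    if row % BLOCK_SIZE == 0 and col % BLOCK_SIZE == 0:
        return "lowpass"                 # top-left coefficient of another block
    return "highpass"                    # every remaining coefficient

def band_counts():
    """Count the coefficients of each band over one macroblock."""
    return Counter(
        classify_coefficient(r, c)
        for r in range(MB_SIZE)
        for c in range(MB_SIZE)
    )
```

Calling `band_counts()` tallies the three kinds of coefficients over the 256 positions of one macroblock.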

[Figure 2.3 labels: Macroblock, Highpass, Lowpass, DC]

Figure 2.3 The components of a macroblock.

The JPEG XR standard defines two codestream modes: the spatial mode and the frequency mode. In both modes, the metadata and information of a JPEG XR image are coded and put in an image header. An index_table following the header indicates the start position of every tile packet, and the index_table is followed by a sequence of tile-packet bitstreams. Every tile packet carries the information of one tile.

In the spatial mode, the coding order of the macroblocks in a tile is a raster-scan order from the left to the right and from the top to the bottom. The bitstreams of all macroblocks are combined together.

In the frequency mode, the codestream of each tile is composed of four packets: the DC, lowpass, highpass, and flexbits packets. Each packet carries the coefficients of one frequency band of that tile. The DC packet carries the information of the DC band of each macroblock, in a raster-scan order. The lowpass packet carries the information of the lowpass band of each macroblock. The highpass packet carries the information of the highpass band of each macroblock. Finally, the flexbits packet carries the information of the flexbits band of each macroblock. The structure of the codestream according to the JPEG XR standard is shown in Figure 2.4.

[Figure 2.4 labels: IMG_HDR, INDEXTBL, TILE1, TILE2; spatial mode: MB_1, MB_2, MB_3; frequency mode: DC, LOWPASS, HIGHPASS, FLEXBITS]

Figure 2.4 The codestream modes of the JPEG XR standard: the spatial mode and the frequency mode.

2.4.2 Process of encoding of JPEG XR file

In this section, we give a simple description of the encoding process of a JPEG XR image. The steps of encoding are as follows.

i. Pre-scaling.

The pre-scaling step is usually used when an input data item is larger than 27 or 24 bits. In this case, the input data are right-shifted by m bits to reduce them to 27 or 24 bits or fewer. The 24-bit limit applies when the input data item is unscaled, and the 27-bit limit applies when it is scaled.

The color conversion step is used to convert OUTPUT_COLOR_FORMAT to INTERNAL_COLOR_FORMAT. In this study, we focus on the color conversion of the RGB model to the YUV model. The function for converting the RGB model to the YUV model is shown as follows:

The size of the image is not always a multiple of 16. When an image width or height is not a multiple of 16, a macroblock alignment and padding step is conducted to extend the right column and bottom row of the image to the nearest higher multiple of 16. Then, the encoder pads the aligned width with horizontal samples, and pads the aligned height with vertical samples.
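The alignment and padding step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure of the standard: the image is assumed to be a list of rows, the function names `aligned_size` and `pad_to_macroblocks` are hypothetical, and replicating the last column and the last row is assumed as the padding rule, since the text above does not specify which sample values are used.

```python
MB_SIZE = 16  # macroblock width and height in pixels

def aligned_size(n, unit=MB_SIZE):
    """Round n up to the nearest multiple of unit (n itself if already aligned)."""
    return (n + unit - 1) // unit * unit

def pad_to_macroblocks(image):
    """Pad a list-of-rows image so that both dimensions are multiples of 16.

    Assumption: padding replicates the last column, then the last row.
    """
    height, width = len(image), len(image[0])
    new_width, new_height = aligned_size(width), aligned_size(height)
    # Extend every row to the aligned width with its right-most sample.
    rows = [row + [row[-1]] * (new_width - width) for row in image]
    # Extend the image to the aligned height with copies of the bottom row.
    rows += [list(rows[-1]) for _ in range(new_height - height)]
    return rows
```

For example, a 17-pixel-wide image would be extended to a width of 32, while a 16-pixel-wide image would be left unchanged.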

The encoder uses a two-level structure to transform the spatial domain into the frequency domain. The transformation process is shown as follows.

(1) The level-one transformation is applied to all coefficients of 4×4 blocks of an image, which includes the following operations:

− outer pre-filtering;

− outer FCT.

(2) After the level-one transformation, the DC coefficients of 4×4 blocks are grouped together as new input data for the level-two transformation, which includes the following operations:

− inner pre-filtering;

− inner FCT.

The pre-filtering operation, which is used to smooth coefficients, is optionally applied to 4×4 areas evenly straddling blocks in two dimensions. On the image boundary, it is applied to 4×2 or 2×4 areas, and it is not applied to the four 2×2 corners.

The FCT operation is applied to 4×4 blocks to transform them from the spatial domain into the frequency domain; its function is similar to that of the DCT operation. The quantization step quantizes the transform coefficients to integer values by division and rounding.
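The division-and-rounding idea behind the quantization step can be sketched in Python as follows. This is a generic scalar quantizer given only for illustration: the function names `quantize` and `dequantize` are hypothetical, the divisor `step` is assumed to be a positive integer, and the exact scaling and rounding rules of the JPEG XR standard are not reproduced here.

```python
def quantize(coefficient, step):
    """Quantize an integer coefficient by division and rounding to nearest."""
    if step <= 0:
        raise ValueError("step must be a positive integer")
    magnitude = (abs(coefficient) + step // 2) // step  # round half up on |c|
    return magnitude if coefficient >= 0 else -magnitude

def dequantize(level, step):
    """Scale a quantized level back to the coefficient range."""
    return level * step
```

A larger step maps more coefficients to the same level, which reduces the data volume at the cost of a larger reconstruction error.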

Finally, the encoder converts the transform coefficients to a codestream. A flowchart of the JPEG XR encoding process is shown in Figure 2.5.

Figure 2.5 Block diagram of JPEG XR format encoding process.

2.4.3 Process of decoding of JPEG XR file

The JPEG XR decoder consists of two major parts: the parsing process and the decoding process, described as follows.

(1) The parsing process consists of steps described as follows:

i. image layer and tile layer codestream parsing;

ii. macroblock layer codestream parsing which includes parsing the transform coefficients and inverse scanning; and

iii. adaptation of VLC table selection and context models.

(2) The decoding process consists of steps described as follows:

i. coefficient remapping;

structures such as image header and compressed data of the frequency domain. These data will be inversely transformed into the original coefficients of the frequency domain by inverse scanning and the VLC table selection.

After the parsing process, the decoder re-maps the original coefficients to their correct positions in the image. The transform coefficients may then be predicted from the neighboring coefficients in the coefficient prediction process, and are scaled by the quantization parameter in the de-quantization process.

The decoder performs a two-level inverse transform from the frequency domain to the spatial domain. The inverse transform process consists of the following steps.

The coefficients of DC and lowpass bands are grouped into a DC_LP array as input data for the first-level transformation including the following steps:

− first-level inverse transform;

− when indicated, a first-level overlap filtering.

The resulting coefficients of the first-level transformation are combined with the highpass coefficients into the new input data for the second-level transformation including the following steps:

− second-level inverse transform;

− when indicated, a second-level overlap filtering.

An inverse core transform (ICT), which is the inverse of the FCT, is applied to 4×4 blocks. The ICT operation transforms the blocks from the frequency domain into the spatial domain; its function is similar to that of the IDCT operation.

The overlap filtering is the inverse of the pre-filtering, and is also optionally applied to 4×4 areas evenly straddling blocks in two dimensions. On the image boundary, it is applied to 4×2 or 2×4 areas, and it is not applied to the four 2×2 corners.

Finally, the decoder converts the coefficients into OUTPUT_COLOR_FORMAT for image display. The flowchart of the JPEG XR decoding process is shown in Figure 2.6. In this study, we focus on the color conversion of the YUV model to the RGB model. The function for converting the YUV model to the RGB model is shown as follows:

\[
\begin{aligned}
t &= -U, \\
G &= Y - \lfloor t/2 \rfloor, \\
R &= t + G - \lceil V/2 \rceil, \\
B &= V + R.
\end{aligned}
\tag{2.2}
\]
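The YUV-to-RGB conversion of Eq. (2.2) can be sketched in Python as follows. The sketch assumes that the conversion has the integer lifting form t = −U, G = Y − ⌊t/2⌋, R = t + G − ⌈V/2⌉, B = V + R; the floor and ceiling halvings in particular are an assumption of this sketch. The forward function `rgb_to_yuv` is derived here only as the algebraic inverse of these steps, and both function names are hypothetical.

```python
def yuv_to_rgb(y, u, v):
    """Integer YUV -> RGB conversion (assumed lifting form of Eq. (2.2))."""
    t = -u
    g = y - (t >> 1)              # >> 1 is floor(t / 2), also for negative t
    r = t + g - ((v + 1) >> 1)    # (v + 1) >> 1 is ceil(v / 2)
    b = v + r
    return r, g, b

def rgb_to_yuv(r, g, b):
    """Forward conversion, obtained by undoing the steps above in reverse order."""
    v = b - r
    t = r - g + ((v + 1) >> 1)
    y = g + (t >> 1)
    u = -t
    return y, u, v
```

Because every step adds or subtracts an integer that can be recomputed exactly on the other side, the pair of functions is lossless: converting RGB to YUV and back returns the original integer values.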

Figure 2.6 Block diagram of JPEG XR format decoding process.

Chapter 3 Covert Communication via JPEG XR Images by Variable Macroblock Quantization

3.1 Introduction

Because of the rapid expansion of the Internet, messages transmitted through it can no longer be assumed to be confidential. Private messages may be subject to sniffing attacks by malicious people. A common solution is to use cryptography to protect the secret message. However, cryptography hardly conceals the act of secret communication, because the resulting encrypted message is usually noise-like and often arouses the suspicion of attackers. For the purpose of data confidentiality and covert communication, we propose a method to hide secret data via JPEG XR images.

In this chapter, we describe the proposed data hiding technique via JPEG XR images. The major idea of the proposed method is described in Section 3.1.2. The detailed data hiding and extraction processes are given in Sections 3.2.2 and 3.2.3, respectively. In addition, some security enhancement measures for the proposed method are proposed in Section 3.3.2, and some experimental results are shown in Section 3.4. Finally, a brief summary of this chapter is given in Section 3.5.

3.1.1 Problem definition

JPEG XR images have become more and more popular, but according to a survey conducted in this study, no research on data hiding via them has been reported yet. Therefore, it is desired to develop a data hiding technique via JPEG XR images. The goal is to embed a secret message into a cover image without causing noticeable distortion. It is hoped that when the stego-image is transmitted to a receiver, other people will regard the transmission as an ordinary activity of image sharing rather than as secret communication. Only the receiver can extract the secret message from the stego-image with a secret key. Therefore, the aim of data hiding in JPEG XR images is to design a method for embedding data imperceptibly and extracting the embedded data correctly. In addition, even if a person knows the algorithms of the method, he or she still cannot extract the secret data without the secret key.

3.1.2 Major idea of proposed method

The proposed method is essentially based on the utilization of certain special characteristics of the JPEG XR format in the quantization of the FCT (forward core transform) coefficients in the frequency domain. In this respect, the quantization process of the JPEG XR standard is different from that of the old JPEG standard. In the latter, a fixed 8×8 quantization table is used to quantize the DCT coefficients of every 8×8 block. In contrast, the new JPEG XR standard supports variable quantization parameters for image quantization.

More specifically, before an image is compressed with tiles as units according to the JPEG XR standard, the following three sets of quantization parameters must be provided by the system for each color channel, called a component, of each tile for the purpose of quantizing the FCT coefficients yielded by the compression process:

(1) a DC quantization parameter;

(2) a lowpass quantization parameter set SL;

(3) a highpass quantization parameter set SH.

After these parameter sets are used to quantize the FCT coefficients, four quantization value for smooth areas and a small one for sharp areas will reduce data volumes and improve image qualities.

Additionally, each of the highpass and lowpass parameter sets SH and SL is allowed to include at most 16 parameters, each of which is user-selectable for quantizing the FCT coefficients according to various application requirements. Because choosing one out of 16 parameters distinguishes 16 = 2^4 cases, the variable use of these parameters can encode 4 message bits for the purpose of data hiding. This is the idea of using variable quantization in JPEG XR image compression that we propose for data hiding in this study.
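The correspondence between 16 selectable parameters and 4 message bits can be illustrated with a minimal Python sketch. It is not the embedding procedure of the proposed method, whose details and secret-key handling are given later; it only assumes that one parameter out of a 16-entry set is chosen per macroblock and that the index of the chosen parameter carries the bits. The function names and the example parameter values are hypothetical.

```python
BITS_PER_CHOICE = 4  # 16 selectable parameters = 2**4 distinguishable choices

def bits_to_indices(bits):
    """Group message bits by four (most significant bit first) into indices 0..15."""
    if len(bits) % BITS_PER_CHOICE != 0:
        raise ValueError("message length must be a multiple of 4 bits")
    indices = []
    for i in range(0, len(bits), BITS_PER_CHOICE):
        value = 0
        for bit in bits[i:i + BITS_PER_CHOICE]:
            value = (value << 1) | bit
        indices.append(value)
    return indices

def indices_to_bits(indices):
    """Recover the message bits from the parameter indices."""
    return [
        (index >> shift) & 1
        for index in indices
        for shift in range(BITS_PER_CHOICE - 1, -1, -1)
    ]

def choose_parameters(bits, parameter_set):
    """Pick one quantization parameter per macroblock according to the message."""
    if len(parameter_set) != 16:
        raise ValueError("the sketch assumes a full set of 16 parameters")
    return [parameter_set[index] for index in bits_to_indices(bits)]
```

In this sketch, a receiver who knows the same 16-entry parameter set can look up the index of the parameter used in each macroblock and convert the indices back into message bits with `indices_to_bits`.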

Furthermore, for the purpose of reducing distortion, only the highpass band is used in the proposed data hiding method. This way provides an advantage which is
