CHAPTER 4 THE PROPOSED METHODS

4.2 SHAPE-DIRECTION-ADAPTIVE DWT (SDA-DWT)

4.2.3 The importance of shape-adaptive and direction-adaptive functionalities in

For object-based image compression, visual object images are usually not rectangular, and their heights and widths are usually not powers of 2. If the convolution-based DWT is used for object image compression, then two techniques are often adopted in

processed by the convolution-based DWT, and the other method is to rearrange the object pixels so that the convolution-based DWT can be applied. For example, we can pad the object in Figure 4.3 with 6 zeros (or other predefined values) to form a 4×4 image, or we can rearrange the object into a 1-D signal with 10 discrete samples. Clearly, neither method is efficient. Therefore, SA-DWT and LSA-DWT were designed for object-based image compression.
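
The two workarounds can be sketched as follows. This is only an illustration: the mask below is a hypothetical 10-pixel shape, not the object of Figure 4.3, and the gray levels are made up.

```python
import numpy as np

# Hypothetical 4x4 shape mask with 10 object pixels (1) and 6 background
# pixels (0). The actual shape in Figure 4.3 is not reproduced here.
mask = np.array([
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
], dtype=bool)

# Hypothetical gray levels for the 4x4 bounding box of the object.
pixels = np.arange(100, 116).reshape(4, 4)

# Option 1: pad the background with zeros so a rectangular DWT can be applied.
padded = np.where(mask, pixels, 0)

# Option 2: rearrange the object pixels into a 1-D signal (row-major order).
signal_1d = pixels[mask]

num_padded = int(mask.size - mask.sum())
print(num_padded, signal_1d.size)
```

In this sketch, padding adds 6 samples that do not belong to the object, and the 1-D rearrangement places pixels side by side that were not neighbours in the image, which is one way to see why neither option is attractive.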

Figure 4.13 A 4×4 image segment and the direction along which the odd samples can be predicted perfectly.

The importance of direction-adaptive functionality can be explained well by using the 4×4 image segment in Figure 4.13, where the values denote gray levels.

For wavelet-based image compression, a DWT-transformed image can be coded efficiently by coders such as SPIHT or SPECK if its high-frequency subbands contain coefficients with small amplitudes and its low-frequency subband contains coefficients with large amplitudes. We know that, in the lifting 1-D 5/3 DWT, the high frequency subband

prediction step the odd samples will be zero, which makes the coefficients in the high-frequency subband zero after the subsampling step. The direction shown by the dashed line in Figure 4.13 gives perfect prediction, since the gray levels of the pixels along this line are all 41. Thus the predicted value is also 41, and the residual (Eq. 4.1) is zero for each odd sample along this line.
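
A minimal numerical sketch of this argument is given below. It assumes that the prediction step has the usual 5/3 form (an odd sample minus the average of its two even neighbours); the 4×4 segment and the function name are hypothetical and are not the data of Figure 4.13, except that one diagonal is given the constant value 41.

```python
import numpy as np

# Hypothetical 4x4 segment whose gray levels are constant along each
# anti-diagonal (r + c = const); the anti-diagonal r + c = 2 has value 41.
levels = [10, 25, 41, 30, 60, 12, 80]
segment = np.array(
    [[levels[r + c] for c in range(4)] for r in range(4)], dtype=float
)

def prediction_residual(img, r, c, shift):
    """Residual of the odd-row pixel (r, c) predicted from rows r-1 and r+1.

    shift = 0 is the conventional vertical prediction; shift = 1 predicts
    along the anti-diagonal through (r - 1, c + 1) and (r + 1, c - 1).
    """
    prediction = 0.5 * (img[r - 1, c + shift] + img[r + 1, c - shift])
    return img[r, c] - prediction

vertical = prediction_residual(segment, 1, 1, shift=0)
diagonal = prediction_residual(segment, 1, 1, shift=1)
print(vertical, diagonal)
```

With the conventional direction the residual at pixel (1, 1) is non-zero, while predicting along the constant diagonal leaves a zero residual, so the corresponding high-frequency coefficient vanishes.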

CHAPTER 5 EXPERIMENTAL RESULTS OF SDA-DWT

SDA-DWT was originally applied to object image compression of the intra frames of videos, but it can be extended to any wavelet-based field.

Object-based image compression is introduced first in Section 5.1. Then, in Section 5.2, we apply SDA-DWT to regular still image compression.

5.1 Object Image Compression

In this section, three test object images (Figs. 2.14, 5.1, and 5.2) are used to evaluate the performance of SDA-DWT, LSA-DWT, and DA-DWT. Test images 1 (Figure 2.14) and 2 (Figure 5.1) are 256-by-256 pixels, and the third test image (Figure 5.2) is 128-by-128 pixels. Although the video frame size in MPEG-4 is 360-by-288, we choose square images in order to reduce the bits used for coding the paths in SPECK coding. For comparison, all methods (i.e. LSA-DWT, DA-DWT, and SDA-DWT) use the same 5/3 wavelet. Both LSA-DWT and SDA-DWT use symmetric extension for the transform calculation, while DA-DWT uses symmetric extension only on the boundary between the object image and the background. At the partition boundaries inside the object image, DA-DWT uses the actual pixel values at the extension points. Here, for simplicity and to focus on the main problem, we ignore the bits for side information (i.e. the partition of DA-DWT, the shape masks of LSA-DWT, and the partition and shape mask of SDA-DWT). Choosing the number of decomposition levels in the wavelet transform is important and difficult. For a

efficient. However, too many levels cannot improve the overall compression efficiency, since the LL subband becomes a very small region, which may degrade the overall compression efficiency. The suitable number of wavelet decomposition levels mainly depends on the image size, the image content, and the coder/decoder used. In most cases, for a 512-by-512-pixel image, we select 3, 4, or 5 levels empirically. In this dissertation, 4 decomposition levels were used because the test images are small. In the following, PSNR (peak signal-to-noise ratio) values and the lengths of the bit streams after SPECK coding are used as the two performance measures. The PSNR calculation is based on a 256-by-256-pixel image (objects 1 and 2) or a 128-by-128-pixel image (object 3), and the bpp (bits per pixel) calculation is based on the number of pixels in the object image.
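
The two measures can be sketched as below. The function names are hypothetical, and the sketch assumes 8-bit gray levels (peak value 255) and a mean squared error taken over every pixel of the full image array; it is an illustration of the definitions, not the evaluation code used for the experiments.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB, with the MSE averaged over the full image array."""
    diff = np.asarray(original, dtype=np.float64) - np.asarray(
        reconstructed, dtype=np.float64
    )
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def bits_per_object_pixel(num_bits, num_object_pixels):
    """Rate in bpp, counting only the pixels that belong to the object."""
    return num_bits / num_object_pixels

# A budget of 256*256 bits spent on object 2 (45,012 object pixels).
rate = bits_per_object_pixel(256 * 256, 45012)
print(round(rate, 2))
```

Under these assumptions, a budget of 256×256 bits for object 2 corresponds to the 1.46-bpp condition used in Table 5.3.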

Figure 5.1 The 256 × 256 gray-level object image and its shape mask with partition: (a) the test object image, (b) the mask with partition. (Object 2 contains 45,012 pixels.)

Figure 5.2 The 128 × 128 gray-level object image and its shape mask with partition: (a)

For the first object image (Figure 2.14), the object of interest is a suitcase covered with many line textures, and the object occupies 30,535 pixels in a 256-by-256-pixel image. SDA-DWT and LSA-DWT are evaluated by compressing the object-1 image. Since the orientations of the lines in object 1 are almost the same, we do not partition the object into small segments, i.e. the whole visual object is one large segment. After a 4-level SDA-DWT is performed on the visual object, the transformed object image is coded with the SPECK algorithm, and the resulting bit stream serves as the compressed file of the object image. The same procedure is performed on test image 1 with SDA-DWT replaced by LSA-DWT, which gives another compressed file of the object image. Table 5.1 shows the size (in bits) of each object image for each method, and it shows that SDA-DWT is more efficient than LSA-DWT: the SDA-DWT compressed file is about 77.8% of the size of the LSA-DWT compressed file. Table 5.2 shows that SDA-DWT outperforms LSA-DWT by up to 5.88 dB at 2.15 bpp (256×256 bits). In this case, SDA-DWT always performs better than LSA-DWT because of the directional line textures on the object. Given the characteristics of the textures on object 1, if we choose the +45° direction in the prediction step of the 1-D ‘horizontal’ transform, the predicted values will be very close to the actual values of the odd pixels. Thus, much of the energy is concentrated in the low-frequency subband, which makes the wavelet transform very successful and the overall compression scheme very efficient.
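
The two figures quoted above follow directly from the object-1 entries of Tables 5.1 and 5.2, as the short check below shows. The variable names are chosen here for illustration only.

```python
# Values copied from Table 5.1 (object 1, bits) and Table 5.2 (PSNR in dB).
sda_bits, lsa_bits = 123_341, 158_530
rates = [1.00, 2.15, 3.22]
sda_psnr = [36.82, 42.75, 48.30]
lsa_psnr = [31.06, 36.87, 42.54]

# Size of the SDA-DWT file relative to the LSA-DWT file, in percent.
size_ratio_percent = round(100 * sda_bits / lsa_bits, 1)

# PSNR gain of SDA-DWT over LSA-DWT at each rate, and the largest gain.
gains = [round(s - l, 2) for s, l in zip(sda_psnr, lsa_psnr)]
best_gain = max(gains)
best_rate = rates[gains.index(best_gain)]

print(size_ratio_percent, gains, best_rate)
```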

Table 5.1 The bit numbers of the bit stream of each test object image after SPECK coding. (SDA¹ and SDA² represent SDA-DWT without object partition and with object partition, respectively. LSA means LSA-DWT and DA is DA-DWT.)

| Method | Object 1 | Object 2 | Object 3 |
| --- | --- | --- | --- |
| SDA¹ | 123,341 bits | 243,002 bits | NA |
| SDA² | NA | 244,729 bits | 67,726 bits |
| LSA | 158,530 bits | 245,330 bits | 71,556 bits |
| DA | NA | NA | 77,209 bits |

Table 5.2 The PSNR results for lossy compression of object image 1. (Object 1 contains 30,535 pixels.)

| Rate (bpp) | 1.00 | 2.15 | 3.22 |
| --- | --- | --- | --- |
| SDA | 36.82 dB | 42.75 dB | 48.30 dB |
| LSA | 31.06 dB | 36.87 dB | 42.54 dB |

Table 5.3 The PSNR results for lossy compression of object image 2. (Object 2 contains 45,012 pixels. SDA¹ and SDA² represent SDA-DWT without object partition and with object partition, respectively.)

| Rate (bpp) | 1.00 | 1.46 | 2.91 |
| --- | --- | --- | --- |
| SDA¹ | 28.05 dB | 28.88 dB | 39.45 dB |
| SDA² | 28.11 dB | 29.02 dB | 39.45 dB |
| LSA | 27.52 dB | 28.73 dB | 39.30 dB |

Table 5.4 The PSNR results for lossy compression of object image 3. (Object 3 contains 10,000 pixels. SDA² represents SDA-DWT with object partition.)

| Rate (bpp) | 1.00 | 1.64 | 3.28 | 4.92 |
| --- | --- | --- | --- | --- |
| SDA² | 22.41 dB | 22.91 dB | 33.29 dB | 43.02 dB |
| LSA | 18.10 dB | 22.58 dB | 31.79 dB | 40.55 dB |
| DA | 17.53 dB | 22.29 dB | 27.85 dB | 38.01 dB |

For the test image of object 2 (Figure 5.1 (a)), SDA-DWT and LSA-DWT are simulated and compared by their PSNR values and file sizes. The gray-level object 2 is segmented from the well-known test image Barbara, and Figure 5.1 (b) shows the shape mask of the visual object. Two cases are simulated for evaluating SDA-DWT. In the first, the whole object 2 is used without partition; in the second, object 2 is partitioned into two parts (Figure 5.1 (b); the white region and the gray part). The partition shown in Figure 5.1 (b), which was made manually, is an example of an arbitrarily shaped partition and is not the optimal one. Table 5.1 shows that, for compressed-file size, SDA-DWT with object-image partition is the most efficient of these cases, SDA-DWT without object partition is second, and LSA-DWT is third.

SDA-DWT with object partition reduces the bit budget of LSA-DWT by 0.95%, and SDA-DWT without object partition reduces it by 0.24%. On the other hand, the

The results show that for a texture-rich image (especially one with non-horizontal or non-vertical edges), the performance of lossy compression can be enhanced by suitably partitioning the object image. The proposed method offers much flexibility for partitioning, since it can handle segments of any shape. The reconstructed object images of SDA-DWT with object partition and of LSA-DWT, at 1.46 bpp, are shown in Figure 5.3. We also performed the experiments on object images segmented from Lena, Claire, and Akiyo. Since these object images lack non-horizontal or non-vertical edges, or the directions of their textures are random, the performance of SDA-DWT without a suitable object partition is almost the same as that of LSA-DWT.

Figure 5.3 The object-2 reconstruction images under 1.46-bpp condition: (a) the result of SDA-DWT with object partition according to Figure 5.1 (b), (b) the result of LSA-DWT.

Figure 5.4 The reconstruction object images, under 1-bpp condition: (a) the result of SDA-DWT, (b) the result of LSA-DWT.

Figure 5.5 The reconstruction object image and the partition and direction in DA-DWT: (a) the reconstruction result under 10,000-bit condition, (b) mask partition and block directions of partition used in DA-DWT.

For the third gray-level object image (Figure 5.2 (a)), all three methods (LSA-DWT, DA-DWT, and SDA-DWT) are evaluated. The object-3 image (synthesized from images in the USC image database) contains 10,000 pixels in a 128-by-128-pixel area, and there are five different textures on the object. Hence, the object image is partitioned into 5 segments for SDA-DWT (Figure 5.2 (b)); because the object is synthesized, this partition is known and is taken as given in this experiment. Although the object-3 image is rectangular, SDA-DWT can handle objects of any shape. DA-DWT was originally designed for processing a rectangular image, but the object-3 image can be viewed as a square 128-by-128-pixel image containing object 3. DA-DWT partitions the object image into many small blocks (Figure 5.5 (b)) to discover texture directions that cannot be seen at a large scale. Table 5.1 shows that, for lossless compression, SDA-DWT uses the fewest bits and DA-DWT uses the most. For the PSNR comparison, Table 5.4 shows that SDA-DWT outperforms LSA-DWT by up to 4.31 dB at 1.00 bpp (bits per object pixel), and reduces the bit budget by up to 5.7% for lossless compression. SDA-DWT also outperforms DA-DWT by up to 5.44 dB in PSNR

experiments of object 3, we understand that LSA-DWT cannot exploit the correlation of the directional textures well, so it performs poorly on this test object image.

For DA-DWT, since its resolution is not high enough (the smallest partition block is 16-by-16) and it cannot approximate the non-rectangular segment boundaries well, it has the poorest performance on this particular object image.

5.2 Regular Image Compression

SDA-DWT was originally designed for object image compression, but it can be used for a regular (rectangular or square) image by extending the mask (or alpha maps) to cover the whole image. For example, Figure 2.14 (b) is a partition mask for the visual object, and we can use Figure 5.6 as a partition mask of the whole image for SDA-DWT.

In Figure 5.6, the regular image is partitioned manually into three large connected segments, which are distinguished by three different gray levels. For the test image (Figure 2.14 (a)), three compression methods are simulated and compared: SDA-DWT, DA-DWT, and the conventional-direction lifting DWT. All three methods use the lifting 5/3 DWT discussed in Chapter 4, and the third method uses the conventional filter directions, i.e. the horizontal and vertical directions. First, the lossless compressed file sizes of the three methods are compared. Then, we compare the PSNR values (dB) at several bpp conditions. Finally, we also examine the reconstruction results of the three methods, since PSNR values sometimes do not reflect the real visual quality.

Test image Figure 2.14 is a 256×256 gray-level image that uses 8 bits to represent the gray level of each pixel. Since the test image is small, we choose to use 4 decomposition levels. After being transformed by any one of the three methods, the transformed image is coded with the SPECK coder. The resulting bitstream of the SPECK coder is our lossless compression file for each method. The experimental results showed that the

conventional-direction lifting DWT, respectively. Ignoring the side information, SDA-DWT has the smallest lossless compression file.

Figure 5.6 An example of partitioning the image in Figure 2.14 (a) for regular image compression. The suitcase image is partitioned into background, the handle, and the box manually.

Table 5.5 PSNR values of the three methods under 0.1-bpp, 0.25-bpp, 0.5-bpp, and 1.0-bpp conditions, where CD-DWT means the conventional-direction lifting DWT.

| bpp | SDA-DWT | DA-DWT | CD-DWT |
| --- | --- | --- | --- |
| 0.1 | 22.52 dB | 21.92 dB | 22.43 dB |
| 0.25 | 22.94 dB | 22.52 dB | 22.96 dB |
| 0.5 | 28.54 dB | 27.51 dB | 27.82 dB |
| 1.0 | 34.10 dB | 32.85 dB | 32.96 dB |

From the data in Table 5.5, we know that SDA-DWT has the best (highest) PSNR values in all cases except 0.25 bpp, where CD-DWT is 0.02 dB higher, and it is interesting that DA-DWT is in third place. The results show that the partition in directional DWTs is critical, and that a locally optimal partition is not necessarily globally optimal. Although DA-DWT is inferior to the conventional-direction lifting DWT in PSNR in this experiment, for 0.25-bpp and 0.5-bpp, the reconstruction images of

Figure 5.7 Reconstruction images of three methods under 1-bpp and 0.5-bpp conditions respectively, where SDA means SDA-DWT, DA is DA-DWT, and CD-DWT denotes lifting conventional-filter-direction DWT.

Figure 5.9 shows another example, whose original image is the Pentagon. For the Pentagon image (Figure 5.9 (a)), two partitions are shown in Figure 5.9 (b) and (c), respectively. The partition shown in Figure 5.9 (b) is based on the partition method of [4], which partitions an image into small rectangular blocks, and we join the blocks with the same local filter direction to form 9 types of segments. Note that segments belonging to the same filter direction are not necessarily connected. The gray level in Figure 5.9 (b) and (c) represents the filter direction of each of the nine types of segments. The smallest pixel value corresponds to θ= -71.5°, and the

segmentation-direction relation. Since SDA-DWT does not require the boundary of each segment to be vertical or horizontal, we can partition the image according to any requirement. For example, a human body can be partitioned based on body shape or in any other way that is meaningful to us.

Figure 5.8 Reconstruction images of three methods under 0.25-bpp and 0.1-bpp conditions respectively, where SDA means SDA-DWT, DA is DA-DWT, and CD-DWT denotes lifting conventional-filter-direction DWT.

SDA-DWT outperforms DA-DWT because the former allows a flexible partition in both shape and size. A simple method that can guarantee that SDA-DWT performs better than DA-DWT is to use the partition of DA-DWT as an initial

partition the image into many small triangular segments which should be able to better exploit the directional correlation in the image. For many natural or artificial images, the best partition boundaries are usually not vertical or horizontal lines, so SDA-DWT is suitable for such cases.

Figure 5.9 A 256×256 image and its partitions: (a) the original image Pentagon, (b) partitioning the image into 9 types of blocks, (c) another non-block partition.

Finally, SDA-DWT (5/3 wavelet) is compared with the conventional-direction lifting DWT (5/3 wavelet) by transforming Figure 5.7 (a) with 3 decomposition levels. The results are shown in Figure 5.10. Less energy is left in the high-frequency subbands for SDA-DWT, because the high-frequency subbands in Figure 5.10 (a) are darker than those in Figure 5.10 (b). Hence, generally speaking, SDA-DWT can exploit the correlation in the image well and achieve higher coding efficiency in image compression. The maximum amplitude of the transformed coefficients in Figure 5.10 (a) is 423.5, and the maximum coefficient value in Figure 5.10 (b) is 222.8. Large coefficient amplitudes in the low-frequency subband usually indicate good energy compaction and a better transform for image compression.

Figure 5.10 Transformed images of Figure 5.7 (a): (a) by SDA-DWT, (b) by conventional-direction lifting DWT. (both use 5/3 wavelet and 3 decomposition levels)

Note that the high-frequency subbands of (a) are smoother than those of (b), which means that the former transform is usually more efficient than the latter.

CHAPTER 6 CONCLUSIONS

In this dissertation we discuss the lifting-based DWT and propose SDA-DWT, which can be used for arbitrarily shaped image segments and whose prediction and update directions are adaptive. The conclusions and comments (or future work) on these two topics are presented in this chapter.

For lifting-based DWTs, we discussed how every wavelet filter pair can be decomposed into lifting steps. The decomposition is equivalent to expressing the polyphase matrix and the dual polyphase matrix as products of elementary matrices (i.e. lower triangular and upper triangular matrices), which mathematicians have long known to be possible. Compared to the conventional implementation, the lifting structure can lead to a speed-up. The lifting structure also allows an in-place realization of the fast wavelet transform, so the wavelet transform can be computed without allocating auxiliary memory. Within a lifting step, all operations can be done entirely in parallel, and the only sequential part is the order of the lifting steps. For hardware implementation and lossless image compression, the lifting structure is important because it makes it easier to build non-linear wavelet transforms and wavelet transforms that map integers to integers. Moreover, by using both lifting and integer-to-integer transforms, it is possible to combine biorthogonal wavelets with scalar quantization while keeping cubic quantization cells, which are optimal as in the orthogonal case. Finally, the special feature of lifting that helped us develop SDA-DWT is that lifting allows adaptive wavelet transforms. Therefore, one can start the analysis of a function from the coarsest levels and build the finer levels by refining the region of interest.
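
The in-place and integer-to-integer properties can be illustrated with a minimal sketch of one level of a reversible 1-D 5/3 lifting transform. The rounding convention (floor via arithmetic shift, with an offset of 2 in the update step) and the whole-sample symmetric extension are assumptions taken from the common reversible 5/3 formulation, and the function names are hypothetical; this is not the SDA-DWT implementation of this dissertation.

```python
import numpy as np

def forward_53(x):
    """One level of a reversible 5/3 lifting transform, computed in place.

    x is a 1-D signed-integer array of even length. Afterwards, even indices
    hold low-pass coefficients and odd indices hold high-pass coefficients.
    """
    n = len(x)
    # Predict: each odd sample becomes its residual; the even samples read
    # here are not modified in this loop, so iterations are independent.
    for i in range(1, n, 2):
        left = x[i - 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]  # symmetric extension
        x[i] -= (left + right) >> 1
    # Update: each even sample is adjusted by the neighbouring residuals.
    for i in range(0, n, 2):
        left = x[i - 1] if i >= 1 else x[i + 1]  # symmetric extension
        right = x[i + 1]
        x[i] += (left + right + 2) >> 2
    return x

def inverse_53(x):
    """Undo forward_53 by running the lifting steps in reverse order."""
    n = len(x)
    for i in range(0, n, 2):
        left = x[i - 1] if i >= 1 else x[i + 1]
        right = x[i + 1]
        x[i] -= (left + right + 2) >> 2
    for i in range(1, n, 2):
        left = x[i - 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] += (left + right) >> 1
    return x

signal = np.array([41, 43, 41, 39, 60, 62, 10, 12], dtype=np.int64)
coeffs = forward_53(signal.copy())
restored = inverse_53(coeffs.copy())
flat = forward_53(np.full(8, 41, dtype=np.int64))
```

Because the inverse subtracts exactly what the forward transform added, reconstruction is exact in integer arithmetic regardless of the rounding, and no auxiliary buffer is needed. A constant signal yields all-zero high-pass coefficients.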

Since the results of factoring lifting steps are not unique, we should know what the

polyphase and dual polyphase matrices, cannot work for integer or dyadic-number filter coefficients. Hence, we cannot guarantee that filters with binary coefficients can be factored into lifting steps with binary filter coefficients. Any invertible polyphase matrix which has a non-identity polynomial on the diagonal can be obtained using lifting, but some of the advantages of the lifting structure discussed above rely on the identity-diagonal requirement.

Because the proposed SDA-DWT can exploit the correlation due to spatial orientation well and can handle regions of any shape and size, it performs better than SA-DWT or DA-DWT for visual objects with non-horizontal or non-vertical edge textures. SDA-DWT can be applied to any wavelet-based application, although in this dissertation we give only three application examples. The extra costs of SDA-DWT compared to SA-DWT are the increased complexity and the storing and processing of the side information for the directions in each segment of the object image.

For convenience, in this work we focus on how to compress the partitioned still-object image, assuming that the partition of the object image has already been done. In order to achieve the optimal result, a good texture-segmentation method is necessary. The optimal partition depends on the image to be compressed, and it usually differs from the partition used in DA-DWT (i.e. rectangular blocks). Partitioning an image into many small rectangular blocks is usually not the optimal partition, and the reconstruction result is possible to
