Computational Complexity - Enhanced Wavelet-based Contourlet Image Coding

Chapter 4 Enhanced Wavelet-based Contourlet Image Coding

4.2.5 Computational Complexity

We now examine the computational complexity of our decision algorithm by counting the multiplications and additions required by the steps in Fig. 4-5. We assume that the input image size is S=W×H, where W is the width of the input image and H is its height. We also assume that W and H are both powers of 2, so that the 2-D DFT can be implemented with the radix-2 fast Fourier transform (FFT) structure.

1) In Step A of Fig. 4-5, we apply the 2-D DFT to the input image, obtain its energy spectrum, and then apply a smoothing filter to the spectrum. The 2-D DFT is implemented by the radix-2 FFT, and thus the required numbers of real additions and real multiplications are given by (4-22) and (4-23), in which ceil(x) denotes the smallest integer greater than or equal to x. Next, Eq. (4-12) needs 1 real addition and 2 real multiplications to calculate the energy of a DFC. For the entire image, the required numbers of real additions and real multiplications are given in (4-24) and (4-25). The smoothing operator in Fig. 4-7(c) requires 8 real additions and 1 real multiplication for each c(x, y). Thus, the total numbers of real additions and multiplications are given by (4-26) and (4-27). Finally, the overall numbers of real additions and real multiplications in Step A are given in (4-28) and (4-29).

Nreal_addition_in_DFT = 3×W×H×(ceil(log2W)+ceil(log2H)) (4-22)

Nreal_multiplication_in_DFT = 2×W×H×(ceil(log2W)+ceil(log2H)) (4-23)

Nreal_addition_in_calculating_power = W×H (4-24)

Nreal_multiplication_in_calculating_power = 2×W×H (4-25)

Nreal_addition_in_smoothing_spectrum = 8×W×H (4-26)

Nreal_multiplication_in_smoothing_spectrum = W×H (4-27)

Nreal_addition_in_stepA = 3×W×H×(ceil(log2W)+ceil(log2H))+W×H+8×W×H (4-28)

Nreal_multiplication_in_stepA = 2×W×H×(ceil(log2W)+ceil(log2H))+2×W×H+W×H (4-29)
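As a cross-check of (4-22), (4-23), (4-28), and (4-29), the following sketch counts the Step A operations from the butterfly structure of the FFT. It relies on textbook radix-2 figures that the text does not state explicitly and that are therefore assumptions here: an N-point FFT uses (N/2)×log2N complex multiplications and N×log2N complex additions, one complex multiplication costs 4 real multiplications and 2 real additions, and one complex addition costs 2 real additions. The function name step_a_counts is illustrative only.

```python
from math import ceil, log2


def step_a_counts(width, height):
    """Return (real_additions, real_multiplications) for Step A.

    Assumes a row-column radix-2 FFT: `height` FFTs of length `width`
    followed by `width` FFTs of length `height`, with both sizes
    being powers of 2.
    """
    pixels = width * height
    stages = ceil(log2(width)) + ceil(log2(height))

    # Complex operation counts of the 2-D FFT (assumed textbook figures).
    complex_mul = pixels * stages // 2
    complex_add = pixels * stages

    # Assumed costs: 1 complex mul = 4 real mul + 2 real add,
    #                1 complex add = 2 real add.
    dft_add = 2 * complex_mul + 2 * complex_add   # equals 3*W*H*stages
    dft_mul = 4 * complex_mul                     # equals 2*W*H*stages

    # Energy of each DFC, Eq. (4-12): 1 addition and 2 multiplications.
    power_add, power_mul = pixels, 2 * pixels

    # Smoothing: 8 additions and 1 multiplication per spectrum sample.
    smooth_add, smooth_mul = 8 * pixels, pixels

    return (dft_add + power_add + smooth_add,
            dft_mul + power_mul + smooth_mul)


print(step_a_counts(512, 512))  # (16515072, 10223616)
```

Under these assumptions the per-pixel coefficients 3 and 2 in front of the logarithm terms follow directly from the butterfly counts.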

2) Step B chooses the representative energy levels based on the low frequency zone. Eqs. (4-15) and (4-16) calculate the mean of the DFC energy in the low frequency zone. The heights of the low frequency zones in LH¹ and HL¹ are (ceil(H/4)+1) and (ceil(W/4)+1), respectively, and the width of each zone is Wlfz. Thus, the mean calculation (Step B) needs 2 divisions and Nreal_addition_in_stepB real additions, as shown in (4-30). We choose Wlfz=3 when S=512×512.

Nreal_addition_in_stepB = Wlfz×(ceil(H/4)+ceil(W/4)+2)−2 (4-30)

3) Step C decides the thresholds for the directional subbands. The number of DFCs in each directional subband is W×H/16; thus the number of DFCs in each half directional subband is W×H/32. In addition to 2 real divisions, we need W×H/32 real multiplications and (W×H/16−2) real additions to calculate the mean and the variance of each half directional subband. LH¹ and HL¹ together have 8 directional subbands in total. The numbers of real additions and real multiplications in Step C are, therefore, given by (4-31) and (4-32).

Nreal_addition_in_stepC = 8×(W×H/16−2) = W×H/2−16 (4-31)

Nreal_multiplication_in_stepC = 8×W×H/32 = W×H/4 (4-32)

4) Step D identifies the energy peaks. Eq. (4-21) needs 1 division, 242 multiplications, and 480 additions in each iteration. In total, the numbers of real additions and real multiplications in Step D are given in (4-33) and (4-34), wherein Nit is the number of iterations. In our experiments, the minimum Nit is 11 (test image Baboon), the maximum Nit is 12487 (test image Barbara), and the average Nit is 1697.

Nreal_addition_in_stepD = Nit×480 (4-33)

Nreal_multiplication_in_stepD = Nit×242 (4-34)

All in all, (4-35) and (4-36) give the total numbers of real additions and real multiplications in the decision procedure. When S=W×H=512×512, Wlfz=3, and Nit=1697, the total numbers of real additions and real multiplications are 17,461,460 and 10,699,826, respectively.

Ntotal_real_addition =

W×H×(3×ceil(log2W)+3×ceil(log2H)+9+1/2)−16+Wlfz×(ceil(H/4)+ceil(W/4)+2)−2+Nit×480 (4-35)

Ntotal_real_multiplication = W×H×(2×ceil(log2W)+2×ceil(log2H)+3+1/4)+Nit×242 (4-36)
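The closed forms (4-35) and (4-36) are easy to evaluate directly. The sketch below is only a transcription of those two formulas into Python (the function and argument names are illustrative, not taken from the original implementation); it reproduces the totals quoted for S=512×512, Wlfz=3, and Nit=1697.

```python
from math import ceil, log2


def decision_totals(width, height, w_lfz, n_it):
    """Evaluate (4-35) and (4-36): total real additions and multiplications.

    Assumes width and height are powers of 2 (at least 2 each), so that
    the integer divisions below are exact.
    """
    pixels = width * height
    stages = ceil(log2(width)) + ceil(log2(height))

    additions = (
        pixels * (3 * stages + 9)                                # Step A
        + w_lfz * (ceil(height / 4) + ceil(width / 4) + 2) - 2   # Step B
        + pixels // 2 - 16                                       # Step C
        + n_it * 480                                             # Step D
    )
    multiplications = (
        pixels * (2 * stages + 3)   # Step A
        + pixels // 4               # Step C
        + n_it * 242                # Step D
    )
    return additions, multiplications


print(decision_totals(512, 512, w_lfz=3, n_it=1697))
# (17461460, 10699826)
```

Because Step D is the only term that depends on the image content, calling the function with the minimum and maximum Nit reported above gives the range of the decision cost for a fixed image size.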

Table 4-3. Computational complexity and run time for the systems with and without decision when LLF is adopted.

| | LLF without decision | Decision | LLF with decision | Decision / LLF without | LLF with / LLF without |
|---|---|---|---|---|---|
| Number of Multiplications | 114,819,072 | 10,699,826 | 125,518,898 | 9.32% | 109.32% |
| Number of Additions | 114,819,072 | 17,461,460 | 132,280,521 | 15.21% | 115.21% |
| Run Time | 11.613 sec | 1.385 sec | 13.012 sec | 11.93% | 112.05% |

Table 4-4. Computational complexity and run time for the systems with and without decision when SLF is adopted.

| | SLF without decision | Decision | SLF with decision | Decision / SLF without | SLF with / SLF without |
|---|---|---|---|---|---|
| Number of Multiplications | 13,369,344 | 10,699,826 | 24,069,170 | 80.03% | 180.03% |
| Number of Additions | 13,369,344 | 17,461,460 | 30,830,804 | 130.61% | 230.61% |
| Run Time | 2.662 sec | 1.385 sec | 4.055 sec | 52.03% | 152.33% |

Table 4-3 and Table 4-4 show the computational complexity and the run time of the entire system with and without decision, wherein the directional filters are LLF and SLF, respectively. With decision, the fastest case occurs when no directional transform is conducted on LH¹, HL¹, and HH¹, and the slowest case occurs when we apply the directional transform to all subbands. In Table 4-3, the image coding scheme with LLF and decision may save over 84% of the computational load or 88% of the run time in the fastest case. In the slowest case, the decision process requires an additional 16% computational load or 13% run time. In Table 4-4, the image coding scheme with SLF and decision saves about 48% of the run time in the fastest case and consumes 52% extra run time in the slowest case. On average, the image coding schemes with decision require less run time.
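The percentage entries of Table 4-3 and Table 4-4 that relate the decision cost to the cost without decision can be recomputed from the absolute counts. The short sketch below does this for the listed values; the helper name overhead_percent is illustrative.

```python
def overhead_percent(decision_cost, baseline_cost):
    """Decision cost as a percentage of the cost without decision."""
    return round(100.0 * decision_cost / baseline_cost, 2)


# Values copied from Table 4-3 (LLF) and Table 4-4 (SLF).
print(overhead_percent(10_699_826, 114_819_072))  # LLF multiplications: 9.32
print(overhead_percent(17_461_460, 114_819_072))  # LLF additions: 15.21
print(overhead_percent(1.385, 11.613))            # LLF run time: 11.93
print(overhead_percent(10_699_826, 13_369_344))   # SLF multiplications: 80.03
print(overhead_percent(17_461_460, 13_369_344))   # SLF additions: 130.61
print(overhead_percent(1.385, 2.662))             # SLF run time: 52.03
```

The decision cost is the same in both tables, so its relative weight is much larger for SLF, whose directional transform is far cheaper than that of LLF.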

4-3 New ZC Context Tables for 3-D ESCOT

Arithmetic coding methods encode the transformed and quantized coefficients into a bit-stream. 3-D ESCOT is a bit-plane coding method that uses the neighbors of a coefficient for its context model. Let the sequence x^N = {xN, xN-1, …, x2, x1} represent one bit-plane of a coefficient block. Because the bit-plane consists of binary symbols, i.e., xi ∈ {0,1}, the minimum code length of a binary sequence, estimated based on information theory, is shown in (4-37), wherein P(xi|x^i-1) is the conditional probability of xi given x^i-1 = {xi-1, xi-2, …, x2, x1}. Clearly, x^i-1 is a subset of x^N. Assuming x^N is a Markov random sequence of some finite order, we can then reduce x^i-1 to a smaller subsequence of x^i-1. This reduced subsequence is the context model support [36][38]. Typically, it includes the neighbors and the (bit-plane) parents of xi. Ideally, the optimal context model gives the maximum mutual information [68].

L = −Σ_{i=1}^{N} log2 P(xi | x^i-1) (4-37)
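To make (4-37) concrete, the sketch below accumulates −log2 of the conditional probability of each bit under a reduced context. Two simplifications are assumptions of this example and not part of 3-D ESCOT: the context is simply the previous `order` bits of a 1-D sequence (instead of spatial neighbors and bit-plane parents), and the probabilities are estimated adaptively with add-one (Laplace) counts.

```python
from math import log2


def ideal_code_length(bits, order):
    """Sum of -log2 P(x_i | context) with adaptively estimated probabilities.

    The context is the previous `order` bits (an assumption made for this
    sketch); each context keeps add-one (Laplace) counts of 0s and 1s.
    """
    counts = {}   # context tuple -> [count of 0s, count of 1s]
    total = 0.0
    for i, bit in enumerate(bits):
        context = tuple(bits[max(0, i - order):i])
        zeros_ones = counts.setdefault(context, [1, 1])
        probability = zeros_ones[bit] / sum(zeros_ones)
        total -= log2(probability)
        zeros_ones[bit] += 1   # update the model after coding the bit
    return total


# An alternating sequence is fully predictable from one previous bit.
row = [0, 1] * 32
print(ideal_code_length(row, order=0))  # no context
print(ideal_code_length(row, order=1))  # one-bit context, much shorter
```

The second call yields a far shorter ideal code length than the first, which illustrates why a context that matches the structure of the data improves coding efficiency.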

The original 3-D ESCOT considers only the 2-D DWT coefficients in the horizontal and the vertical directions. Yet, the coefficients in a certain directional subband may cluster along one specific direction (different from the vertical or horizontal directions). The original context table fails to handle this case well.

Therefore, we redesign the context models of 3-D ESCOT.

Fig. 4-11. (a) The directional subbands produced by WBCT. (b) The spatial neighbor directions for coefficient A.

In Fig. 4-11(a), the 13 subbands produced by WBCT are labeled as “LL”, “HH 4-0”, “LH 4-0”, “HL 4-0”, and so on. In Fig. 4-11(b), the edges passing through A can be H-A-H (0°), V-A-V (90°), D1-A-D1 (45°), and D2-A-D2 (−45°). We denote the 0°, 90°, 45°, and −45° directions as “H”, “V”, “D1”, and “D2”, respectively.

In Fig. 4-12, we examine the effect of the directional filter LH 4-0 (DF_LH 4-0). A concentric-circle pattern, which has edges in all directions, is used as the input pattern. Fig. 4-12(a) and (b) show this input signal and its frequency spectrum. Fig. 4-12(c) shows the spatial impulse response of DF_LH 4-0, which is roughly along the H direction (slightly tilted toward the D2 direction). Fig. 4-12(d) shows the frequency magnitude response of DF_LH 4-0, whose energy clusters mainly along the vertical axis. In Fig. 4-12(e), the filtered output image contains mainly the spatial edges aligned with the H direction (slightly tilted toward the D2 direction). Fig. 4-12(f) shows the frequency spectrum of the filtered signal. Evidently, the dominant directions of the LH 4-0 outputs are H and D2. Hence, “H and D2” are the filtered directions of LH 4-0.

Fig. 4-12. (a) Input signal in spatial domain. (b) Input signal in frequency domain. (c) Filter response of DF_LH 4-0 in spatial domain. (d) Filter response of DF_LH 4-0 in frequency domain. (e) Output signal in spatial domain. (f) Output signal in frequency domain.

Similarly, we identify the filtered directions of the other directional subbands. The filtered directions of LH 4-1 are “H and D1”, those of HL 4-2 are “V and D2”, and those of HL 4-3 are “V and D1”. The filtered directions of the four corner subbands (LH 4-2, HH 4-3, HH 4-1, and HL 4-0) are D2, and those of the other four corner subbands (LH 4-3, HH 4-2, HH 4-0, and HL 4-1) are D1.

3-D ESCOT uses three types of context models or context tables – the zero coding tables (ZC), the sign coding tables (SC) and the magnitude refinement tables (MR). 3-D ESCOT codes bit-planes from the most significant bit-plane to the least significant bit-plane. 3-D ESCOT starts with ZC to code the beginning zeros until it hits the first non-zero bit. 3-D ESCOT uses ZC to code the magnitude of the first non-zero bit and SC to code its sign. For the remaining bits, 3-D ESCOT uses MR to code their magnitudes. To match the characteristics of the WBCT coefficients, we alter the ZC context table in 3-D ESCOT. For the coefficients in the ordinary 2-D wavelet subbands, we adopt the ZC context table (Table 4-5) in EBCOT [35]. But for the coefficients in the directional subbands, the proposed Table 4-6 is the ZC context table.
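The order in which the three tables are used for a single coefficient can be sketched as follows. This is a simplified illustration only: it ignores the context selection itself and any pass structure of the actual coder, and the sign convention (1 for negative) as well as the function name coding_primitives are assumptions of this example.

```python
def coding_primitives(coefficient, num_bitplanes):
    """List (bit-plane index, table, bit) from MSB to LSB for one coefficient."""
    magnitude = abs(coefficient)
    steps = []
    significant = False
    for plane in range(num_bitplanes - 1, -1, -1):
        bit = (magnitude >> plane) & 1
        if not significant:
            # ZC codes the leading zeros and the first non-zero bit.
            steps.append((plane, "ZC", bit))
            if bit:
                # SC codes the sign right after the first non-zero bit.
                sign = 1 if coefficient < 0 else 0
                steps.append((plane, "SC", sign))
                significant = True
        else:
            # MR codes every remaining magnitude bit.
            steps.append((plane, "MR", bit))
    return steps


print(coding_primitives(-5, 4))
# [(3, 'ZC', 0), (2, 'ZC', 1), (2, 'SC', 1), (1, 'MR', 0), (0, 'MR', 1)]
```

For the value −5 with 4 bit-planes (magnitude 0101), one leading zero and the first non-zero bit go through ZC, the sign goes through SC, and the two remaining bits go through MR.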

In Table 4-5 and Table 4-6, each “context” denotes a model, the numbers of non-zero coefficients are listed under the directions H, V, and D1+D2, and X denotes “Don't care”. Fig. 4-11(b) shows the neighbors and the notations we use in the entropy coding. The neighbors include the vertical neighbors (V), the horizontal neighbors (H), the left-lower and right-upper neighbors (D1), and the left-upper and right-lower neighbors (D2). To code coefficient A in a wavelet subband of a bit-plane, we first calculate the number of non-zero coefficients in all directions. For the 2-D wavelet subbands, based on the subband location and the non-zero coefficient patterns, we decide which context in Table 4-5 is to be used to code this bit of coefficient A. Similarly, we code the coefficients in the directional subbands using Table 4-6.
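The table lookup described above can be sketched in Python. The sketch assumes that each of the counts for H, V, D1, and D2 lies between 0 and 2 (two neighbors per direction, as in Fig. 4-11(b)); under that assumption all columns of Table 4-6 and the LL, LH, and HL columns of Table 4-5 share one row pattern, and only HH in Table 4-5 needs its own rule. The names zc_context, DIRECTIONAL_GROUPS, and the helper functions are illustrative, not part of 3-D ESCOT.

```python
# (primary, secondary, tertiary) neighbor sets per directional subband,
# read from the column headers of Table 4-6.
DIRECTIONAL_GROUPS = {
    "LH 4-0": (("D2", "H"), ("V",), ("D1",)),
    "LH 4-1": (("D1", "H"), ("V",), ("D2",)),
    "HL 4-2": (("D2", "V"), ("H",), ("D1",)),
    "HL 4-3": (("D1", "V"), ("H",), ("D2",)),
}
for name in ("LH 4-3", "HL 4-1", "HH 4-0", "HH 4-2"):
    DIRECTIONAL_GROUPS[name] = (("D1",), ("H", "V"), ("D2",))
for name in ("LH 4-2", "HL 4-0", "HH 4-1", "HH 4-3"):
    DIRECTIONAL_GROUPS[name] = (("D2",), ("H", "V"), ("D1",))


def _rank(primary, secondary, tertiary):
    """Row pattern shared by Table 4-6 and by LL, LH, HL in Table 4-5."""
    if primary >= 2:
        return 8
    if primary == 1:
        if secondary >= 1:
            return 7
        return 6 if tertiary >= 1 else 5
    if secondary >= 2:
        return 4
    if secondary == 1:
        return 3
    return min(tertiary, 2)


def _rank_hh(hv, d):
    """HH column of Table 4-5: hv is the H+V count, d is the D1+D2 count."""
    if d >= 3:
        return 8
    if d == 2:
        return 7 if hv >= 1 else 6
    if d == 1:
        return 5 if hv >= 2 else (4 if hv == 1 else 3)
    return min(hv, 2)


def zc_context(subband, counts):
    """Return the ZC context (0-8) for one coefficient.

    `counts` maps "H", "V", "D1", "D2" to the number (0-2) of non-zero
    neighbors in that direction.
    """
    def total(names):
        return sum(counts[n] for n in names)

    if subband in DIRECTIONAL_GROUPS:
        return _rank(*(total(group) for group in DIRECTIONAL_GROUPS[subband]))
    if subband in ("LL", "LH"):
        return _rank(counts["H"], counts["V"], total(("D1", "D2")))
    if subband == "HL":
        return _rank(counts["V"], counts["H"], total(("D1", "D2")))
    if subband == "HH":
        return _rank_hh(total(("H", "V")), total(("D1", "D2")))
    raise ValueError(f"unknown subband: {subband}")


print(zc_context("LH 4-0", {"H": 1, "V": 0, "D1": 0, "D2": 1}))  # 8
```

In the usage line, one non-zero H neighbor and one non-zero D2 neighbor give D2+H=2 for LH 4-0, which selects context 8. The same neighborhood in the ordinary LH subband would give H=1, V=0, D1+D2=1 and hence context 6, which shows how the directional table favors the filtered directions.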

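Table 4-5. ZC context table for 2-D wavelet subbands

| Wavelet subband | LL, LH | | | HL | | | HH | |
|---|---|---|---|---|---|---|---|---|
| Context | H | V | D1+D2 | V | H | D1+D2 | H+V | D1+D2 |
| 8 | 2 | X | X | 2 | X | X | X | ≥3 |
| 7 | 1 | ≥1 | X | 1 | ≥1 | X | ≥1 | 2 |
| 6 | 1 | 0 | ≥1 | 1 | 0 | ≥1 | 0 | 2 |
| 5 | 1 | 0 | 0 | 1 | 0 | 0 | ≥2 | 1 |
| 4 | 0 | 2 | X | 0 | 2 | X | 1 | 1 |
| 3 | 0 | 1 | X | 0 | 1 | X | 0 | 1 |
| 2 | 0 | 0 | ≥2 | 0 | 0 | ≥2 | ≥2 | 0 |
| 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Table 4-6. ZC context table for directional subbands

| Directional subband | LH 4-0 | | | LH 4-1 | | | HL 4-2 | | | HL 4-3 | | | LH 4-3, HL 4-1, HH 4-0, HH 4-2 | | | LH 4-2, HL 4-0, HH 4-1, HH 4-3 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Context | D2+H | V | D1 | D1+H | V | D2 | D2+V | H | D1 | D1+V | H | D2 | D1 | H+V | D2 | D2 | H+V | D1 |
| 8 | ≥2 | X | X | ≥2 | X | X | ≥2 | X | X | ≥2 | X | X | 2 | X | X | 2 | X | X |
| 7 | 1 | ≥1 | X | 1 | ≥1 | X | 1 | ≥1 | X | 1 | ≥1 | X | 1 | ≥1 | X | 1 | ≥1 | X |
| 6 | 1 | 0 | ≥1 | 1 | 0 | ≥1 | 1 | 0 | ≥1 | 1 | 0 | ≥1 | 1 | 0 | ≥1 | 1 | 0 | ≥1 |
| 5 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 4 | 0 | 2 | X | 0 | 2 | X | 0 | 2 | X | 0 | 2 | X | 0 | ≥2 | X | 0 | ≥2 | X |
| 3 | 0 | 1 | X | 0 | 1 | X | 0 | 1 | X | 0 | 1 | X | 0 | 1 | X | 0 | 1 | X |
| 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 |
| 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Fig. 4-13 shows the frequency responses of the WBCT directional filters. We notice the aliasing phenomenon in WBCT [69]. Because the directional filters are not ideal filters, their outputs contain aliasing components. Thus, the outputs of a given filter are distributed not only along one direction but also along another direction (with less energy). Consequently, the context model in arithmetic coding becomes less accurate and its coding efficiency is reduced. We may reduce aliasing by adopting a sharper (and thus longer) filter, but the computation time would then increase.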

Fig. 4-13. Frequency magnitude responses of (a) LH 4-0, (b) LH 4-2, (c) HH 4-0, and (d) HH 4-2. Axes: Fx (π radian) and Fy (π radian).

In document: Image and Video Coding Based on Directional Wavelet Transform (pp. 78-88)