An improved pyramid algorithm for synthesizing 2-D discrete wavelet transforms


Academic year: 2021


AN IMPROVED PYRAMID ALGORITHM FOR SYNTHESIZING 2-D DISCRETE WAVELET TRANSFORMS

Chu Yu and Sao-Jie Chen

Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.

Abstract

The pyramid algorithm (PA) has been shown to be very suitable for computing 2-D forward and inverse discrete wavelet transforms (DWT). In this paper, we present a new 2-D synthesis PA that remedies some defects of the classical PA, which usually requires large latency, long computation time, and a large memory space. Unlike the classical PA, which computes a 2-D IDWT level by level, our proposed algorithm performs a 2-D IDWT in word-size units. Thus, for processing an N x N 2-D IDWT with m levels and L-tap filters, the proposed algorithm needs a latency of 3m+4 clock cycles, computes in only N^2 clock cycles, and uses 2NL+4(m-1) memory space.

1. INTRODUCTION

In recent years, there have been a number of studies on wavelet transforms for signal analysis and synthesis [1-3]. Many algorithms for computing wavelet transforms have been proposed. For instance, Mallat presented 1-D and 2-D DWT/IDWT pyramid algorithms. Later, Vishwanath et al. [5, 6] developed the recursive pyramid algorithm (RPA) for the 1-D DWT/IDWT, and proposed the modified recursive pyramid algorithm (MRPA) to improve on it. Soon after that, Chakrabarti et al. [6, 7] described an extension of the 1-D MRPA to the 2-D case.

The 2-D inverse discrete wavelet transform (IDWT), whose block diagram is illustrated in Fig. 1, can be implemented by using a pyramid algorithm (PA) [1, 4].

However, this 2-D PA is a separable extension of the 1-D PA that computes the 2-D IDWT in a level-by-level manner, so it incurs large latency, long computation time, and a large memory space. Owing to these shortcomings, the PA is unsuited both for hardware realization and for real-time signal processing. To overcome these shortcomings, we present an improved pyramid algorithm for the 2-D IDWT, which has low latency, near-optimal computation time, and a smaller memory requirement. As a result, the proposed algorithm is one of the better choices for scheduling the computations of 2-D IDWT's.


2. THE PROPOSED ALGORITHM

The 2-D pyramid algorithm was developed by Mallat [1, 4]. The algorithm can be used to implement the 2-D IDWT as shown in the following pseudocode:

begin {2-D synthesis PA}
    input: x(1..N, 1..N);
    for (m = log(N) to 1)
        [Do the separable mth level of 2-D filterings after all the separable (m+1)th level of 2-D filterings have been done];
end {2-D synthesis PA}

In the above algorithm, we assume that the first level has the finest resolution and the log(N)-th level the coarsest resolution. Clearly, this synthesis algorithm computes the 2-D IDWT in a level-by-level manner. The advantage of the PA is its ease of implementation. However, it has some defects. For instance, it requires a large amount of memory to store the intermediate results between the row and column transforms, and its latency is long because the first output datum is produced only after the previous log(N)-1 levels of the 2-D IDWT have been completely generated. Another defect is its long computation time. For an N x N input image processed through m levels of 2-D IDWT's, the computational complexity of iterating on the lowpass-only filtering is given by:

According to the above equation, the upper bound on the number of lowpass (or highpass) operations is 8N^2/3. Thus, the PA is unsuited for real-time signal processing because of its long latency and computation time. Moreover, it is unsuited for single-chip VLSI implementation because it requires a large amount of memory.
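As a rough check on the 8N^2/3 bound quoted above, one series that reproduces it is sketched here. The per-level cost of 2N^2/4^(j-1), meaning one row pass plus one column pass over a block whose side halves at each coarser level j, is an assumption introduced for illustration and is not a formula taken from the text.

```latex
\[
  \sum_{j=1}^{m} \frac{2N^{2}}{4^{\,j-1}}
  \;=\; 2N^{2}\cdot\frac{1-4^{-m}}{1-\tfrac{1}{4}}
  \;=\; \frac{8N^{2}}{3}\left(1-4^{-m}\right)
  \;<\; \frac{8N^{2}}{3}.
\]
```

Under that assumption the count approaches, but never reaches, 8N^2/3 as the number of levels m grows.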

In order to alleviate these defects, we present a novel algorithm, called the Recursive Quarter-Tree Pyramid Algorithm (RQTPA), to improve the performance of the classical PA. The RQTPA also performs the 2-D interpolation filtering with a separable approach, but it does not synthesize the 2-D IDWT level by level.

Unlike the classical PA, the proposed algorithm breaks down each subband of data into many subblock units, as shown in Fig. 2, to synthesize the corresponding level. The size of a subblock used to synthesize an nth level in total m levels is equal to 4 - 1 1

.

Because the algorithm computes the 2-D IDWT from the coarsest level to the finest level in each of the subblock units, the total storage requirement between levels is

C::'_;L~JM-~~

.

In order to reduce this large storage requirement, we again break down the larger subblock into only one input datum for each level. Based on this idea, the total storage requirement becomes 4(m-1). Finally, the


begin {2-D synthesis PA}
    input: x(1..N, 1..N);
    for (p = 1 to N^2/4^log(N))
        do RQTPA(log(N));
end {2-D synthesis PA}

RQTPA(m)
begin {recursive quaternary-tree PA}
    if (m > 0) then
        [Feed four subbands (HH^m, HG^m, GH^m, GG^m) with one input datum into a 2-D filter to synthesize four data of HH^(m-1), as shown in Fig. 1];
        RQTPA(m-1);
        RQTPA(m-1);
        RQTPA(m-1);
        RQTPA(m-1);
end {recursive quaternary-tree PA}
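The recursion above can be illustrated with a minimal Python sketch. It assumes 2-tap Haar filters, so that one datum from each of the four subbands yields exactly four output data with no delay line; the paper itself targets general L-tap filters. All function names here are hypothetical, and the forward transform `analyze` exists only to produce test input.

```python
# Sketch only: 2-tap Haar filters are an assumption made for brevity.
# Function names are hypothetical and do not come from the paper.

def haar_analyze_block(p, q, r, s):
    """Split a 2x2 block [[p, q], [r, s]] into (HH, HG, GH, GG) coefficients."""
    return ((p + q + r + s) / 2, (p - q + r - s) / 2,
            (p + q - r - s) / 2, (p - q - r + s) / 2)


def haar_synthesize_block(hh, hg, gh, gg):
    """Inverse of haar_analyze_block: four coefficients give four data."""
    return ((hh + hg + gh + gg) / 2, (hh - hg + gh - gg) / 2,
            (hh + hg - gh - gg) / 2, (hh - hg - gh + gg) / 2)


def analyze(image, levels):
    """Level-by-level forward transform, used here only to build test input."""
    details, low = {}, image
    for level in range(1, levels + 1):
        n = len(low) // 2
        bands = [[[0.0] * n for _ in range(n)] for _ in range(4)]
        for i in range(n):
            for j in range(n):
                coeffs = haar_analyze_block(
                    low[2 * i][2 * j], low[2 * i][2 * j + 1],
                    low[2 * i + 1][2 * j], low[2 * i + 1][2 * j + 1])
                for band, value in zip(bands, coeffs):
                    band[i][j] = value
        # bands[0] is the low-low (HH) subband fed to the next coarser level.
        low, details[level] = bands[0], tuple(bands[1:])
    return low, details


def rqtpa(level, hh, row, col, details, out):
    """Synthesize one datum at `level`, then recurse into its four children."""
    if level == 0:
        out[row][col] = hh  # level 0 holds the reconstructed image samples
        return
    hg, gh, gg = (band[row][col] for band in details[level])
    block = haar_synthesize_block(hh, hg, gh, gg)
    for (dr, dc), value in zip(((0, 0), (0, 1), (1, 0), (1, 1)), block):
        rqtpa(level - 1, value, 2 * row + dr, 2 * col + dc, details, out)


def synthesize(low, details, levels):
    """Run the recursion once per datum of the coarsest low-low subband."""
    size = len(low) * 2 ** levels
    out = [[0.0] * size for _ in range(size)]
    for i, row in enumerate(low):
        for j, value in enumerate(row):
            rqtpa(levels, value, i, j, details, out)
    return out
```

Only the four data passed down the current branch are live at each level, which is the intuition behind the 4(m-1) inter-level storage figure; the sketch does not model the row and column delay lines that longer filters would need.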

The above algorithm can be expanded as a quaternary tree, as shown in Fig. 3. The root of the tree corresponds to the coarsest level of the 2-D IDWT, each leaf node represents a finest level, and the remaining levels are the internal nodes of the tree. The tree is visited by pre-order traversal because this order matches the data dependency of the synthesis DWT. For example, if the coarsest level is three, the traversal sequence for the proposed algorithm is 3-2-1111-2-1111-2-1111-2-1111, where each number stands for a resolution level. Note that the coarsest level takes four subbands with one input datum, whereas every other level takes only three subbands with one input datum, i.e., the low-high (HG), high-low (GH), and high-high (GG) subbands. The remaining subband of input data, the low-low (HH), comes from the filtering output of the previous coarser level.
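The traversal order described here can be reproduced with a short Python sketch. The generator name `preorder_levels` is hypothetical; it mirrors the pseudocode's structure of one visit followed by four recursive calls.

```python
def preorder_levels(level):
    """Yield the resolution level of each quaternary-tree node in pre-order."""
    if level > 0:
        yield level                      # visit the node (one 2-D filtering step)
        for _ in range(4):               # then descend into its four children
            yield from preorder_levels(level - 1)


sequence = list(preorder_levels(3))
print("-".join(str(level) for level in sequence))
```

For a coarsest level of three this visits 1 + 4 + 16 = 21 nodes, one per filtering step applied to a single coarsest-level input datum.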

For processing an N x N image, this algorithm needs N^2 clock cycles to compute m levels of 2-D IDWT's. Since the first output appears only when all the levels of filterings have been computed, the latency is 3m+4 clock cycles. The memory requirement is 2NL+4(m-1), where L is the filter length and 2NL is the size of the delay line used between the row and column filters. These performance figures indicate that the proposed algorithm can be applied in real-time signal processing. In summary, the algorithm has the following features:

(1) Fast computation time.
(2) Low requirement of memory space.
(3) Low latency.
(4) Suitable for real-time signal processing due to features (1) and (3).
(5) Suitable for single-chip VLSI implementation due to feature (2).
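To make the stated figures concrete, the following Python sketch evaluates the three formulas quoted above for the RQTPA. The function name `rqtpa_costs` and the example sizes (a 512 x 512 image, 3 levels, 4-tap filters) are assumptions chosen for illustration, not values from the paper.

```python
def rqtpa_costs(n, m, taps):
    """Evaluate the RQTPA figures quoted in the text for an n x n image.

    n: image side length, m: number of IDWT levels, taps: filter length L.
    """
    return {
        "latency_cycles": 3 * m + 4,            # latency of 3m+4 clock cycles
        "compute_cycles": n * n,                # N^2 clock cycles
        "memory_words": 2 * n * taps + 4 * (m - 1),  # 2NL delay line + 4(m-1)
    }


# Hypothetical example: 512 x 512 image, 3 levels, 4-tap filters.
print(rqtpa_costs(512, 3, 4))
```

In this example the 2NL delay line accounts for 4096 of the 4104 memory words, so the inter-level storage of 4(m-1) is a small part of the total.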


3. PERFORMANCE EVALUATION

The performance comparison between our proposed algorithm and the classical pyramid algorithm is summarized in Table 1. This comparison is performed using the same row and column filter structures. The classical pyramid algorithm requires large latency, long computation time, and a large memory space, all of which are proportional to N^2 for an N x N image. On the other hand, our proposed algorithm requires a low latency of 3m+4, around N^2 computation clock cycles, and 2NL+4(m-1) memory space for m levels of 2-D IDWT computations with filters of length L.

4. CONCLUSION

A novel and efficient pyramid algorithm for the 2-D inverse DWT has been formulated in this paper. The proposed algorithm overcomes some defects of the classical PA, so it is suited for scheduling a fast 2-D IDWT computation and for a low-cost hardware implementation. Based on this algorithm, we will implement a real-time single-chip design for DSP applications in the future.

5. ACKNOWLEDGMENTS

This work was supported by the National Science Council, ROC, under Grant NSC 88-2215-E002-037.

References

[1] S. Mallat, “A theory for multiresolution signal decomposition: The wavelet representation,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 11, no. 7, pp. 674-693, July 1989.

[2] O. Rioul and M. Vetterli, “Wavelets and signal processing,” IEEE Signal Processing Magazine, vol. 8, no. 4, pp. 14-38, Oct. 1991.

[3] I. Daubechies, Ten Lectures on Wavelets, vol. 61 of CBMS-NSF Regional Conferences Series in Applied Mathematics, SIAM, Philadelphia, PA, 1992.

[4] S. Mallat, “Multifrequency channel decompositions of images and wavelet models,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, no. 12, pp. 2091-2110, Dec. 1989.

[5] M. Vishwanath, “The recursive pyramid algorithm for the discrete wavelet transform,” IEEE Trans. Signal Processing, vol. 42, no. 3, pp. 673-676, Mar. 1994.

[6] C. Chakrabarti and M. Vishwanath, “Efficient realizations of the discrete and continuous wavelet transforms: from single chip implementations to mappings on SIMD array computers,” IEEE Trans. Signal Processing, vol. 43, no. 3, pp. 759-771, Mar. 1995.

[7] R.M. Owens and M. Vishwanath, “A very efficient storage structure for DWT and IDWT filters,” Journal of VLSI Signal Processing, vol. 19, no. 3, pp. 215-225, Aug. 1998.


Fig. 3 Three levels of quaternary tree.

Table 1 Performance comparison.

Algorithm   Latency   Period   Memory Space   Control Unit
PA          N^2       QN2      N^2            simple
RQTPA       3m+4      N^2      2NL+4(m-1)


Fig. 1 One level of 2-D IDWT.
