Two-variable modularized fast polynomial transform algorithm for2-D discrete Fourier transforms

(1)

84 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 2, NO. 1, MARCH 1992

Express Letters

Two-Variable Modularized Fast Polynomial

Transform Algorithm for 2-D

Discrete Fourier

Transforms

Ja-Ling Wu and

Y.

M. Huang

Abstract-In this paper, a novel two-variable modularized fast poly- nomial transform (FPT) algorithm is presented. In this new method, only fast polynomial transforms and fast Fourier transforms of the same length are required. The modularity, regularity, and easy extensi- bility of the proposed algorithm make it of great practical value in computing multidimensional discrete Fourier transforms (DIT’S).

1. INTRODUCTION

Multidimensional (Multi-D) discrete Fourier transform (DFT) is an indispensable tool for Multi-D digital signal processing, particu- larly in the processing of video and computer tomographic signals. Moreover, Multi-D Fourier transforms arise not only naturally from problems that are intrinsically multi-D but also artificially as a way

of computing a long length 1-D transform [ 11.

Nussbaumer and Quandalle [2] showed that the Multi-D DFT can be converted into a number of 1-D DFT’s by using the polynomial representation of extension fields, the so-called fast polynomial transform (FPT). This transform is very efficient because it is accomplished using ordinary arithmetic without multiplication. However, because all the operations of the FPT are defined in the distinct cyclotomic factors of the polynomial ZN

-

1, one has to

use FFT’s and FPT’s with different lengths to compute the 2-D (or

Multi-D) DFT [3].

Truong et al. [4] have recently proposed a modularized FPT

structure for computing 2-D cyclic convolutions. In this new method, the 1-D cyclic polynomial convolution is decomposed into several

cyclic convolution of polynomials, all of the same length. Wu and

Huang [5] have generalized the modularized FPT algorithm to deal with the 2-D digital signal processing tasks more recently. The regularity of these new algorithms simplifies the control and implementation of multiprocessor and/or VLSI FPT.

In this paper, a novel two-variable modularized FPT algorithm, based on the idea of 141 and the factorization of multivariable polynomial ring suggested by Pitas and Strintzis [6], for computing 2-D DFT is presented. The modularity, regularity, and easy extensibility of the proposed algorithm make it of great practical value in computing multidimensional DFT’s.

II.

TWO-VARIABLE POLYNOMIAL REPRFSENTATION OF

Without loss of generality, let us consider the 2-D DFT of the 2-D DFT’s

NI x N2 complex sequence Xn l,n z defined by NI-1 N z - 1

n , = O n 2 = 0

-

x k l , k , =

C C

Xn,,n2W.:klW;;ZZkZ (1) Manuscript received May 28, 1991; revised November 27, 1991. The paper was recommended by Associate Editor Yrjo Neuvo.

The authors are with the Department of Computer Science and Informa- tion Engineering, National Taiwan University, Taipei, 10764, Taiwan, Republic of China.

IEEE Log Number 9106097.

where

4

= 2“‘) and WN, = e-j2*INi, for i = 1 , 2 , ml 5 m2 and

j =

a.

Since the periodic sequence obtained by sampling the Z

transform at N equally spaced points on the unit circle is identical to the DFT coefficients of the period sequence, (1) can be reformulated as

Zkl,

k ,

=

X(

Y,

z)

mod ( Y -

w , ; ) ,

( Z -

w , ; )

(2) with N I - ] N z - 1 X ( Y , Z )

=

- I ) , n l = O n z = O ( z N 2 - 1). (3)

Thus, the computation of (1) depends on the computation of the polynomial substitution presented by (2) and consequently on the

polynomial reduction defined by (3). Now, if the polynomials

( YNl

-

1) and (ZNz

-

1) are factorized into coprime factors, then with the aid of the Chinese remainder theorem (CRT) of polynomials [7, p. 611, a parallel/concurrent 2-D DFT algorithm can be obtained.

Pitas and Strintzis [6] have presented an Multi-D cyclic convolution algorithm, based on the factorization of multivariable polyno-

mial rings, which achieves the theoretically minimum number of

multiplications. However, the shortcomings of length variation still exists in this approach. For VLSI or hardware parallel implementation of an algorithm, reduction of the number of multipliers is just one of the important factors that need to be considered. Regularity and modularity in the computation structure may even play a more important role in the practical considerations. This is what the modularized FPT algorithms worked.

In Section

III,

we integrated the techniques presented in [4] and [6] to derive a novel two-variable modularized FPT algorithm for computing 2-D DFT’s. Since the algorithms for Multi-D Fourier transforms are studied in the easy way by studying the case of the 2-D Fourier transform as a prototype the discussion and formulas for the proposed 2-D algorithm can be extended directly to computing Multi-D DFT’s.

m.

TWO-VARIABLE FPT ALGORITHM FOR COMPUTING

2-D DFT’s

Before deriving the proposed algorithm, an important theorem Theorem 1: Let d ( x ) be a multiple of g( x ) . Then for any a ( x ) ,

about the computation of a residue polynomial is given first.

a( x ) mod g( x )

=

( a ( x ) mod d ( x ) ) mod g( x ) where a( x ) , d( x ) , and g( x ) are polynomials.

The proof of Theorem 1 can be found in [7, p. 511.

Now define x , ( Y , z ) = X ( Y , Z ) m o d ( Y N l / 2 + 1 ) , ( z N z - - 1) (4a) X 2 ( Y . Z ) = X ( Y , Z ) m o d ( Y N 1 / z - l ) , ( Z N z - 1). (4b) Since YNl’*

+

1 =

n

_{( Y}_-

_{w i ; )}

k,:odd 1051-8215/92$03.00 0 1992 IEEE

(2)

and Y N I P - 1 =

n

( y - w.;) (5b) k , :even it follows that - Xk,,k2 Xi( y , z ) m o d ( y - w:;),

( z

-

w:;)

( 6 )

where i = 1 and 2 for odd- and even-numbered k , , respectively.

Based on the ideas given in [4] and [ 6 ] , the polynomial ( ZN2 - 1) can be factorized, in the quotient field C( Y ) / YNl/'

+

1, as follows: 3 k=O

)

z N 2 - 1 =

n

( z N 2 / 4 - y k N 1 / 4 + 1) = ( z N 2 /4 - 1)

(

Z N Z /4 - y N 1 '4)

(

Z N Z 14

.

( Z N 2 1 4 + y N 1 1 4 ) . (7)

It is easy to verify that Theorem 2 holds. Theorem 2: i)

z

-

w,$

ii)

z

- W$ iii)

z

-

w:;

z -

w:;

z -

w

:

;

4 q + 1 4 9

+

3 4 q + 1 4 9

+

3 iv) Z -

w:;

ZN214 - 1, for k, = 4 9 ZN2/4

+

1, for k , = 4 9

+

2

ZN2I4 - Y y l 4 , for k , = 4 p

+

1 and k, =

Z N 2 / 4

+

Y N 1 I 4 , for k , = 4 p

+

1 and k, =

ZNZi4

+

Y N 1 / 4 , for k , = 4p

+

3 and k, =

ZN2I4 - YNIl4, for k , = 4p

+

3 and k, =

where O s p s ( N , / 4 ) - 1, O r q s ( N , / 4 ) - 1, and a16

means that a is divisible by 6,

From the different forms of k , and k,, there are six types of polynomial reductions need to be derived. However, for simplicity only three of them are detailed in the following and the others can be obtained similarly.

(CASE l ) k , = 4p

+

1, k, = 4 9

+

1 or

k , = 4p

+

3, k, = 4 9

+

3 In this case,

z

-

w:;

I

~ ~ - Y ~ I / ~ , 2 / the polynomial ~

XI( Y,

Z)

can be reduced modulo ( YN1/,

+

l ) ( Z N z / 4 - YNl 1 4 ) to become

Since k, is odd in this case, it follows that, by using the substitution of Y = Z N 2 k 1 / N 1 k 2 , ( 8 ) can be reformulated into the following three polynomial equations:

.

mod

(

ZN2/4

+

j ) _{( 9 4}

Now let k , = 2 s

+

1 , O 5 S 5 (NI /2) - 1, then

(NI 12)- 1

Equation (10) can be viewed as an odd-frequency polynomial trans-

form and it can be computed by an efficient radix-2 FFT-like

algorithm just like the odd-frequency DFT [8].

(CASE2)k, = 4 p

+

1 , k, = 4 9

+

1 or

k , = 4 p

+

3 , k, = 4 9

+

2. In this case,

z

-

w:;

I

~ ~- 1, the polynomial 2 / ~ x , ( Y ,

Z)

can be reduced modulo ( y N l I Z

+

i)(zN2/4 - 1) to become

x ~ ( Y , Z ) = x , ( Y , z ) m o d ( Y N I / z

+

1 ) , ( ~ ~ - 21

1

/ ~

then

Similar to case 1, the following polynomial equations are obtained.

(NI /2)- 1

X p , ( k 2 + l ) k l ( Z )

Xp,

n , ( z ) ( z N z / N l ) " l k l

n , = O

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 2, NO. 1, MARCH 1992 85

presented in which only fast polynomial transforms, fast Fourier

Equation (12a) defines the (NI /2)-point polynomial transform in

the polynomial field modulo ( Z N 2 / 4

+

j ) . FFT-like algorithms can also be applied to the computation of (12a). And (12.3) is actually an (N, /4)-point odd frequency DFT.

(CASE3) k , iseven Let Y2( Y ,

z )

=

'E

2

X 2 , n l , n 2 y n 1 Z n 2 n , = O n 2 = 0 then (NI /2)- 1 Nq- 1 * m o d ( Y N l / 2 + 1 ) , ( z N 2 - 1 ) (14.1)

With the aids of (7) and Theorem 2, and following the same

derivations given case 1 and case 2 one can once again compute

y k l , k 2 ,

for k , even, very efficiently. The flowchart of this new algorithm is shown in Fig. 1. It is obvious from Fig. 1 that for the

computation of an NI X N2 2-D DFT, only FPT's of length NI /2

and FFT's of length N2 /4 are required.

IV.

CONCLUSIONS AND DISCUSSIONS

In this paper, a novel two-variable modularized FPT algorithm is

(3)

86

mod mod mod

yNfl + 1, zw4-1 yNJ2 + 1, zW4-YNd4 + 1, ZW4+l

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 2, NO. 1, MARCH 1992

mod

Pa

+ 1, zNJ4+91'4

+

odd-frequency P.T.

length: NI R length: NI R length: NI R length: NI /z

t

(4)

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 2, NO. 1, MARCH 1992 87

transforms of the same length are required. Consequently the modularity and regularity of the proposed algorithm meets the desirable properties of VLSI/parallel implementation of 2-D DFT’s.

As compared with the modularized single-variable FPT algorithm given in [ 5 , Section 51, the newly proposed algorithm has the advantage of better extensibility, that is, the latter can be applied for computing Multi-D DFT’s directly. In other words, the proposed algorithm can be viewed as a powerful tool for computing multi-D DFT’s.

REFERENCES

S. Winograd, “On computing the discrete Fourier transform,” Proc. Nut. Acad. Sci. U.S.A., vol. 73, pp. 1005-1006, 1976.

H. J. Nussbaumer and P. Quandalle, “Fast computation of discrete

Fourier transform using polynomial transforms,” ZEEE Trans.

Acoust., Speech, Signal Processing, vol. ASSP-27, pp. 169-181, April 1979.

H. J. Nussbaumer, “New polynomial transform algorithms for multidimensional DFTs and convolutions,” ZEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 74-84, Feb. 1981. T. K. Truong et al., “An FPT algorithm with a modularized structure of computing 2-D cyclic convolutions,” ZEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 1540-1542, Sept. 1988.

J.-L. Wu and Y. M. Huang, “Modularized fast polynomial transform algorithms for two-dimensional digital signal processing, ” Proc.

I. Pitas and M. G. Strintzis, “Multidimensional cyclic convolution algorithms with minimal multiplicative complexity,” ZEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 384-390, March 1987.

R. E. Blahut, Fast Algorithms for Digital Signal Processing.

Reading, MA: Addison-Wesley, 1985.

G. Bonnerot and M. Bellanger, “Odd-time odd-frequency discrete Fourier transform for symmetric real-valued series,” Proc. ZEEE, IEE, pt. F, vol. 137, pp. 253-261, Aug. 1990.

vol. 64, pp. 392-393, 1976.

High-Order Entropy Coding for Images

KOU-HU TZOU

I. INTRODUCTION

In source coding, the statistical characteristics have been widely utilized to reduce the average bit rate for representing the source. For sources with independent samples, its statistical nature can be fully exploited on a sample-by-sample basis. However, for corre- lated sources, the information of high-order statistics can help to improve the coding efficiency. A form of the higher order statistics is the nth-order conditional probability. In this notion of high-order statistical coding, the optimal word length to code a sample x i is -log, p ( x i / t i ) where p ( x i / t i ) is the nth-order conditional probability and ti is the conditioning state defined as t i = ( x i -

,,

x i - 2 , *

.

,

x i - J. For an nth-order Markov source, the sta-

tistical nature can be fully exploited using this nth-order statistics. *Manuscript received August 1, 1991; revised October 18, 1991. This letter is based on the paper presented at the IEEE Visual Signal Processing and Communication Workshop, June 6-7, 1991, Hsinchu, Taiwan, Republic

of China. This paper was recommended by Associate Editor Yasuhiko Yasuda.

The author was with Bellcore, Red Bank, NJ 07701. He is currently with COMSAT Laboratories, Clarksburg, MD 20871.

IEEE Log Number 9106399.

Nevertheless, the use of high-order statistics may become intractable due to the large number of states.

The coding efficiency of using high-order statistics has been demonstrated in Langdon and Rissanen’s work on black-white docu- ment (binary) compression [l]. In [l], the conditioning state was defined as the combination of 10 pixels in the causal region and a total number of 1024 states were generated from all possible combi- nations of the 10 pixels. The associated coding technique was arithmetic coding, which allows the use of fractional bit to code a sample. The drawback of this high-order conditional probability approach is the requirement of a large memory to store all the measured conditional probabilities. When this approach is extended to gray-level images or cases that utilize even higher order statistics, the memory requirement becomes formidable. In this letter, we use the incremental tree extension method to overcome this problem.

II. INCREMENTAL TREE EXTENSION FOR 2-D DATA

In Langdon and Rissanen’s approach, the 10-pixel states can be represented as the leaves of a 10-bit full binary tree in Fig. 1, where each conditioning pixel is associated with all the branches in a layer of the tree to direct which branch to follow according to the pixel value. Each combination of the 10 pixels corresponds to a specific path traversed from the root to the leaf. Due to characteristics of the image, some states are rarely reached and these states can be pruned from the conditioning tree with a minimum drop in efficiency. However, the pruning approach may not be practical when the depth of the tree becomes very large.

An incremental tree extension method was introduced in 121, [3] to extend the conditioning tree from the root until a desired number of leaves was obtained. The incremental tree extension method is

exemplified in Fig. 2 for M-ary sources. The design procedure

starts with a tree with M initial leaves, and all the M leaves are considered as candidates for extension. The entropy (H( X / t , ) ) associated with the leaf t , is defined as:

M - 1

H ( X / t , ) = -

C

P ( x , = j / t J log P ( X , = j / t J

J = o

where p ( x , = j / t , ) is the conditional probability for sample x ,

having the value j . If the leaf t , is extended to t r k r k = 0; * * ,

M - 1, the corresponding entropy H (

X /

t r k ) can be calculated according to the above equation. The amount of reduction in entropy by extending the leaf t , is

M - 1

A H ( X / t ~ ) = N t , H ( X / t , ) - k = O N t , k H ( X / t , k )

where NI, is the number of occurrences of conditioning state t , and

N,,,

is the number of occurrences of conditioning state t , k . It can be shown that A H ( X / t , ) is always greater than or equal to zero. The strategy to select a leaf for extension is to choose the one that results in the maximum entropy reduction. When a leaf is chosen for

extension, M new leaves are extended from the chosen leaf and the

tree extension process iterates until the desired number of leaves is obtained.

In [2], [3], the high-order conditioning tree is applied to discrete cosine transform (DCT) data, where the 2-D transform data were converted into a 1-D data through a zigzag scan, and the conditional entropy coding essentially dealt with a 1-D signal. In the current approach, we exploit the 2-D correlation by using a 2-D pixel pattern as the conditioning states. In order to meet the causal requirement, only the data that are currently available to the receiv-

1051-8215/92$03.00 0 1992 IEEE