Fast Algorithm and Common Structure Design of Recursive Analysis and Synthesis Quadrature Mirror Filterbanks for Digital Radio Mondiale


An-Kai Li, Sheau-Fang Lei, Wen-Kai Tsai^*, and Shin-Chi Lai^#

Dept. of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan

*Information and Communications Research Laboratories, Industrial Technology Research Institute, Hsinchu, Taiwan

#Dept. of Computer Science and Information Engineering, Nan Hua University, Chiayi, Taiwan ([email protected])

Abstract—This paper proposes a novel fast algorithm and common structure design for the analysis and synthesis quadrature mirror filterbanks (AQMF, SQMF) used by spectral band replication (SBR) in Digital Radio Mondiale (DRM). Building on the recent concept of Lai et al., the work is extended from the viewpoint of recursively computing the AQMF and SQMF coefficients. The proposed method also combines the lifting scheme algorithm with canonical signed digit (CSD) multiplication. The results show that the proposed AQMF algorithm greatly reduces computational complexity. For the recursive kernel computation (N=64), the proposed method reduces multiplications by 46.38% and additions by 20.46%, which compensates for the shortcoming of the proposed SQMF. The overall proposed algorithm (N=64) requires 1984 real multiplications and 128 CSD multiplications, 4704 real additions and 192 CSD additions, and 113 coefficients. It is therefore more efficient than previous works and better suited to DRM applications.

Index Terms—digital radio mondiale (DRM); quadrature mirror filterbanks (QMF); recursive structure

I. INTRODUCTION

Digital Radio Mondiale (DRM) offers several audio codecs, such as advanced audio coding (AAC) and code-excited linear prediction (CELP). In AAC, a spectral band replication (SBR) tool is employed to enhance audio quality by using quadrature mirror filterbanks (QMFs); however, it also introduces a large number of computations.

Huang et al. [2] proposed a fast decomposition of QMF for low-power SBR (LP-SBR) by using DCT-II and DCT-III algorithms. On the other hand, a complex QMF is used to avoid the aliasing effect in high-quality SBR (HQ-SBR). Hsu et al. [3] suggested DCT-IV and FFT-based fast algorithms for calculating four types of QMF definitions. Recently, Huang et al. [4] targeted DCT-IV-based decompositions of the complex analysis QMF (AQMF) and synthesis QMF (SQMF). Chivukula et al. [5] focused on fast algorithm development for Advanced Audio Coding-Enhanced Low Delay (AAC-ELD) and converted the kernel of DCT-IV into that of DCT-II; however, the QMF definitions in high-efficiency AAC (HE-AAC) and AAC-ELD differ, so the fast algorithm in [5] cannot be applied to the DRM standard.

To solve this problem, Lai et al. [6] proposed two new fast computations, i.e., split-radix FFT-based and recursive DFT (RDFT)-based algorithms, for AQMF and SQMF with the lifting scheme (LS) [7]. Considering hardware implementation, resource sharing, and system integration with previous DRM-related designs [8-11], the RDFT-based QMF [6] was suggested because of its regular and flexible properties. Based on the approach of Lai et al., this work extends it from the viewpoint of recursively computing the AQMF and SQMF coefficients. To reduce the computational complexity, both canonical signed digit (CSD) multiplication [12, 13] and LS multiplication [7] are used in the proposed recursive algorithm.

This work was supported in part by the National Science Council, Taiwan under Grant No. NSC 101-2221-E-006-275 and NSC 101-2221-E006-271.

The rest of this paper is organized as follows. Section II derives the proposed AQMF and SQMF algorithms in detail and then presents the common structure design. Section III compares the proposed design with other approaches in terms of various performance indices. Finally, conclusions are given in Section IV.

II. PROPOSED FAST ALGORITHM DERIVATION

A. Derivation of the Proposed Fast Algorithm for AQMF

The definition of the N-point complex AQMF computation is shown in (1), where x[n] is the time-domain real input sequence and X[k] denotes the frequency-domain coefficients. In the DRM specifications, the long and short transform lengths N are set to 64 and 32, respectively.

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{j\frac{\pi}{N}(k+0.5)(2n-0.25)}, \quad 0 \le k \le \frac{N}{2}-1 \tag{1}$$

By separating x[n] into four sections and adjusting the range of index n so that it counts from 0 to N/4-1, equation (1) can be rewritten as (2), where A[k], shown in (3), can be viewed as the post-multiplication applied to X[k]. The sum XR[k] of the four partial results is defined in (4), and X1[k], X2[k], X3[k], and X4[k] are expressed in (5).

$$X[k] = e^{-j\frac{\pi(k+0.5)}{4N}} \times \sum_{n=0}^{N-1} x[n]\, e^{j\frac{\pi}{N}(2k+1)n} = A[k] \times \sum_{m=1}^{4} X_m[k] \tag{2}$$

$$A[k] = e^{-j\frac{\pi(k+0.5)}{4N}} \tag{3}$$

$$X_R[k] = \sum_{m=1}^{4} X_m[k] \tag{4}$$


$$\begin{aligned}
X_1[k] &= \sum_{n=0}^{N/4-1} x[n]\, e^{j\frac{\pi}{N}(2k+1)n} \\
X_2[k] &= \left( j^{k} e^{j\frac{\pi}{4}} \right) \sum_{n=0}^{N/4-1} x\!\left[n+\tfrac{N}{4}\right] e^{j\frac{\pi}{N}(2k+1)n} \\
X_3[k] &= (-1)^{k} j \sum_{n=0}^{N/4-1} x\!\left[n+\tfrac{N}{2}\right] e^{j\frac{\pi}{N}(2k+1)n} \\
X_4[k] &= \left( (-j)^{k} j e^{j\frac{\pi}{4}} \right) \sum_{n=0}^{N/4-1} x\!\left[n+\tfrac{3N}{4}\right] e^{j\frac{\pi}{N}(2k+1)n}
\end{aligned} \tag{5}$$

According to (5), equation (2) can be rewritten as (6), where xa[n] is defined in (7). In (6), the input sequence x[n] first passes through the preprocessing procedure expressed in (7), and the new input xa[n] then goes through the complex kernel computation, which can be implemented by the recursive method discussed in Section II-C.

$$X[k] = A[k] \times \sum_{n=0}^{N/4-1} x_a[n]\, e^{j\frac{2\pi}{N}(k+0.5)n} \tag{6}$$

$$x_a[n] = x[n] + j^{k} e^{j\frac{\pi}{4}}\, x\!\left[n+\tfrac{N}{4}\right] + (-1)^{k} j\, x\!\left[n+\tfrac{N}{2}\right] + (-j)^{k} j e^{j\frac{\pi}{4}}\, x\!\left[n+\tfrac{3N}{4}\right], \quad 0 \le n \le \frac{N}{4}-1 \tag{7}$$
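The folding of (1) into (6) and (7) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function names `aqmf_direct` and `aqmf_folded` are hypothetical, the kernel sum is evaluated directly rather than recursively, and the section weights are taken from (5).

```python
import cmath
import random

def aqmf_direct(x):
    """Direct evaluation of (1): N real inputs -> N/2 complex outputs."""
    N = len(x)
    return [sum(x[n] * cmath.exp(1j * cmath.pi * (k + 0.5) * (2 * n - 0.25) / N)
                for n in range(N))
            for k in range(N // 2)]

def aqmf_folded(x):
    """Evaluation through (6) and (7): fold x[n] into N/4 samples, then apply A[k]."""
    N = len(x)
    Q = N // 4
    ca = cmath.exp(1j * cmath.pi / 4)
    out = []
    for k in range(N // 2):
        # Section weights from (5); they depend only on k mod 4.
        w = [1, (1j ** k) * ca, ((-1) ** k) * 1j, ((-1j) ** k) * 1j * ca]
        # Preprocessing (7): combine the four sections of x[n].
        xa = [sum(w[m] * x[n + m * Q] for m in range(4)) for n in range(Q)]
        # Kernel sum of (6), evaluated directly here for clarity.
        xr = sum(xa[n] * cmath.exp(2j * cmath.pi * (k + 0.5) * n / N) for n in range(Q))
        # Post-multiplication by A[k] from (3).
        out.append(cmath.exp(-1j * cmath.pi * (k + 0.5) / (4 * N)) * xr)
    return out

random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(64)]
err = max(abs(a - b) for a, b in zip(aqmf_direct(x), aqmf_folded(x)))
print("max |direct - folded| =", err)
```

Because the weights depend only on k mod 4, a hardware realization needs only four distinct preprocessed sequences; the sketch recomputes them for every k purely for readability.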

B. Derivation of the Proposed Fast Algorithm for SQMF

The definition of the N-point complex SQMF computation is shown in (8), where X[k] is the frequency-domain complex input sequence and x̂[n] is the time-domain sequence recovered by the synthesis process, with index n ranging from 0 to N-1. Taking the factor exp(-j2π(k+0.5)), which equals -1 for every integer k, out of (8) gives (9).

$$\hat{x}[n] = \mathrm{Re}\left\{ \sum_{k=0}^{N/2-1} X[k]\, e^{j\frac{\pi}{N}(k+0.5)(2n-2N+1)} \right\} \tag{8}$$

$$\hat{x}[n] = \mathrm{Re}\left\{ (-1) \times \sum_{k=0}^{N/2-1} X[k] \times e^{j\frac{2\pi}{N}(k+0.5)(n+0.5)} \right\} \tag{9}$$

Dividing x̂[n] into four sections and adjusting the range of index n to run from 0 to N/4-1, equation (9) can be expressed as (10), where xs1[n], xs2[n], xs3[n], and xs4[n] are individually defined in (11).

$$\hat{x}[n] = \big[\, x_{s1}[n] \;\; x_{s2}[n] \;\; x_{s3}[n] \;\; x_{s4}[n] \,\big] \tag{10}$$

$$\begin{aligned}
x_{s1}[n] &= -\mathrm{Re}\left\{ \sum_{k=0}^{N/2-1} X[k]\, e^{j\frac{2\pi}{N}(k+0.5)(n+0.5)} \right\} \\
x_{s2}[n] &= -\mathrm{Re}\left\{ \sum_{k=0}^{N/2-1} j^{k} e^{j\frac{\pi}{4}}\, X[k]\, e^{j\frac{2\pi}{N}(k+0.5)(n+0.5)} \right\} \\
x_{s3}[n] &= -\mathrm{Re}\left\{ \sum_{k=0}^{N/2-1} (-1)^{k} j\, X[k]\, e^{j\frac{2\pi}{N}(k+0.5)(n+0.5)} \right\} \\
x_{s4}[n] &= -\mathrm{Re}\left\{ \sum_{k=0}^{N/2-1} (-j)^{k} j e^{j\frac{\pi}{4}}\, X[k]\, e^{j\frac{2\pi}{N}(k+0.5)(n+0.5)} \right\}, \quad 0 \le n \le \frac{N}{4}-1
\end{aligned} \tag{11}$$

According to the remainder of k divided by 4, X[k] is separated into four parts, and xs1[n] can be rewritten as the sum of four sequences, as expressed in (12) and (13).

$$x_{s1}[n] = -\mathrm{Re}\left\{ x_{k0}[n] + x_{k1}[n] + x_{k2}[n] + x_{k3}[n] \right\} \tag{12}$$

$$\begin{aligned}
x_{k0}[n] &= s_0[n] \times \sum_{k=0}^{N/8-1} X[4k]\, e^{j\frac{2\pi}{N}(4k)(n+0.5)} \\
x_{k1}[n] &= s_1[n] \times \sum_{k=0}^{N/8-1} X[4k+1]\, e^{j\frac{2\pi}{N}(4k)(n+0.5)} \\
x_{k2}[n] &= s_2[n] \times \sum_{k=0}^{N/8-1} X[4k+2]\, e^{j\frac{2\pi}{N}(4k)(n+0.5)} \\
x_{k3}[n] &= s_3[n] \times \sum_{k=0}^{N/8-1} X[4k+3]\, e^{j\frac{2\pi}{N}(4k)(n+0.5)}
\end{aligned} \tag{13}$$

Note that each of s0[n], s1[n], s2[n], and s3[n] is an exponential factor that depends only on n: substituting 4k+i for k in the first equation of (11) gives si[n] = exp(j2π(i+0.5)(n+0.5)/N) for i = 0, 1, 2, 3. As in (6), equation (13) means that the input sequence first goes through the complex kernel computation, implemented with the recursive method discussed in Section II-C, and is then multiplied by the post-multiplication factor s0[n], s1[n], s2[n], or s3[n].
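The regrouping in (12) and (13) can be checked numerically against the definition (8). The sketch below is illustrative only: the names `sqmf_direct` and `sqmf_sections` are hypothetical, the kernel sums are evaluated directly, and the remaining three output sections are formed by applying the section weights of (11), evaluated at k = 4q + i, to the same four partial sums.

```python
import cmath
import random

def sqmf_direct(X):
    """Direct evaluation of (8): N/2 complex inputs -> N real outputs."""
    N = 2 * len(X)
    return [sum(X[k] * cmath.exp(1j * cmath.pi * (k + 0.5) * (2 * n - 2 * N + 1) / N)
                for k in range(N // 2)).real
            for n in range(N)]

def sqmf_sections(X):
    """Evaluation through (12) and (13), one output sample per section and n."""
    N = 2 * len(X)
    Q = N // 4
    ca = cmath.exp(1j * cmath.pi / 4)
    # weights[m][i]: factor applied to x_ki[n] in output section m (from (11), k = 4q + i).
    weights = [
        [1, 1, 1, 1],
        [ca * (1j ** i) for i in range(4)],
        [1j * ((-1) ** i) for i in range(4)],
        [1j * ca * ((-1j) ** i) for i in range(4)],
    ]
    out = [0.0] * N
    for n in range(Q):
        xk = []
        for i in range(4):
            # Post-multiplication factor s_i[n].
            s_i = cmath.exp(2j * cmath.pi * (i + 0.5) * (n + 0.5) / N)
            # Kernel sum of (13) over the inputs X[4q + i].
            kernel = sum(X[4 * q + i] * cmath.exp(2j * cmath.pi * 4 * q * (n + 0.5) / N)
                         for q in range(N // 8))
            xk.append(s_i * kernel)
        for m in range(4):
            out[n + m * Q] = -sum(weights[m][i] * xk[i] for i in range(4)).real
    return out

random.seed(0)
X = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(32)]
err = max(abs(a - b) for a, b in zip(sqmf_direct(X), sqmf_sections(X)))
print("max |direct - sectioned| =", err)
```

The four kernel sums are computed once per n and reused by all four output sections, which is where the saving over the direct form comes from.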

Following the same procedure, the other sections xs2[n], xs3[n], and xs4[n] of x̂[n] can be expressed as (14), where the constant ca is exp(jπ/4).

$$\begin{aligned}
x_{s2}[n] &= -\mathrm{Re}\left\{ c_a \left( x_{k0}[n] + j x_{k1}[n] - x_{k2}[n] - j x_{k3}[n] \right) \right\} \\
x_{s3}[n] &= -\mathrm{Re}\left\{ j x_{k0}[n] - j x_{k1}[n] + j x_{k2}[n] - j x_{k3}[n] \right\} \\
x_{s4}[n] &= -\mathrm{Re}\left\{ c_a \left( j x_{k0}[n] + x_{k1}[n] - j x_{k2}[n] - x_{k3}[n] \right) \right\}
\end{aligned} \tag{14}$$

C. Derivation of the Proposed Recursive Kernel Computation

For the most complicated kernel computation in both AQMF and SQMF, a fast recursive algorithm is employed to obtain a common structure with fewer iterations. Following the concept of the Goertzel algorithm, equation (6) can be viewed as first convolving xa[N/4-1-n] with the exponential coefficient to obtain XR[k], and then multiplying XR[k] by A[k]. The details are shown in (15) and (16).

$$X[k] = A[k] \times X_R[k] \tag{15}$$

$$X_R[k] = \left( x_a\!\left[\tfrac{N}{4}-1-n\right] \otimes e^{j\frac{2\pi}{N}(k+0.5)n} \right) \tag{16}$$

Taking the Z-transform of (16) gives (17). After expanding (17) and taking the inverse Z-transform, the difference equation relating xa[N/4-1-n] and XR[k] is obtained. Furthermore, the difference equation can be mapped to the recursive structure depicted in Fig. 1.


$$H_A(z) = \frac{X_R(z)}{X_a(z)} = \frac{1 - e^{-j\frac{2\pi}{N}(k+0.5)} \cdot z^{-1}}{1 - 2\cos\!\left(\frac{2\pi}{N}(k+0.5)\right) \cdot z^{-1} + z^{-2}} \tag{17}$$
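For reference, one way to write the difference equation implied by a second-order system function of this form is sketched below. The intermediate state v[m] and the shorthand θ_k = 2π(k+0.5)/N are notation introduced here for illustration and do not appear in the paper; zero initial conditions v[-1] = v[-2] = 0 are assumed.

$$\begin{aligned}
v[m] &= x_a\!\left[\tfrac{N}{4}-1-m\right] + 2\cos(\theta_k)\, v[m-1] - v[m-2], \quad m = 0, 1, \ldots, \tfrac{N}{4}-1 \\
X_R[k] &= v\!\left[\tfrac{N}{4}-1\right] - e^{-j\theta_k}\, v\!\left[\tfrac{N}{4}-2\right]
\end{aligned}$$

Written this way, the loop uses only the real coefficient 2cos(θ_k), and the complex multiplication by exp(-jθ_k) is needed once, after the last iteration.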

By the same procedure, (13) can be processed with the Z-transform and the inverse Z-transform to obtain the difference equation for the proposed SQMF algorithm. Because the four equations in (13) share the same exponential coefficient, exp(j2π(4k)(n+0.5)/N), they have the same recursive system function HS(z), expressed in (18). The recursive kernel structure depicted in Fig. 2 follows from (18).

$$H_S(z) = \frac{1 - e^{-j\frac{2\pi}{N}(4n+2)} \cdot z^{-1}}{1 - 2\cos\!\left(\frac{2\pi}{N}(4n+2)\right) \cdot z^{-1} + z^{-2}} \tag{18}$$

As shown in Fig. 1 and Fig. 2, the same recursive kernel structure serves both filterbanks; only the cosine and exponential coefficients change.
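The shared kernel can be illustrated with a short Python sketch in which one Goertzel-style routine serves both filterbanks and only the angle changes. This is a software illustration under stated assumptions, not the hardware structure of Fig. 1 and Fig. 2: the names `recursive_kernel`, `aqmf_kernel`, and `sqmf_kernel` are hypothetical, floating-point arithmetic is used, and zero initial states are assumed.

```python
import cmath
import math
import random

def recursive_kernel(u, theta):
    """Return sum_n u[n] * exp(j*theta*n) using a second-order recursion."""
    c = 2.0 * math.cos(theta)            # the only coefficient used inside the loop
    v1 = v2 = 0j                         # states v[m-1] and v[m-2]
    for sample in reversed(u):           # time-reversed input, as in (16)
        v1, v2 = sample + c * v1 - v2, v1
    return v1 - cmath.exp(-1j * theta) * v2   # single complex multiply at the end

def aqmf_kernel(xa, k, N):
    """Kernel sum of (6): angle 2*pi*(k + 0.5)/N, input xa[n] for n = 0..N/4-1."""
    return recursive_kernel(xa, 2.0 * math.pi * (k + 0.5) / N)

def sqmf_kernel(Xi, n, N):
    """Kernel sum of (13): angle 2*pi*(4n + 2)/N, input X[4q + i] for q = 0..N/8-1."""
    return recursive_kernel(Xi, 2.0 * math.pi * (4 * n + 2) / N)

random.seed(0)
N = 64
xa = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(N // 4)]
k = 5
direct = sum(xa[n] * cmath.exp(2j * cmath.pi * (k + 0.5) * n / N) for n in range(N // 4))
print("AQMF kernel error:", abs(aqmf_kernel(xa, k, N) - direct))
```

Calling `sqmf_kernel` with the N/8 inputs X[4q+i] of one residue class i reuses exactly the same loop, which mirrors the coefficient-only difference between the two structures.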

Fig.1 Recursive Kernel Structure of AQMF

Fig.2 Recursive Kernel Structure of SQMF

III. DISCUSSION AND COMPARISON

In this section, the proposed algorithm is compared with previous approaches in terms of computational complexity, coefficient requirements, computation cycles, and PSNR. Table I summarizes the analytic results for computational complexity and coefficient requirements of the original computation (i.e., the direct matrix computation), the algorithm by Lai et al. [6], and the proposed algorithm. The results show that the proposed AQMF algorithm using the recursive kernel requires 1216 multiplications, 3328 additions, and 65 coefficients when N is set to 64. Note that the coefficient in the preprocedure is a constant, which can be easily realized by using the CSD technique. The recursive procedure requires N²/4+3N/2 multiplications, 3N²/4+5N/2 additions, and N/2+1 coefficients. The postprocedure requires 3N/2 multiplications, 3N/2 additions, and N/2 coefficients. Compared with the original AQMF computation, the proposed AQMF algorithm reduces multiplications by 70.31%, additions by 17.46%, and coefficients by 98.41% when N is set to 64. Compared with [6], it reduces multiplications by 46.38%, additions by 20.46%, and coefficients by 65.79% when N is set to 64.
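To illustrate how a constant multiplication can be realized with CSD, the sketch below recodes a fixed-point constant into signed digits in {-1, 0, +1} with no two adjacent nonzero digits and multiplies by shift-and-add. It is a generic illustration rather than the scheme of [12] or [13]: the names `to_csd` and `csd_multiply` are hypothetical, and the 12-bit fractional word length and the constant 1/√2 (the magnitude of the real and imaginary parts of exp(jπ/4) in (7)) are assumptions chosen for the example.

```python
import math

def to_csd(value):
    """Return CSD digits (-1, 0, +1) of a non-negative integer, least significant first."""
    digits = []
    while value != 0:
        if value & 1:
            d = 2 - (value & 3)      # +1 when value % 4 == 1, -1 when value % 4 == 3
            value -= d               # leaves a multiple of 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        value >>= 1
    return digits

def csd_multiply(x, digits, frac_bits):
    """Multiply integer x by the CSD-coded constant using shifts and adds only."""
    acc = 0
    for shift, d in enumerate(digits):
        if d:
            acc += d * (x << shift)  # one add or subtract per nonzero digit
    return acc >> frac_bits

FRAC_BITS = 12                                       # assumed word length
const = round((1 << FRAC_BITS) / math.sqrt(2.0))     # fixed-point 1/sqrt(2)
digits = to_csd(const)
print("binary ones :", bin(const).count("1"))
print("CSD nonzeros:", sum(1 for d in digits if d))
print("1000 * 1/sqrt(2) ~", csd_multiply(1000, digits, FRAC_BITS))
```

Each nonzero CSD digit costs one adder or subtractor, so a constant with few nonzero digits replaces a general multiplier with a handful of shift-and-add operations.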

Table I also compares the computational complexity of SQMF for the original method, Lai et al. [6], and the proposed SQMF algorithm. The proposed algorithm requires 768 real multiplications, 1376 real additions, and 48 coefficients. It requires no computation in the preprocedure because no adjustment of the input is needed. The recursive procedure takes N²/8+N real multiplications, N²/4+5N/2 real additions, and N/4 coefficients, and the postprocedure needs 3N real multiplications, 3N real additions, and N/2 coefficients. The overall proposed QMF (AQMF + SQMF) algorithm improves significantly on both the original computation and [6]. Compared with the original computation, the proposed QMF reduces real multiplications by 75.78%, real additions by 41.67%, and coefficients by 98.62%. Compared with [6], it reduces real multiplications by 32.43%, real additions by 17.24%, and coefficients by 52.12%.
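The closed-form counts and percentage reductions quoted in this section can be reproduced with a few lines of Python. The helper names `aqmf_ops`, `sqmf_ops`, and `reduction` are hypothetical; the formulas are the ones stated in the text, CSD operations are not counted, and the reference totals for the original computation and for [6] are copied from Table I (N=64).

```python
def aqmf_ops(N):
    """(multiplications, additions, coefficients) of the proposed AQMF, excluding CSD."""
    rec = (N * N // 4 + 3 * N // 2, 3 * N * N // 4 + 5 * N // 2, N // 2 + 1)
    post = (3 * N // 2, 3 * N // 2, N // 2)
    return tuple(r + p for r, p in zip(rec, post))

def sqmf_ops(N):
    """(multiplications, additions, coefficients) of the proposed SQMF, excluding CSD."""
    rec = (N * N // 8 + N, N * N // 4 + 5 * N // 2, N // 4)
    post = (3 * N, 3 * N, N // 2)
    return tuple(r + p for r, p in zip(rec, post))

def reduction(new, old):
    """Percentage reduction of `new` relative to `old`."""
    return 100.0 * (1.0 - new / old)

N = 64
total = tuple(a + s for a, s in zip(aqmf_ops(N), sqmf_ops(N)))
original_total = (8192, 8064, 8192)   # Table I, TOTAL, original computation
lai_total = (2936, 5684, 236)         # Table I, TOTAL, algorithm of [6]

print("AQMF :", aqmf_ops(N))
print("SQMF :", sqmf_ops(N))
print("TOTAL:", total)
for name, ref in (("original", original_total), ("[6]", lai_total)):
    print(name, [round(reduction(t, r), 2) for t, r in zip(total, ref)])
```

Evaluating the same functions at N=32 gives the short-transform rows of Table I.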

Table II gives a simple analytic comparison of the recursive cycles required by Lai et al. [6] and by the proposed algorithm. The numbers of recursive cycles of the proposed AQMF and SQMF algorithms are proportional to N²/8 and N²/16, respectively, whereas those in [6] are proportional to N²/4 and N²/8, respectively. For N=64, this amounts to 832 cycles in total for the proposed algorithm versus 1600 cycles for [6]. As a result, the proposed algorithm can operate in real time at a lower clock rate.

Fig. 3 shows the PSNR simulation results for the proposed algorithm with different coefficient word lengths and internal word lengths under 10⁶ random input test patterns. The input and output word lengths are both set to 16 bits, and the PSNR of the proposed algorithm exceeds 65 dB under the constraints of 24-bit coefficients and a 24-bit internal word length. This result also indicates suitable conditions for a future hardware implementation.
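A PSNR figure of this kind could be computed along the following lines. The paper does not state its PSNR definition, so this sketch rests on assumptions: the peak value is taken as the largest 16-bit signed sample, and `quantize` is a hypothetical rounding model for a given number of fractional bits. The names `psnr_db` and `quantize` are illustrative.

```python
import math

def psnr_db(reference, test, peak=2 ** 15 - 1):
    """PSNR in dB between two equal-length sequences; `peak` is an assumed full-scale value."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def quantize(value, frac_bits):
    """Round a real value to `frac_bits` fractional bits (hypothetical fixed-point model)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale

# Example: error introduced by storing cos(2*pi*(k + 0.5)/N) with fewer fractional bits.
N = 64
exact = [math.cos(2.0 * math.pi * (k + 0.5) / N) for k in range(N // 2)]
for bits in (8, 16, 24):
    approx = [quantize(c, bits) for c in exact]
    print(bits, "fractional bits ->", psnr_db(exact, approx, peak=1.0), "dB")
```

In a full experiment, the reference would be the floating-point filterbank output and the test sequence the output of a fixed-point model with the chosen coefficient and internal word lengths.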

IV. CONCLUSION

In this paper, a novel algorithm and common structure design is proposed for QMFs. Overall, the proposed algorithm requires fewer multiplications and additions than the previous approaches. Additionally, its total number of computation cycles is lower than that of the recent method of Lai et al. [6]. These results should be useful for future SBR applications in DRM systems.


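TABLE I

ANALYTIC RESULTS OF COMPUTATION AND COEFFICIENT REQUIREMENT FOR VARIOUS QMFs (N=64 AND N=32)

| Filterbank | Operation | N | Original | [6] Pre. | [6] Recursive DFT | [6] Post. | [6] Total | Proposed Pre. | Proposed Recursive Kernel | Proposed Post. | Proposed Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AQMF | Mpy | 64 | 4096 | 192 | 1980 | 96 | 2268 | 64 CSD | 1120 | 96 | 1216 + 64 CSD |
| AQMF | Mpy | 32 | 1024 | 96 | 476 | 48 | 620 | 32 CSD | 304 | 48 | 352 + 32 CSD |
| AQMF | Add | 64 | 4032 | 192 | 3896 | 96 | 4184 | 64 CSD | 3232 | 96 | 3328 + 64 CSD |
| AQMF | Add | 32 | 992 | 96 | 920 | 48 | 1064 | 32 CSD | 848 | 48 | 896 + 32 CSD |
| AQMF | Coeff | 64 | 4096 | 128 | 30 | 32 | 190 | 1 CSD | 33 | 32 | 65 + 1 CSD |
| AQMF | Coeff | 32 | 1024 | 64 | 14 | 16 | 94 | 1 CSD | 17 | 16 | 33 + 1 CSD |
| SQMF | Mpy | 64 | 4096 | 96 | 476 | 96 | 668 | 0 | 576 | 192 + 128 CSD | 768 + 128 CSD |
| SQMF | Mpy | 32 | 1024 | 48 | 108 | 48 | 204 | 0 | 160 | 96 + 64 CSD | 256 + 64 CSD |
| SQMF | Add | 64 | 4032 | 160 | 1244 | 96 | 1500 | 0 | 1184 | 192 + 128 CSD | 1376 + 128 CSD |
| SQMF | Add | 32 | 992 | 80 | 364 | 48 | 492 | 0 | 336 | 96 + 64 CSD | 432 + 64 CSD |
| SQMF | Coeff | 64 | 4096 | 32 | 14 | 0 | 46 | 0 | 16 | 32 + 1 CSD | 48 + 1 CSD |
| SQMF | Coeff | 32 | 1024 | 16 | 6 | 0 | 22 | 0 | 8 | 16 + 1 CSD | 24 + 1 CSD |
| TOTAL | Mpy | 64 | 8192 | 288 | 2456 | 192 | 2936 | 64 CSD | 1696 | 288 + 128 CSD | 1984 + 128 CSD |
| TOTAL | Mpy | 32 | 2048 | 144 | 584 | 96 | 824 | 32 CSD | 464 | 144 + 64 CSD | 608 + 96 CSD |
| TOTAL | Add | 64 | 8064 | 352 | 5140 | 192 | 5684 | 64 CSD | 4416 | 288 + 128 CSD | 4704 + 192 CSD |
| TOTAL | Add | 32 | 1984 | 176 | 1284 | 96 | 1556 | 32 CSD | 1184 | 144 + 64 CSD | 1328 + 96 CSD |
| TOTAL | Coeff | 64 | 8192 | 160 | 44 | 32 | 236 | 1 CSD | 49 | 64 + 1 CSD | 113 + 2 CSD |
| TOTAL | Coeff | 32 | 2048 | 80 | 20 | 16 | 116 | 1 CSD | 25 | 32 + 1 CSD | 57 + 2 CSD |

TABLE II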

NUMBER OF RECURSIVE CYCLES FOR VARIOUS QMFs (N=64 AND N=32)

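| Algorithm | AQMF N=32 | AQMF N=64 | SQMF N=32 | SQMF N=64 | TOTAL (AQMF + SQMF) N=32 | TOTAL (AQMF + SQMF) N=64 |
|---|---|---|---|---|---|---|
| [6] | 272 | 1056 | 144 | 544 | 416 | 1600 |
| Proposed | 144 | 544 | 80 | 288 | 224 | 832 |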

Fig.3 Accuracy performance in PSNR of proposed fast algorithm of AQMF.

REFERENCES

[1] ETSI ES 201 980 V3.2.1, Digital Radio Mondiale (DRM); System Specification, France: European Telecommunications Standards Institute, 2012.

[2] S.-W. Huang, T.-H. Tsai, and L.-G. Chen, “Fast decomposition of filterbanks for the state-of-the-art audio coding,” IEEE Signal Process. Lett., vol. 12, no. 10, pp. 693-696, 2005.

[3] H.-W. Hsu, C.-M. Liu, and W.-C. Lee, “Fast complex quadrature mirror filterbanks for MPEG-4 HE-AAC,” Proc. 121st Conv. AES, 2006.

[4] J. Huang, G. Du, D. Zhang, Y. Song, L. Geng, and M. Gao, “VLSI design of resource shared complex-QMF bank for HE-AAC decoder,” IEEE 8th International Conference on ASIC (ASICON '09), pp. 796-799, Oct. 2009.

[5] R. K. Chivukula, Y. A. Reznik, V. Devarajan, and M. Jayendra-Lakshman, “Fast Algorithms for Low-Delay SBR Filterbanks in MPEG-4 AAC-ELD,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 3, pp. 1022-1031, Mar. 2012.

[6] S.-C. Lai, M.-K. Lee, A.-K. Li, C.-H. Luo, and S.-F. Lei, “An Innovative Fast Algorithm and Structure Design for Analysis and Synthesis Quadrature Mirror Filterbanks on the SBR in DRM,” IEEE Transactions on Circuits and Systems-II: Express Briefs, vol. 60, no. 11, pp. 806-810, Nov. 2013.

[7] I. Daubechies and W. Sweldens, “Factoring wavelet transforms into lifting steps,” J. Fourier Anal. Appl., vol. 4, no. 3, pp. 247-269, 1998.

[8] S.-C. Lai, S.-F. Lei, C.-L. Chang, C.-C. Lin, and C.-H. Luo, “Low computational complexity, low power, and low area design for the implementation of recursive DFT and IDFT algorithms,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 12, pp. 921-925, 2009.

[9] S.-C. Lai, W.-H. Juang, C.-L. Chang, C.-C. Lin, C.-H. Luo, and S.-F. Lei, “Low-Computation cycle, Power-Efficient, and Reconfigurable Design of Recursive DFT for Portable Digital Radio Mondiale Receiver,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 8, pp. 647-651, Aug. 2010.

[10] S.-C. Lai, W.-H. Juang, C.-C. Lin, C.-H. Luo, and S.-F. Lei, “High-Throughput, Power-Efficient, Coefficient-Free and Reconfigurable Green Design for Recursive DFT in a Portable DRM Receiver,” International Journal of Electrical Engineering, vol. 18, no. 3, pp. 137-145, June 2011.

[11] S.-C. Lai, Y.-S. Lee, and S.-F. Lei, “Low-Power and Optimized VLSI Implementation of Compact RDFT Processor for the Computations of DFT and IMDCT in a DRM and DRM+ Receiver,” J. Low Power Electron. Appl., vol. 3, no. 2, pp. 99-113, May 2013.

[12] R. Guo and L. DeBrunner, “A Novel Fast Canonical-Signed-Digit Conversion Technique for Multiplication,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, May 22-27, 2011, pp. 1637-1640.

[13] M. A. Soderstrand, “CSD multipliers for FPGA DSP applications,” in Proc. IEEE Int. Symp. Circuits Syst., 2003, pp. 469-472.