
High-SFDR and Multiplierless Direct Digital Frequency Synthesizer

Tze-Yun Sung

Department of Microelectronics Engineering, Chung Hua University, Hsinchu City 300-12, Taiwan, R.O.C.

bobsung@chu.edu.tw

Hsi-Chin Hsin

Department of Computer Science and Information Engineering, National United University, Miaoli 36003, Taiwan, R.O.C.

hsin@nuu.edu.tw

Lyu-Ting Ko

Department of Electrical Engineering, Chung Hua University, Hsinchu City 300-12, Taiwan, R.O.C.

m09601049@chu.edu.tw

Abstract: - This paper presents a hybrid CORDIC (COordinate Rotation DIgital Computer) algorithm for the design and implementation of the direct digital frequency synthesizer (DDFS). The proposed multiplier-less architecture with a small ROM (4×16-bit) and a pipelined data path provides a spurious-free dynamic range (SFDR) of more than 84.4 dBc. An SoC (system on chip) has been designed in a 1P6M CMOS process and then emulated on a Xilinx FPGA. It is shown that the hybrid CORDIC-based architecture is suitable for VLSI implementations of the DDFS in terms of hardware cost, power consumption, and SFDR.

Key-Words: - DDFS, hybrid CORDIC, SoC, FPGA, SFDR.

1 Introduction

The direct digital frequency synthesizer (DDFS) plays a key role in many digital communication systems. Fig. 1 depicts the conventional DDFS, which consists mainly of a phase accumulator, a sine/cosine generator, a digital-to-analog converter, and a low-pass filter. The sine/cosine generator, as the core of the DDFS, is usually implemented by using a ROM lookup table; a higher spurious-free dynamic range (SFDR) requires a larger ROM lookup table [1].

In order to reduce the size of the lookup table, many techniques have been proposed [1]-[4]. The quadrant compression technique can reduce the ROM size by 75% [2]. The Sunderland architecture splits the ROM into two smaller ones [3], and its improved version, known as the Nicholas architecture, results in a higher ROM-compression ratio (32:1) [4]. In [5], the polynomial hyperfolding technique with high-order polynomial approximation was used to design the DDFS. In [6] and [7], the angle rotation algorithm was used to design a quadrature direct digital frequency synthesizer/complex mixer (QDDSM).
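The quadrant compression idea mentioned above can be illustrated with a minimal sketch. The phase width, amplitude width, half-sample phase offset, and all function names below are assumptions made for the illustration; they are not taken from [2] or from the proposed design.

```python
import math

PHASE_BITS = 10                      # assumed accumulator width for this sketch
AMP_BITS = 8                         # assumed amplitude resolution
QUARTER = 1 << (PHASE_BITS - 2)      # entries needed for one quadrant
FULL_SCALE = (1 << (AMP_BITS - 1)) - 1

# Only the first quadrant is stored: 25% of a full-period table.
# The half-sample offset makes the mirrored lookup exactly symmetric.
QUARTER_ROM = [
    round(math.sin(2 * math.pi * (k + 0.5) / (1 << PHASE_BITS)) * FULL_SCALE)
    for k in range(QUARTER)
]

def sine_from_quarter_rom(phase):
    """Map a PHASE_BITS-wide phase word to a signed sine sample."""
    quadrant = (phase >> (PHASE_BITS - 2)) & 0b11
    index = phase & (QUARTER - 1)
    if quadrant & 1:                 # 2nd and 4th quadrants: mirror the index
        index = QUARTER - 1 - index
    value = QUARTER_ROM[index]
    return -value if quadrant & 2 else value   # 3rd and 4th quadrants: negate

def ddfs_samples(increment, count):
    """Phase accumulator followed by the compressed lookup."""
    phase, out = 0, []
    for _ in range(count):
        out.append(sine_from_quarter_rom(phase))
        phase = (phase + increment) & ((1 << PHASE_BITS) - 1)
    return out
```

The two most significant phase bits select the quadrant, so the stored table is one quarter of the full-period table, which is where the 75% saving comes from.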

COordinate Rotation DIgital Computer (CORDIC) is a well known arithmetic algorithm, which evaluates various elementary functions including sine and cosine functions by using simple adders and shifters only. Thus, CORDIC is suitable for the design of high-performance chips with VLSI technologies.
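For reference, a minimal sketch of the conventional rotation-mode CORDIC is given here. It uses floating-point arithmetic for readability, whereas a hardware data path would replace each multiplication by a power of two with a wired shift. The function name and the iteration count are assumptions of the sketch; this is the textbook algorithm, not the hybrid algorithm proposed in this paper.

```python
import math

def cordic_sin_cos(angle, iterations=16):
    """Conventional rotation-mode CORDIC; converges for |angle| up to about 1.74 rad."""
    # Elementary angles atan(2^-i) and the constant gain compensation factor.
    atans = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate toward a zero residual angle
        # The factors 2^-i correspond to right shifts in hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atans[i]
    return y, x                               # (sin, cos)
```

Each iteration needs only additions, shifts, and one small table entry, which is why the algorithm maps well onto adder-and-shifter hardware.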

Recently, the CORDIC algorithm has received considerable attention in the design of high-performance DDFS [8]-[11], especially for modern digital communication systems.

This paper is organized as follows. In Section 2, the hybrid CORDIC algorithm is proposed. In Section 3, the hardware implementation of the DDFS is described. The performance analysis is presented in Section 4. Finally, the conclusion is given in Section 5.

2 The Hybrid CORDIC Algorithm

In this section, the hybrid CORDIC algorithm is proposed, on the basis of which a low-power and high-SFDR DDFS can be developed.

2.1 Modified Angle Recoding Method for CORDIC Algorithm

In order to reduce the number of CORDIC iterations, the input angle can be divided into encoded angles by using the modified Booth encoding (MBE) method [12]. Specifically, let $\psi$ denote the input angle represented by

$$\psi = f(p)\,2^{-p} + f(p+1)\,2^{-(p+1)} + \cdots + f(w-1)\,2^{-(w-1)} \tag{1}$$

where $f(i) \in \{0, 1\}$, $w$ is the word length of operands, and $p = \lfloor (w - 2.585)/3 \rfloor \le i \le w - 1$. The MBE decomposition of $\psi$ is as follows.

$$\psi = \sum_{i=p/2}^{(w-1)/2} \beta(i) \tag{2}$$

where the encoded angle $\beta(i) = \rho(i)\,2^{-i}$ with $\rho(i) \in \{-2, -1, 0, 1, 2\}$. As $\sin\beta(i)$ and $\cos\beta(i)$ can be approximated by

$$\sin\beta(i) \cong \beta(i) = \rho(i)\,2^{-i} \tag{3}$$

The National Science Council of Taiwan, under Grant NSC97-2221-E-216-044, and the Chung Hua University, Hsinchu, Taiwan, under Contract CHU-NSC97-2221-E-216-044 supported this work.

Proceedings of the 8th WSEAS International Conference on Instrumentation, Measurement, Circuits and Systems

$$\cos\beta(i) \cong 1 - \frac{\beta^{2}(i)}{2} = 1 - \rho^{2}(i)\,2^{-(2i+1)} \tag{4}$$

we have

$$\begin{bmatrix} x(i+1) \\ y(i+1) \end{bmatrix} = \begin{bmatrix} 1 - \rho^{2}(i)\,2^{-(2i+1)} & -\rho(i)\,2^{-i} \\ \rho(i)\,2^{-i} & 1 - \rho^{2}(i)\,2^{-(2i+1)} \end{bmatrix} \begin{bmatrix} x(i) \\ y(i) \end{bmatrix} \tag{5}$$

$$z(i+1) = z(i) - \rho(i)\,2^{-i} \tag{6}$$

Fig. 2 shows the proposed architecture for the modified scaling-free CORDIC arithmetic, in which eight shifters, two carry-save adders (CSAs), two carry-lookahead adders (CLAs), two latches, and four MUXs are used; the shifters and MUXs are to determine $\rho(i)$.
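The recoding and the rotation step of equation (5) can be sketched as follows. This is an illustrative floating-point model under stated assumptions, not the authors' hardware: the function names are hypothetical, the angle is assumed to be an unsigned $w$-bit fraction, and each Booth digit with weight $2^{-k}$ is applied through equation (5) with the shift $k$ used in place of $i$.

```python
import math

def booth_radix4_digits(value, bits):
    """Modified Booth recoding: value == sum(d * 4**j), each d in {-2,-1,0,1,2}."""
    digits, prev = [], 0                     # prev holds bit 2j-1 (0 below the LSB)
    for j in range(bits // 2 + 1):           # one extra digit absorbs the last carry
        b0 = (value >> (2 * j)) & 1
        b1 = (value >> (2 * j + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits

def scaling_free_step(x, y, rho, k):
    """Approximate rotation by rho * 2**-k, following the matrix in eq. (5)."""
    c = rho * rho * 2.0 ** -(2 * k + 1)      # 1 - cos(beta) ~ beta^2 / 2
    s = rho * 2.0 ** -k                      # sin(beta) ~ beta
    return x - c * x - s * y, y - c * y + s * x

def rotate_by_word(word, w=16):
    """Rotate (1, 0) by word * 2**-w radians, one Booth digit at a time."""
    x, y = 1.0, 0.0
    for j, rho in enumerate(booth_radix4_digits(word, w)):
        if rho:                              # zero digits need no rotation
            x, y = scaling_free_step(x, y, rho, w - 2 * j)
    return x, y
```

Because every digit is in $\{-2,-1,0,1,2\}$, the products in `scaling_free_step` reduce to shifts and additions in hardware; the residual error of the small-angle approximation grows with the size of the largest encoded angle.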

2.2 The modified scaling-free radix-8 CORDIC Algorithm

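By using the modified angle recoding method [12]-[13], the input angle $\psi$ can be divided as follows.

$$\psi = \sum_{i=p}^{w-1} \varphi(i)\tan^{-1}(2^{-i}) \tag{7}$$

where $\varphi(i) \in \{0, 1\}$, and $w$ is the word length. The CORDIC iteration is therefore represented as

$$\begin{bmatrix} x(i+1) \\ y(i+1) \end{bmatrix} = \begin{bmatrix} 1 & -\varphi(i)\,2^{-i} \\ \varphi(i)\,2^{-i} & 1 \end{bmatrix} \begin{bmatrix} x(i) \\ y(i) \end{bmatrix} \tag{8}$$

$$z(i+1) = z(i) - \varphi(i)\tan^{-1}(2^{-i}) \tag{9}$$

Let $i = 3n - c$; $c \in \{0, 1, 2\}$. By using the Taylor series expansion, the absolute difference between $\tan^{-1}(2^{-(3n-c)})$ and $2^{c}\tan^{-1}(2^{-3n})$ is given by

$$\varsigma = \left|\tan^{-1}(2^{-(3n-c)}) - 2^{c}\tan^{-1}(2^{-3n})\right| = \frac{1}{3}\,2^{-3(3n-c)} + \Lambda \tag{10}$$

where $\Lambda$ is the remaining terms of the difference between $\tan^{-1}(2^{-(3n-c)})$ and $2^{c}\tan^{-1}(2^{-3n})$. Thus, we have

$$\varsigma \cong \frac{1}{3}\,2^{-3(3n-c)} = \frac{1}{3}\,2^{-3i} \tag{11}$$

For $w$-bit operands, $\varsigma$ can be ignored in the following sense

$$\varsigma \le 2^{-w} \tag{12}$$

Based on equations (11) and (12), we have

$$\frac{2^{-3i}}{3} \le 2^{-w} \tag{13}$$

$$3i + \log_2 3 \ge w \tag{14}$$

$$i \ge \left\lceil \frac{w - \log_2 3}{3} \right\rceil \tag{15}$$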
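The bound in equation (15) can be checked numerically. The following sketch, with hypothetical helper names and $w = 16$ chosen to match the word length used later in the paper, evaluates the difference defined in equation (10) directly and compares it with $2^{-w}$.

```python
import math

def radix8_threshold(w):
    """Smallest i with 2**(-3*i) / 3 <= 2**(-w), i.e. the bound of eq. (15)."""
    return math.ceil((w - math.log2(3)) / 3)

def merge_error(n, c):
    """|atan(2^-(3n-c)) - 2^c * atan(2^-3n)|, the difference defined in eq. (10)."""
    return abs(math.atan(2.0 ** -(3 * n - c)) - 2 ** c * math.atan(2.0 ** (-3 * n)))

w = 16
i_min = radix8_threshold(w)

# Every index i = 3n - c at or above the threshold keeps the error within 2^-w.
within_bound = all(
    merge_error(n, c) <= 2.0 ** -w
    for n in range(2, 6)
    for c in range(3)
    if 3 * n - c >= i_min
)
```

For indices below the threshold, for example $n = 2$, $c = 2$ (so $i = 4$), the same difference exceeds $2^{-16}$, which is why the merging is only applied to the low-order part of the angle.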

As a result, when $i > w/3$, three consecutive terms of equation (7) can be integrated into a single term as follows:

$$\begin{aligned} &\varphi(3n-2)\tan^{-1}(2^{-(3n-2)}) + \varphi(3n-1)\tan^{-1}(2^{-(3n-1)}) + \varphi(3n)\tan^{-1}(2^{-3n}) \\ &\quad \cong \left(\varphi(3n-2)\,2^{2} + \varphi(3n-1)\,2^{1} + \varphi(3n)\,2^{0}\right)\tan^{-1}(2^{-3n}) \\ &\quad = \phi(n)\tan^{-1}(2^{-3n}) \end{aligned} \tag{16}$$

where $\varphi(\cdot) \in \{0, 1\}$, and therefore $\phi(n) \in \{0, 1, 2, \ldots, 7\}$. It follows that the resulting radix-8 CORDIC algorithm is represented as

$$\begin{bmatrix} x(i+1) \\ y(i+1) \end{bmatrix} = K_8(i) \begin{bmatrix} 1 & -\phi(i)\,2^{-3i} \\ \phi(i)\,2^{-3i} & 1 \end{bmatrix} \begin{bmatrix} x(i) \\ y(i) \end{bmatrix} \tag{17}$$

$$z(i+1) = z(i) - \phi(i)\tan^{-1}(2^{-3i}) \tag{18}$$

$$K_8(i) = \left(1 + \phi^{2}(i)\,2^{-6i}\right)^{-1/2} \tag{19}$$

The scaling factor $K_8$ is given by

$$K_8 = \prod_{i=p}^{(w-1)/3} K_8(i) \tag{20}$$

It can be shown that the scaling factor turns out to be equal to 1 when the input angle is less than $2^{-w/2}$, and moreover, if the input angle is less than $2^{-w/3}$, equation (18) can be rewritten as [19]

$$z(i+1) = z(i) - \phi(i)\,2^{-3i} \tag{21}$$

Fig. 3 depicts the proposed architecture for the modified scaling-free radix-8 CORDIC arithmetic, in which six shifters, two CSAs, two CLAs, and two latches are used; the shifters and switches are to determine the radices for computations. Note that the number of processors is reduced, and system throughput is increased at the cost of hardware complexity.
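A small floating-point sketch of the radix-8 step is given below. It assumes that the scaling factor is taken as 1 and that the residual angle is updated with the shift-only form of equation (21); the bit pattern, the dictionary representation of the angle bits, and the function names are illustrative assumptions.

```python
import math

def radix8_digit(phi_bits, n):
    """Combine the bits at positions 3n-2, 3n-1, 3n into one digit in 0..7, eq. (16)."""
    return (4 * phi_bits.get(3 * n - 2, 0)
            + 2 * phi_bits.get(3 * n - 1, 0)
            + phi_bits.get(3 * n, 0))

def radix8_step(x, y, z, digit, n):
    """One radix-8 micro-rotation: eq. (17) with K8(i) = 1, and eq. (21) for z."""
    t = digit * 2.0 ** (-3 * n)              # a shift-and-add product in hardware
    return x - t * y, y + t * x, z - t

# Example: a small angle whose non-zero bits lie at positions 10..15.
bits = {10: 1, 11: 0, 12: 1, 13: 1, 14: 1, 15: 0}
angle = sum(b * 2.0 ** -i for i, b in bits.items())
x, y, z = 1.0, 0.0, angle
for n in (4, 5):                             # digits with weights 2^-12 and 2^-15
    x, y, z = radix8_step(x, y, z, radix8_digit(bits, n), n)
```

Two radix-8 steps replace six radix-2 steps here, and for such a small angle the result stays within $2^{-16}$ of the exact sine and cosine even without scaling compensation.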

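2.3 The proposed hybrid CORDIC Algorithm

The input angle $\Omega$ can be decomposed into a higher-angle $\Omega_H$ and a lower-angle $\Omega_L$ represented as

$$\Omega = \Omega_H + \Omega_L = \sum_{i=p/2}^{\lceil u/2 \rceil - 1} \rho(i)\,2^{-2i} + \sum_{i=\lceil u/2 \rceil}^{\lceil w/4 \rceil + \lceil (w-u)/3 \rceil - 1} \phi(i)\,2^{-[u+1+3(i-\lceil w/4 \rceil-1)]} \tag{22}$$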

where $w$ is the word length with the first $u$ bits being the most significant bits; $\Omega_H$ and $\Omega_L$ are computed by using the modified scaling-free CORDIC algorithm and the modified scaling-free radix-8 CORDIC algorithm, respectively. For computation efficiency, $u$ must be an odd number to satisfy the MBE method, and it is determined by the word length $w$ as follows:

$$u = \begin{cases} 2n+1, & \text{if } w = 4n+0 \\ 2n+1, & \text{if } w = 4n+1 \\ 2n+1, & \text{if } w = 4n+2 \\ 2n+3, & \text{if } w = 4n+3 \end{cases} \tag{23}$$

Based on the above equation, the minimum iteration number of the proposed hybrid CORDIC algorithm can be obtained as shown in Fig. 4. The computations of $x(i)$ and $y(i)$ are therefore as follows.

For $p/2 \le i < \lceil u/2 \rceil$,

$$x(i+1) = x(i) - \rho^{2}(i)\,2^{-(4i+1)}\,x(i) - \rho(i)\,2^{-2i}\,y(i) \tag{24}$$

$$y(i+1) = y(i) - \rho^{2}(i)\,2^{-(4i+1)}\,y(i) + \rho(i)\,2^{-2i}\,x(i) \tag{25}$$

For $\lceil u/2 \rceil \le i < \lceil w/4 \rceil + \lceil (w-u)/3 \rceil$,

$$x(i+1) = x(i) - \phi(i)\,2^{-[u+1+3(i-\lceil w/4 \rceil-1)]}\,y(i) \tag{26}$$

$$y(i+1) = y(i) + \phi(i)\,2^{-[u+1+3(i-\lceil w/4 \rceil-1)]}\,x(i) \tag{27}$$
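The choice of $u$ in equation (23) and the resulting number of stages can be sketched as follows. The function names are hypothetical, and the index ranges in `stage_indices` are an assumption about how the two stages are indexed (the first stage from $\lceil p/2 \rceil$ up to $\lceil u/2 \rceil - 1$, the second from $\lceil u/2 \rceil$ up to $\lceil w/4 \rceil + \lceil (w-u)/3 \rceil - 1$), with $p = \lfloor (w - 2.585)/3 \rfloor$.

```python
import math

def split_point(w):
    """Number of most significant bits u assigned to the MBE stage, eq. (23)."""
    n, r = divmod(w, 4)
    return 2 * n + 3 if r == 3 else 2 * n + 1

def stage_indices(w):
    """Iteration indices of the two stages under the indexing assumed above."""
    p = math.floor((w - 2.585) / 3)
    u = split_point(w)
    type_a = list(range(math.ceil(p / 2), math.ceil(u / 2)))
    type_b = list(range(math.ceil(u / 2), math.ceil(w / 4) + math.ceil((w - u) / 3)))
    return type_a, type_b

type_a, type_b = stage_indices(16)
```

One can check that equation (23) always yields the smallest odd number that is not less than $w/2$. For $w = 16$ the sketch gives $u = 9$, three iterations in the first stage, and two in the second, which agrees with the three Type A and two Type B units reported in Section 3.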

3 Hardware Implementation of the Proposed DDFS

In this section, the DDFS implemented by using the hybrid CORDIC algorithm is presented. Fig. 4 shows the 16-bit DDFS architecture consisting mainly of a phase accumulator, a phase calculator, and a sine/cosine generator, which is different from the conventional architecture. The accumulated error in the sine/cosine generator is corrected by using the 4×16-bit correction table. Taking into account DAC technology, hardware cost, and practical applications, the word length of the proposed DDFS is set to 16 bits.

The hybrid CORDIC-based sine/cosine generator with recursively accumulated angle $\vartheta_{in}$ is given by

$$\begin{bmatrix} x(i+1) \\ y(i+1) \end{bmatrix} = \begin{bmatrix} \cos\vartheta_{in} & -\sin\vartheta_{in} \\ \sin\vartheta_{in} & \cos\vartheta_{in} \end{bmatrix} \begin{bmatrix} x(i) \\ y(i) \end{bmatrix} \tag{28}$$

where

$$\vartheta_{in} = \frac{2\pi\,\Delta_{acc}}{2^{16}} \tag{29}$$

and $\Delta_{acc}$ is an integer number. For convergence, the input angle of the scaling-free CORDIC algorithm is restricted as follows:

$$\vartheta_{in} \le \sum_{i=4}^{w-1} 2^{-i} < \frac{1}{8} \tag{30}$$

From the above two equations, we have

$$\Delta_{acc} = \frac{2^{16}\,\vartheta_{in}}{2\pi} < 1304 \tag{31}$$

The architecture for the sine/cosine generator is shown in Fig. 5. According to equation (23), three modified scaling-free CORDIC arithmetic units (MCORDIC-Type A) and two modified scaling-free radix-8 CORDIC arithmetic units (MCORDIC-Type B) are used.
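The numerical bound on $\Delta_{acc}$ can be reproduced with a few lines of arithmetic. The variable names are illustrative, and the sketch only re-evaluates equations (29) to (31) for $w = 16$.

```python
import math

W = 16

# Largest angle allowed by eq. (30): the sum of 2^-i for i = 4 .. w-1.
theta_max = sum(2.0 ** -i for i in range(4, W))

# Largest integer phase word that keeps the angle within theta_max, from eq. (29).
delta_acc_max = math.floor(theta_max * 2 ** W / (2 * math.pi))

# The looser limit obtained from the 1/8 bound alone, as in eq. (31).
delta_acc_limit = 2 ** W / (8 * 2 * math.pi)
```

Here `theta_max` equals $1/8 - 2^{-15}$, so `delta_acc_limit` stays just below 1304, consistent with equation (31).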

The chip is synthesized by the TSMC 0.18 μm 1P6M CMOS cell libraries [14]. The layout view of the proposed DDFS is shown in Fig. 6. The core size obtained by the Synopsys® Design Analyzer is 612 × 612 μm². The power consumption obtained by PrimePower® is 6.05 mW with a clock rate of 100 MHz at 1.8 V. The tuning latency is 8 clock cycles.

All the control signals are internally generated on-chip. The chip provides both high throughput and low gate count.

4 Performance Analysis of the Proposed DDFS

The number of correcting points versus the SFDRs with different (Fs/Fo)’s in the proposed DDFS is shown in Fig. 7. Due to trade-off between hardware cost and system performance, the correcting circuit with 16 points is implemented in the proposed DDFS.

Thus, the SFDR of the proposed DDFS is more than 84.4 dBc. Table 1 compares the proposed DDFS with the methods in [8] and [10].

As one can see, the proposed DDFS is superior in terms of SFDR, hardware cost, and power consumption.
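As an illustration of how an SFDR figure is obtained from output samples, the following sketch computes the ratio between the carrier bin and the largest remaining bin of a discrete Fourier transform. It is a generic estimate on an ideally quantized sine wave, with hypothetical function names and parameters; it does not model the proposed architecture or reproduce the values in Fig. 7.

```python
import cmath
import math

def quantized_sine(n, cycles, bits):
    """n samples covering an integer number of cycles, rounded to a signed word."""
    scale = (1 << (bits - 1)) - 1
    return [round(math.sin(2 * math.pi * cycles * t / n) * scale) for t in range(n)]

def sfdr_dbc(samples):
    """Carrier magnitude over the largest other spectral line, in dB."""
    n = len(samples)
    mags = []
    for k in range(1, n // 2):               # skip DC, keep positive frequencies
        acc = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
        mags.append(abs(acc))
    peak = max(range(len(mags)), key=mags.__getitem__)
    spur = max(m for idx, m in enumerate(mags) if idx != peak)
    if spur == 0:
        return math.inf
    return 20 * math.log10(mags[peak] / spur)

sfdr_8bit = sfdr_dbc(quantized_sine(256, 5, 8))
sfdr_12bit = sfdr_dbc(quantized_sine(256, 5, 12))
```

Choosing an integer number of cycles that is coprime to the record length avoids spectral leakage, so every non-carrier bin reflects amplitude error only.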

5 Conclusion

The hybrid CORDIC-based multiplier-less DDFS architecture with a small ROM and a pipelined data path has been implemented. An SoC designed in a 1P6M CMOS process has been emulated on a Xilinx XC2V6000 FPGA. For the 16-bit DDFS, the SFDR of the sine and cosine outputs using the proposed architecture is more than 84.4 dBc. Simulation results show that the hybrid CORDIC-based approach is superior to the traditional approach to the design and implementation of DDFS in terms of SFDR, power consumption, and hardware cost. The 16-bit DDFS is a reusable IP, which can be implemented in various processes with efficient use of hardware resources for trade-offs among performance, area, and power consumption.

References:

[1] J. Vankka, “Methods of mapping from phase to sine amplitude in direct digital frequency synthesis,” IEEE Proceedings of the Frequency Control Symposium, June 5-7 1996, pp.942-950.

[2] S. C. Yi, K. T. Lee, J. J. Chen, C. H. Lin, “A low power efficient direct digital frequency synthesizer based on new two-level lookup table,” IEEE Canadian Conference on Electrical and Computer Engineering 2006, May 2006, pp.963-966.

[3] D. A. Sunderland, R. A. Strauch, S. S. Wharfield, H. T. Peterson, C. R. Cole, “CMOS/SOS frequency synthesizer LSI circuit for spread spectrum communications,” IEEE Journal of Solid-State Circuits, Vol.19, No.4, August 1984, pp.497-506.

[4] H. T. Nicholas, H. Samueli, B. Kim, “The optimization of direct digital frequency synthesizer performance in the presence of finite word length effects,” IEEE 42nd Annual Frequency Control Symposium, June 1-3 1988, pp.357-363.

[5] D. D. Caro, E. Napoli, A. G. M. Strollo, “Direct digital frequency synthesizers with polynomial hyperfolding technique,” IEEE Transactions on Circuits and Systems-II: Express Briefs, Vol.51, No.7, July 2004, pp.337-344.

[6] D. Fu, A. N. Willson, Jr., “A high-speed processor for digital sine/cosine generation and angle rotation,” in Proc. 32nd Asilomar Conf. Signals, Systems and Computers, Vol.1, 1998, pp.177-181.

[7] A. Torosyan, D. Fu, A. N. Willson, Jr., “A 300-MHz quadrature direct digital synthesizer/mixer in 0.25-μm CMOS,” IEEE Journal of Solid-State Circuits, Vol.38, No.6, June 2003, pp.875-887.

[8] A. Madisetti, A. Y. Kwentus, A. N. Willson, Jr., “A 100-MHz, 16-b, direct digital frequency synthesizer with a 100-dBc spurious-free dynamic range,” IEEE Journal of Solid-State Circuits, Vol.34, No.8, August 1999, pp.1034-1043.

[9] E. Grayver, B. Daneshrad, “Direct digital synthesis using a modified CORDIC,” IEEE International Symposium on Circuits and Systems (ISCAS '98), Vol.5, May 1998, pp.241-244.

[10] C. Y. Kang, E. E. Swartzlander, Jr., “Digit-pipelined direct digital frequency synthesis based on differential CORDIC,” IEEE Transactions on Circuits and Systems I: Regular Papers, Vol.53, No.5, May 2006, pp.1035-1044.

[11] T. Y. Sung, H. C. Hsin, “Design and simulation of reusable IP CORDIC core for special-purpose processors,” IET Computers & Digital Techniques, Vol.1, No.5, Sept. 2007, pp.581-589.

[12] Y. H. Hu, S. Naganathan, “An angle recoding method for CORDIC algorithm implementation,”

[13] T. B. Juang, S. F. Hsiao, M. Y. Tsai, “Para-CORDIC: parallel CORDIC rotation algorithm,” IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol.51, No.8, August 2004, pp.1515-1524.

[14] “TSMC 0.18 CMOS Design Libraries and Technical Data, v.3.2,” Taiwan Semiconductor Manufacturing Company, Hsinchu, Taiwan, and National Chip Implementation Center (CIC), National Science Council, Hsinchu, Taiwan, R.O.C., 2006.

Fig. 1 The conventional DDFS architecture (blocks: phase increment, phase accumulator, sin/cos generator, digital-to-analog converter, low-pass filter)

Table 1 Comparison with previous works

| CORDIC-based DDFS | Madisetti [8], 1999 | Swartzlander [10], 2006 | This work, 2009 |
|---|---|---|---|
| Process (μm) | 1.0 | 0.13 | 0.18 |
| Core area (mm²) | 0.306 | 0.35 | 0.375 |
| Maximum sampling rate (MHz) | 80.4 | 1018 | 100 |
| Power consumption (mW) | 40.602 | 350 | 6.056 |
| Power consumption (mW/MHz) | 0.505 | 0.343 | 0.06 |
| SFDR (dBc) | 81 | 90 | 84.4 |
| Output resolution (bits) | 16 | 16 | 16 |
| Tuning latency (clock cycles) | 16 | -- | 8 |

Fig. 2 The proposed architecture of modified scaling-free CORDIC arithmetic for computing θH (MCORDIC-Type A)

Fig. 3 The architecture of modified scaling-free radix-8 CORDIC arithmetic for computing θL (MCORDIC-Type B)

Fig. 4 The 16-bit DDFS architecture

Fig. 5 The architecture of sine/cosine generator (ϑin is an accumulated angle)

Fig. 6 The layout view of the proposed DDFS

Fig. 7 Plot of the number of correcting points versus SFDRs with different (Fs/Fo)’s (x-axis: Correction Points log2(N); y-axis: SFDR (dBc); curves: Fs/Fo = 32768, Fs/Fo = 16384, Fs/Fo = 128; marked values at log2(N) = 4: -89.69, -168.6, -84.42)
