Tze-Yun Sung
Department of Microelectronics Engineering
Chung Hua University
Hsinchu City 300-12, Taiwan, R.O.C.
bobsung@chu.edu.tw
Hsi-Chin Hsin
Department of Computer Science and Information Engineering
National United University
Miaoli 36003, Taiwan, R.O.C.
hsin@nuu.edu.tw
Lyu-Ting Ko
Department of Electrical Engineering
Chung Hua University
Hsinchu City 300-12, Taiwan, R.O.C.
m09601049@chu.edu.tw
Abstract: - This paper presents a hybrid CORDIC (COordinate Rotation DIgital Computer) algorithm for designs and implementations of the direct digital frequency synthesizer (DDFS). The proposed multiplier-less architecture with small ROM (4×16-bit) and pipelined data path provides a spurious free dynamic range (SFDR) of more than 84.4 dBc. A SoC (system on chip) has been designed by 1P6M CMOS, and then emulated on the Xilinx FPGA. It is shown that the hybrid CORDIC-based architecture is suitable for VLSI implementations of the DDFS in terms of hardware cost, power consumption, and SFDR.
Key-Words: - DDFS, hybrid CORDIC, SoC, FPGA, SFDR.
1 Introduction
The direct digital frequency synthesizer (DDFS) plays a key role in many digital communication systems. Fig. 1 depicts the conventional DDFS, which consists mainly of a phase accumulator, a sine/cosine generator, a digital-to-analog converter, and a low-pass filter. The sine/cosine generator, as the core of the DDFS, is usually implemented by using a ROM lookup table; a higher spurious free dynamic range (SFDR), however, requires a larger ROM lookup table [1].
In order to reduce the size of the lookup table, many techniques have been proposed [1]-[4]. The quadrant compression technique can reduce the ROM size by 75% [2]. The Sunderland architecture splits the ROM into two smaller ones [3], and its improved version, known as the Nicholas architecture, results in a higher ROM-compression ratio (32:1) [4]. In [5], the polynomial hyperfolding technique with high-order polynomial approximation was used to design the DDFS. In [6] and [7], the angle rotation algorithm was used to design a quadrature direct digital frequency synthesizer/complex mixer (QDDSM).
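The 75% saving of quadrant compression follows from the quarter-wave symmetry of the sine function. The following is a minimal sketch of that idea only, not the design of [2]; the function names `quarter_rom` and `sine_from_quarter` and the half-sample phase offset are assumptions made for illustration.

```python
import math

def quarter_rom(phase_bits):
    """Store only the first quadrant: one quarter of the 2**phase_bits entries."""
    n = 1 << phase_bits
    return [math.sin(2.0 * math.pi * (k + 0.5) / n) for k in range(n // 4)]

def sine_from_quarter(rom, phase, phase_bits):
    """Rebuild sin(2*pi*(phase + 0.5) / 2**phase_bits) from the quarter table."""
    n = 1 << phase_bits
    q = n // 4
    quadrant, idx = divmod(phase % n, q)
    if quadrant & 1:          # 2nd and 4th quadrants read the table backwards
        idx = q - 1 - idx
    value = rom[idx]
    return -value if quadrant & 2 else value   # 3rd and 4th quadrants are negated
```

The two most significant phase bits select the quadrant, and the remaining bits address the quarter table, so the table holds one quarter of the entries of a full-wave ROM.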
COordinate Rotation DIgital Computer (CORDIC) is a well known arithmetic algorithm, which evaluates various elementary functions including sine and cosine functions by using simple adders and shifters only. Thus, CORDIC is suitable for the design of high-performance chips with VLSI technologies.
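For reference, the conventional (radix-2, scaled) CORDIC rotation can be sketched as follows. This is the textbook algorithm, not the hybrid algorithm proposed in this paper; floating-point multiplications by powers of two stand in for hardware shifts, and the function name `cordic_sincos` is an assumption.

```python
import math

def cordic_sincos(theta, iterations=16):
    """Return (sin, cos) of theta (|theta| below about 1.74 rad) by shift-and-add steps."""
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta      # pre-scale so that the final vector has unit length
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0       # rotate towards z = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x
```

Each iteration needs only two shifts and three additions, but the result carries a constant gain that must be compensated, which is the cost the scaling-free variants in Section 2 are meant to avoid.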
Recently, the CORDIC algorithm has received considerable attention in the design of high-performance DDFS [8]-[11], especially for modern digital communication systems.
This paper is organized as follows. In Section 2, the hybrid CORDIC algorithm is proposed. In Section 3, the hardware implementation of the DDFS is described. The performance analysis is presented in Section 4. Finally, the conclusion is given in Section 5.
2 The Hybrid CORDIC Algorithm
In this section, the hybrid CORDIC algorithm is proposed, based on which a low-power and high-SFDR DDFS can be developed.
2.1 Modified Angle Recoding Method for CORDIC Algorithm
In order to reduce the number of CORDIC iterations, the input angle can be divided into encoded angles by using the modified Booth encoding (MBE) method [12]. Specifically, let $\psi$ denote the input angle represented by
$$\psi = f(p)\,2^{-p} + f(p+1)\,2^{-(p+1)} + \cdots + f(w-1)\,2^{-(w-1)} \tag{1}$$
where $f(i) \in \{0,1\}$, $w$ is the word length of operands, and $p = \lceil (w-2.585)/3 \rceil \le i \le w-1$. The MBE decomposition of $\psi$ is as follows.
$$\psi = \sum_{i=p/2}^{(w-1)/2} \beta(i) \tag{2}$$
where the encoded angle is $\beta(i) = \rho(i)\,2^{-i}$ with $\rho(i) \in \{-2,-1,0,1,2\}$. As $\sin\beta(i)$ and $\cos\beta(i)$ can be approximated by
$$\sin\beta(i) \cong \beta(i) = \rho(i)\,2^{-i} \tag{3}$$
This work was supported by the National Science Council of Taiwan under Grant NSC97-2221-E-216-044 and by Chung Hua University, Hsinchu, Taiwan, under Contract CHU-NSC97-2221-E-216-044.
Proceedings of the 8th WSEAS International Conference on Instrumentation, Measurement, Circuits and Systems
$$\cos\beta(i) \cong 1 - \frac{\beta^{2}(i)}{2} = 1 - \rho^{2}(i)\,2^{-(2i+1)} \tag{4}$$
we have
$$\begin{bmatrix} x(i+1) \\ y(i+1) \end{bmatrix} = \begin{bmatrix} 1-\rho^{2}(i)\,2^{-(2i+1)} & -\rho(i)\,2^{-i} \\ \rho(i)\,2^{-i} & 1-\rho^{2}(i)\,2^{-(2i+1)} \end{bmatrix} \begin{bmatrix} x(i) \\ y(i) \end{bmatrix} \tag{5}$$
$$z(i+1) = z(i) - \rho(i)\,2^{-i} \tag{6}$$
Fig. 2 shows the proposed architecture for the modified scaling-free CORDIC arithmetic, in which eight shifters, two CSAs, two CLAs, two latches, and four MUXs are used; the shifters and MUXs are used to determine $\rho(i)$.
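One iteration of equations (5) and (6) can be sketched as follows. This is a minimal floating-point illustration, not the hardware data path of Fig. 2; the function name `scaling_free_step` is an assumption, and multiplications by powers of two stand in for the shifters.

```python
import math

def scaling_free_step(x, y, z, rho, i):
    """Apply equations (5) and (6) once for a digit rho in {-2, -1, 0, 1, 2}."""
    beta = rho * 2.0 ** -i                        # encoded angle, equation (3)
    c = 1.0 - rho * rho * 2.0 ** -(2 * i + 1)     # cosine approximation, equation (4)
    return c * x - beta * y, beta * x + c * y, z - beta
```

For example, with rho = 2 and i = 6 the encoded angle is 2^-5 rad, and the step rotates (1, 0) to within about 5e-6 of (cos 2^-5, sin 2^-5) without any scale-factor compensation.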
2.2 The modified scaling-free radix-8 CORDIC Algorithm
By using the modified angle recoding method [12]-[13], the input angle $\psi$ can be divided as follows.
$$\psi = \sum_{i=p}^{w-1} \varphi(i)\,\tan^{-1} 2^{-i} \tag{7}$$
where $\varphi(i) \in \{0,1\}$, and $w$ is the word length. The CORDIC iteration is therefore represented as
$$\begin{bmatrix} x(i+1) \\ y(i+1) \end{bmatrix} = \begin{bmatrix} 1 & -\varphi(i)\,2^{-i} \\ \varphi(i)\,2^{-i} & 1 \end{bmatrix} \begin{bmatrix} x(i) \\ y(i) \end{bmatrix} \tag{8}$$
$$z(i+1) = z(i) - \varphi(i)\,\tan^{-1} 2^{-i} \tag{9}$$
Let $i = 3n - c$; $c \in \{0,1,2\}$. By using the Taylor series expansion, the absolute difference between $\tan^{-1}(2^{-(3n-c)})$ and $2^{c}\tan^{-1}(2^{-3n})$ is given by
$$\varsigma = \left| \tan^{-1} 2^{-(3n-c)} - 2^{c}\tan^{-1} 2^{-3n} \right| = \frac{1}{3}\cdot 2^{-3(3n-c)} + \Lambda \tag{10}$$
where $\Lambda$ denotes the remaining terms of the difference between $\tan^{-1}(2^{-(3n-c)})$ and $2^{c}\tan^{-1}(2^{-3n})$. Thus, we have
$$\varsigma \cong \frac{2^{-3(3n-c)}}{3} = \frac{2^{-3i}}{3} \tag{11}$$
For $w$-bit operands, $\varsigma$ can be ignored in the following sense
$$\varsigma \le 2^{-w} \tag{12}$$
Based on equations (11) and (12), we have
$$\frac{2^{-3i}}{3} \le 2^{-w} \tag{13}$$
$$3i + \log_2 3 \ge w \tag{14}$$
$$i \ge \frac{w - \log_2 3}{3} \;\Rightarrow\; i \ge \left\lceil \frac{w - \log_2 3}{3} \right\rceil \cong \frac{w}{3} \tag{15}$$
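The bound behind equations (10)-(15) can be checked numerically. The sketch below is an illustration only; the function names `merge_error` and `min_radix8_index` are assumptions.

```python
import math

def merge_error(n, c):
    """|atan(2^-(3n-c)) - 2^c * atan(2^-3n)|, the quantity in equation (10)."""
    i = 3 * n - c
    return abs(math.atan(2.0 ** -i) - (2 ** c) * math.atan(2.0 ** (-3 * n)))

def min_radix8_index(w):
    """Smallest integer i satisfying equation (14): 3i + log2(3) >= w."""
    return math.ceil((w - math.log2(3)) / 3)

# The error stays below the leading term 2^(-3i)/3 used in equation (11).
for n in range(1, 6):
    for c in range(3):
        assert merge_error(n, c) <= 2.0 ** (-3 * (3 * n - c)) / 3
```

For w = 16, `min_radix8_index(16)` evaluates to 5, which is close to the approximation w/3 in equation (15).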
As a result, when $i > w/3$, three consecutive terms of equation (7) can be integrated into a single term as follows:
$$\begin{aligned} &\varphi(3n-2)\tan^{-1}(2^{-(3n-2)}) + \varphi(3n-1)\tan^{-1}(2^{-(3n-1)}) + \varphi(3n)\tan^{-1}(2^{-3n}) \\ &\quad \cong \left(\varphi(3n-2)\cdot 2^{2} + \varphi(3n-1)\cdot 2^{1} + \varphi(3n)\cdot 2^{0}\right)\tan^{-1}(2^{-3n}) \\ &\quad = \phi(n)\tan^{-1}(2^{-3n}) \end{aligned} \tag{16}$$
where $\varphi(\cdot) \in \{0,1\}$, and therefore $\phi(n) \in \{0,1,2,\ldots,7\}$. It follows that the resulting radix-8 CORDIC algorithm is represented as
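The grouping in equation (16) amounts to reading three consecutive angle bits as one octal digit. A minimal sketch, with the hypothetical helper name `radix8_digits`:

```python
def radix8_digits(bits):
    """Group bits [phi(3n-2), phi(3n-1), phi(3n), ...] into radix-8 digits in 0..7.

    The list is ordered from the most significant bit and its length must be
    a multiple of three.
    """
    if len(bits) % 3:
        raise ValueError("expected a multiple of three bits")
    return [4 * a + 2 * b + c for a, b, c in zip(bits[0::3], bits[1::3], bits[2::3])]

# Bits for i = 4..9 (n = 2 and n = 3): both sides of (16) agree to first order.
bits = [1, 0, 1, 0, 1, 1]
digits = radix8_digits(bits)
lhs = sum(b * 2.0 ** -i for b, i in zip(bits, range(4, 10)))
rhs = sum(d * 2.0 ** (-3 * n) for d, n in zip(digits, (2, 3)))
```

Here `lhs` and `rhs` are the small-angle (linear) values of the two sides of equation (16); they coincide exactly, and the residual of the arctangent terms is the quantity bounded in equations (10)-(12).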
$$\begin{bmatrix} x(i+1) \\ y(i+1) \end{bmatrix} = K_8(i) \begin{bmatrix} 1 & -\phi(i)\cdot 2^{-3i} \\ \phi(i)\cdot 2^{-3i} & 1 \end{bmatrix} \begin{bmatrix} x(i) \\ y(i) \end{bmatrix} \tag{17}$$
$$z(i+1) = z(i) - \phi(i)\,\tan^{-1} 2^{-3i} \tag{18}$$
$$K_8(i) = \left(1 + \phi^{2}(i)\,2^{-6i}\right)^{1/2} \tag{19}$$
The scaling factor $K_8$ is given by
$$K_8 = \prod_{i=\lceil p/3 \rceil}^{w-1} K_8(i) \tag{20}$$
It can be shown that the scaling factor turns out to be equal to 1 when the input angle is less than $2^{-w/2}$; moreover, if the input angle is less than $2^{-w/3}$, equation (18) can be rewritten as [19]
$$z(i+1) = z(i) - \phi(i)\,2^{-3i} \tag{21}$$
Fig. 3 depicts the proposed architecture for the modified scaling-free radix-8 CORDIC arithmetic, in which six shifters, two CSAs, two CLAs, and two latches are used; the shifters and switches are used to determine the radices for computations. Note that the number of processors is reduced, and system throughput is increased at the cost of hardware complexity.
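A single scaling-free radix-8 step, with the scaling factor taken as 1 and the angle update of equation (21), can be sketched as follows. It is a floating-point illustration under those assumptions; the function name `radix8_step` is hypothetical.

```python
import math

def radix8_step(x, y, z, phi, i):
    """One radix-8 micro-rotation by phi * 2^(-3i), with phi in 0..7 and K8(i) taken as 1."""
    t = phi * 2.0 ** (-3 * i)
    return x - t * y, y + t * x, z - t

# Worst-case digit at i = 4: the rotation angle 7 * 2^-12 is below 2^-8 = 2^(-w/2) for w = 16.
x, y, z = radix8_step(1.0, 0.0, 7 * 2.0 ** -12, 7, 4)
```

In this example the deviation from the exact cosine and sine is about 1.5e-6, which is below the 16-bit resolution 2^-16, consistent with treating the scaling factor as 1 for such small angles.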
2.3 The proposed hybrid CORDIC Algorithm
The input angle $\Omega$ can be decomposed into a higher angle $\Omega_H$ and a lower angle $\Omega_L$ represented as
$$\Omega = \Omega_H + \Omega_L = \sum_{i=0}^{\lceil u/2 \rceil} \rho(i)\,2^{-2i} + \sum_{i=\lceil u/2 \rceil + 1}^{\lceil u/2 \rceil + \lceil (w-u)/3 \rceil - 1} \phi(i)\,2^{-\left(u+1+3\cdot\left(i-\lceil w/4 \rceil-1\right)\right)} \tag{22}$$
where $w$ is the word length with the first $u$ bits being the most significant bits; $\Omega_H$ and $\Omega_L$ are computed by using the modified scaling-free CORDIC algorithm and the modified scaling-free radix-8 CORDIC algorithm, respectively. For computational efficiency, $u$ is determined as follows: $u$ must be an odd number to satisfy the MBE method, and for a word length $w$ it is chosen as
$$u = \begin{cases} 2n+1, & \text{if } w = 4n+0 \\ 2n+1, & \text{if } w = 4n+1 \\ 2n+1, & \text{if } w = 4n+2 \\ 2n+3, & \text{if } w = 4n+3 \end{cases} \tag{23}$$
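Equation (23) maps directly onto a small helper. The function name `choose_u` is an assumption made for illustration.

```python
def choose_u(w):
    """Split point u of equation (23) for a word length w = 4n + r, r in 0..3."""
    n, r = divmod(w, 4)
    return 2 * n + 3 if r == 3 else 2 * n + 1
```

For the 16-bit design considered in Section 3, w = 4·4 + 0 and `choose_u(16)` gives u = 9; every value returned is odd, as the MBE method requires.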
Based on the above equation, the minimum iteration number of the proposed hybrid CORDIC algorithm can be obtained as shown in Fig. 4. The computations of $x(i)$ and $y(i)$ are therefore as follows.
For $\lceil p/2 \rceil \le i \le \lceil u/2 \rceil$,
$$x(i+1) = x(i) - \rho^{2}(i)\,2^{-(4i+1)}\,x(i) - \rho(i)\,2^{-2i}\,y(i) \tag{24}$$
$$y(i+1) = y(i) - \rho^{2}(i)\,2^{-(4i+1)}\,y(i) + \rho(i)\,2^{-2i}\,x(i) \tag{25}$$
For $\lceil u/2 \rceil < i \le \lceil w/4 \rceil + \lceil (w-u)/3 \rceil$,
$$x(i+1) = x(i) - \phi(i)\,2^{-\left(u+1+3\cdot\left(i-\lceil w/4 \rceil-1\right)\right)}\,y(i) \tag{26}$$
$$y(i+1) = y(i) + \phi(i)\,2^{-\left(u+1+3\cdot\left(i-\lceil w/4 \rceil-1\right)\right)}\,x(i) \tag{27}$$
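The two iteration types of equations (24)-(27) can be chained as in the following sketch for w = 16 and u = 9, with three Type A steps (i = 3, 4, 5) followed by two Type B steps (i = 6, 7). It is an illustration under several assumptions rather than the authors' implementation: the angle is given as an integer in units of 2^-16 rad and restricted to less than 2^-5 rad so that three Booth digits suffice, the helper names `booth_digits` and `hybrid_sincos` are hypothetical, and floating-point arithmetic replaces the shift-and-add hardware.

```python
import math

W, U = 16, 9  # word length and split point from equation (23)

def booth_digits(h):
    """Radix-4 modified Booth recoding of 0 <= h < 32 into three digits in {-2..2}.

    Returns d such that h == d[0] + 4 * d[1] + 16 * d[2].
    """
    bits = [(h >> k) & 1 for k in range(6)]   # bits[5] is 0 because h < 32
    below = [0] + bits                        # below[k] is the bit at position k - 1
    return [below[2 * k] + bits[2 * k] - 2 * bits[2 * k + 1] for k in range(3)]

def hybrid_sincos(a):
    """Approximate (sin, cos) of a * 2**-16 rad for 0 <= a < 2**11."""
    if not 0 <= a < 1 << 11:
        raise ValueError("this sketch only covers angles below 2**-5 rad")
    x, y = 1.0, 0.0
    rho = booth_digits(a >> 6)                # high part; its LSB has weight 2**-10
    for i in (3, 4, 5):                       # Type A steps, equations (24)-(25)
        r = rho[5 - i]                        # digit with weight 2**(-2 * i)
        c = 1.0 - r * r * 2.0 ** -(4 * i + 1)
        s = r * 2.0 ** -(2 * i)
        x, y = c * x - s * y, s * x + c * y
    for i in (6, 7):                          # Type B steps, equations (26)-(27)
        shift = U + 1 + 3 * (i - math.ceil(W / 4) - 1)   # 13, then 16
        phi = (a >> (W - shift)) & 7          # three angle bits per radix-8 digit
        t = phi * 2.0 ** -shift
        x, y = x - t * y, y + t * x
    return y, x
```

Within this restricted range the outputs stay within about 1e-5 of the exact sine and cosine; the residual is dominated by the sin β ≅ β approximation of the first Type A step.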
3 Hardware Implementation of the Proposed DDFS
In this section, the DDFS implemented by using the hybrid CORDIC algorithm is presented. Fig. 4 shows the 16-bit DDFS architecture consisting mainly of a phase accumulator, a phase calculator, and a sine/cosine generator, which is different from the conventional architecture. The accumulated error in the sine/cosine generator is corrected by using the 4×16-bit correction table. Taking into account DAC technology, hardware cost, and practical applications, the word length of the proposed DDFS is set to 16 bits.
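The hybrid CORDIC-based sine/cosine generator with recursively accumulated angle $\vartheta_{in}$ is given by
$$\begin{bmatrix} x(i+1) \\ y(i+1) \end{bmatrix} = \begin{bmatrix} \cos\vartheta_{in} & -\sin\vartheta_{in} \\ \sin\vartheta_{in} & \cos\vartheta_{in} \end{bmatrix} \begin{bmatrix} x(i) \\ y(i) \end{bmatrix} \tag{28}$$
where
$$\vartheta_{in} = \frac{2\pi}{2^{16}}\,\Delta_{acc} \tag{29}$$
and $\Delta_{acc}$ is an integer number.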
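The recursion of equation (28) can be illustrated as follows. In this sketch the rotation coefficients come from `math.cos` and `math.sin` as a stand-in for the CORDIC units, so it shows only the recursive structure; the function name `recursive_sincos` is an assumption.

```python
import math

def recursive_sincos(delta_acc, steps, phase_bits=16):
    """Generate (cos, sin) samples by repeatedly rotating (1, 0) by theta_in of equation (29)."""
    theta = 2.0 * math.pi * delta_acc / (1 << phase_bits)
    c, s = math.cos(theta), math.sin(theta)
    x, y = 1.0, 0.0
    samples = []
    for _ in range(steps):
        samples.append((x, y))
        x, y = c * x - s * y, s * x + c * y   # equation (28)
    return samples
```

Because every output is computed from the previous one, any per-step approximation error accumulates along the sequence, which is why a correction mechanism is needed in a finite-precision implementation.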
For convergence, the input angle of the scaling-free CORDIC algorithm is restricted as follows:
$$\vartheta_{in} < \sum_{i=4}^{w-1} 2^{-i} \cong \frac{1}{8} \tag{30}$$
From the above two equations, we have
$$\Delta_{acc} = \frac{\vartheta_{in}\cdot 2^{16}}{2\pi} < 1304 \tag{31}$$
The architecture for the sine/cosine generator is shown in Fig. 5, in which, according to equation (23), three modified scaling-free CORDIC arithmetic units (MCORDIC-Type A) and two modified scaling-free radix-8 CORDIC arithmetic units (MCORDIC-Type B) are used.
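The numbers in equations (30) and (31) can be reproduced with a few lines. The function name `max_phase_increment` is an assumption.

```python
import math

def max_phase_increment(w=16, phase_bits=16):
    """Return the angle limit of equation (30) and the matching bound on delta_acc."""
    limit = sum(2.0 ** -i for i in range(4, w))           # close to 1/8 rad
    return limit, limit * (1 << phase_bits) / (2.0 * math.pi)

limit, bound = max_phase_increment()
```

With w = 16 the limit is 2^-3 − 2^-15 rad and the bound is about 1303.5, in agreement with the integer bound of 1304 in equation (31).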
The chip is synthesized by using the TSMC 0.18 μm 1P6M CMOS cell libraries [14]. The layout view of the proposed DDFS is shown in Fig. 6. The core size obtained by the Synopsys® design analyzer is 612 × 612 μm². The power consumption obtained by PrimePower® is 6.05 mW at a clock rate of 100 MHz and a supply voltage of 1.8 V. The tuning latency is 8 clock cycles.
All the control signals are internally generated on-chip. The chip provides both high throughput and low gate count.
4 Performance Analysis of the Proposed DDFS
The number of correcting points versus the SFDRs with different (Fs/Fo)'s in the proposed DDFS is shown in Fig. 7. As a trade-off between hardware cost and system performance, the correcting circuit with 16 points is implemented in the proposed DDFS. Thus, the SFDR of the proposed DDFS is more than 84.4 dBc. Table I shows various comparisons of the proposed DDFS with other methods in [8] and [10]. As one can see, the proposed DDFS compares favorably in terms of SFDR, hardware cost, and power consumption.
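For readers unfamiliar with the metric, the SFDR of a sampled sinusoid can be estimated from its discrete Fourier transform as the ratio between the carrier bin and the largest remaining bin. The sketch below applies this to an ideal 16-bit quantized sine, not to the output of the proposed DDFS; the function names and the test signal parameters are assumptions, and a direct O(N²) DFT is used to avoid external dependencies.

```python
import cmath
import math

def sfdr_dbc(samples):
    """SFDR in dBc: carrier magnitude over the largest spur among bins 1 .. N/2 - 1."""
    n = len(samples)
    mags = []
    for k in range(1, n // 2):
        acc = sum(v * cmath.exp(-2j * math.pi * k * m / n) for m, v in enumerate(samples))
        mags.append(abs(acc))
    carrier = max(range(len(mags)), key=mags.__getitem__)
    spur = max(m for idx, m in enumerate(mags) if idx != carrier)
    if spur == 0.0:
        return math.inf
    return 20.0 * math.log10(mags[carrier] / spur)

def quantized_sine(n=256, cycles=5, bits=16):
    """Coherently sampled sine rounded to a signed integer of the given width."""
    amp = (1 << (bits - 1)) - 1
    return [round(amp * math.sin(2.0 * math.pi * cycles * m / n)) for m in range(n)]
```

Since each rounding error is at most half an LSB, no spur of this reference signal can exceed the carrier level minus roughly 90 dB, which gives a sense of scale for the 84.4 dBc reported above.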
5 Conclusion
The hybrid CORDIC-based multiplier-less DDFS architecture with small ROM and pipelined data path has been implemented. An SoC designed in 1P6M CMOS has been emulated on the Xilinx XC2V6000 FPGA. For the 16-bit DDFS, the SFDRs of the sine and cosine outputs of the proposed architecture are more than 84.4 dBc. Simulation results show that the hybrid CORDIC-based approach is superior to the traditional approach to the design and implementation of DDFS in terms of SFDR, power consumption, and hardware cost. The 16-bit DDFS is a reusable IP, which can be implemented in various processes with efficient use of hardware resources for trade-offs among performance, area, and power consumption.
References:
[1] J. Vankka, “Methods of mapping from phase to sine amplitude in direct digital frequency synthesis,” IEEE Proceedings of the Frequency Control Symposium, June 5-7 1996, pp.942-950.
Table I Comparison with previous works
Fig. 1 The conventional DDFS architecture [block diagram: phase increment, phase accumulator, sin/cos generator, digital-to-analog converter, low-pass filter]
[2] S. C. Yi, K. T. Lee, J. J. Chen, C. H. Lin, “A low power efficient direct digital frequency synthesizer based on new two-level lookup table,” IEEE Canadian Conference on Electrical and Computer Engineering 2006, May 2006, pp.963-966.
[3] D. A. Sunderland, R. A. Strauch, S. S. Wharfield, H. T. Peterson, C. R. Cole, “CMOS/SOS frequency synthesizer LSI circuit for spread spectrum communications,” IEEE Journal of Solid-State Circuits, Vol.19, No.4, August 1984, pp.497-506.
[4] H. T. Nicholas, H. Samueli, B. Kim, "The optimization of direct digital frequency synthesizer performance in the presence of finite word length effects," IEEE 42nd Annual Frequency Control Symposium, June 1-3 1988, pp.357-363.
[5] D. D. Caro, E. Napoli, A. G. M. Strollo, “Direct digital frequency synthesizers with polynomial hyperfolding technique,” IEEE Transactions on Circuits and Systems-II: Express Briefs, Vol.51, No.7, July 2004, pp.337-344.
[6] D. Fu, A. N. Willson, Jr., “A high-speed processor for digital sine/cosine generation and angle rotation,” in Proc. 32nd Asilomar Conf. Signals, Systems and Computers, Vol.1, 1998, pp.177-181.
[7] A. Torosyan, D. Fu, A. N. Willson, Jr., “A 300-MHz quadrature direct digital synthesizer/mixer in 0.25-μm CMOS,” IEEE Journal of Solid-State Circuits, Vol.38, No.6, June 2003, pp.875-887.
[8] A. Madisetti, A. Y. Kwentus, A. N. Willson, Jr., “A 100-MHz, 16-b, direct digital frequency synthesizer with a 100-dBc spurious-free dynamic range,” IEEE Journal of Solid-State Circuits, Vol.34, No.8, August 1999, pp.1034-1043.
[9] E. Grayver, B. Daneshrad, “Direct digital synthesis using a modified CORDIC,” IEEE International Symposium on Circuits and Systems (ISCAS '98). Vol.5, May 1998, pp.241-244.
[10] C. Y. Kang, E. E. Swartzlander, Jr., “Digit-pipelined direct digital frequency synthesis based on differential CORDIC,” IEEE Transactions on Circuits and Systems I: Regular Papers, Vol.53, No.5, May 2006, pp.1035-1044.
[11] T. Y. Sung, H. C. Hsin, “Design and simulation of reusable IP CORDIC core for special-purpose processors,” IET Computers & Digital Techniques, Vol.1, No.5, Sept. 2007, pp.581-589.
[12] Y. H. Hu, S. Naganathan, “An angle recoding method for CORDIC algorithm implementation,”
[13] T. B. Juang, S. F. Hsiao, M. Y. Tsai, “Para-CORDIC: parallel CORDIC rotation algorithm,” IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol.51, No.8, August 2004, pp.1515-1524.
[14] “TSMC 0.18 CMOS Design Libraries and Technical Data, v.3.2,” Taiwan Semiconductor Manufacturing Company, Hsinchu, Taiwan, and National Chip Implementation Center (CIC), National Science Council, Hsinchu, Taiwan, R.O.C., 2006.
CORDIC Based DDFS              Madisetti [8] 1999   Swartzlander [10] 2006   This work 2009
Process (μm)                   1.0                  0.13                     0.18
Core Area (mm²)                0.306                0.35                     0.375
Maximum Sampling Rate (MHz)    80.4                 1018                     100
Power Consumption (mW)         40.602               350                      6.056
Power Consumption (mW/MHz)     0.505                0.343                    0.06
SFDR (dBc)                     81                   90                       84.4
Output Resolution (bits)       16                   16                       16
Tuning Latency (clock cycles)  16                   --                       8
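The derived rows of Table I can be cross-checked from the reported figures. The sketch below uses only the numbers given in the table and in Section 3; the dictionary layout and the names `TABLE_I` and `power_per_mhz` are assumptions.

```python
# (power in mW, maximum sampling rate in MHz, reported mW/MHz) per design
TABLE_I = {
    "Madisetti [8]": (40.602, 80.4, 0.505),
    "Swartzlander [10]": (350.0, 1018.0, 0.343),
    "This work": (6.056, 100.0, 0.06),
}

def power_per_mhz(power_mw, rate_mhz):
    """Normalized power consumption in mW/MHz."""
    return power_mw / rate_mhz

core_area_mm2 = 0.612 * 0.612   # 612 x 612 um^2 core size from Section 3
```

The computed ratios agree with the tabulated values to within 0.001 mW/MHz, and the core area evaluates to about 0.3745 mm², matching the 0.375 mm² entry.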
Fig. 2 The proposed architecture of modified scaling-free CORDIC arithmetic for computing θH (MCORDIC-Type A) [block diagram: inputs xin, yin; shifters, MUXs, CSAs, CLAs, and latches; outputs xout, yout]
Fig. 3 The architecture of modified scaling-free radix-8 CORDIC arithmetic for computing θL (MCORDIC-Type B) [block diagram: inputs xin, yin; shifters, CSAs, CLAs, and latches; outputs xout, yout]
Fig. 4 The 16-bit DDFS architecture
Fig. 5 The architecture of sine/cosine generator (ϑin is an accumulated angle)
[Block labels recovered from Figs. 4 and 5: Phase Accumulator, Phase Calculator, Operation Radix Generator, Pipeline CORDIC Array, Correction Table (16*2), Sine/Cosine Generator, Accumulator, MUX, Reg; signals Δacc, ϑin, [15:12], Sin output, Cos output]
Fig. 6 The layout view of the proposed DDFS
Fig. 7 Plot of the number of correcting points versus SFDRs with different (Fs/Fo)'s [x-axis: Correction Points log2(N), 2 to 7; y-axis: SFDR (dBc), -220 to -60; curves: Fs/Fo = 32768, 16384, 128; data markers at X = 4: Y = -89.69, -168.6, -84.42]