Experimental Results - Fast Lock-In All-Digital Phase-Locked Loop Design

Chapter 3 Fast Lock-In All-Digital Phase-Locked Loop Design

3.5 Experimental Results

The proposed ADPLLs are designed and implemented with a 0.13µm CMOS standard cell library and a cell-based design flow. The proposed architecture is therefore modeled in a hardware description language (HDL) and functionally verified using the NC-Verilog simulator. In addition, the timing-critical circuits, including the DCO, PFD, and TDC, are verified at the transistor level with HSPICE.

To achieve high performance and low power, the proposed binary search ADPLL and TDC-based ADPLL use the ultra-low power DCO as described in Chapter 2.

The simulation of the binary search ADPLL is shown in Fig. 3.10. The frequency of the reference clock is 20MHz and the division ratio is 10, so the frequency of the ADPLL output clock is 200MHz (= 20MHz × 10). When the ADPLL controller receives the “lead” or “lag” signal from the PFD, the DCO control code is decreased or increased, respectively, and the DCO frequency changes accordingly.

Fig. 3.11: Transient response of TDC-based ADPLL.

In Fig. 3.10, both the tracking DCO control code and the average DCO control code converge to a stable value, which completes the lock.

Fig. 3.11 shows the transient response of the proposed TDC-based ADPLL, where the reference clock is 20MHz and the division ratio (M) is 10, so the output frequency is 200MHz (= 20MHz × 10). The TDC takes 2 reference clock cycles to complete the coarse lock-in operation and 27 cycles to align the phase. After the TDC operation is completed, the DCO control code is adjusted according to the PFD output to generate the desired DCO output frequency. As shown in Fig. 3.11, the DCO control code converges to a stable value, which completes the lock. The simulation results show that the power consumption is 250µW at 200MHz and 1.2V.
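The cycle counts above translate directly into a lock time. The following is a minimal arithmetic sketch, using only the figures quoted in the text (20MHz reference, division ratio 10, 2 coarse cycles, 27 phase-alignment cycles); the function and variable names are illustrative and not taken from the design.

```python
def lock_time(ref_hz, coarse_cycles, phase_cycles):
    """Return (total reference cycles, lock time in seconds)."""
    total_cycles = coarse_cycles + phase_cycles
    return total_cycles, total_cycles / ref_hz

REF_HZ = 20e6          # reference clock from the text
DIVISION_RATIO = 10    # division ratio M from the text

cycles, t_lock = lock_time(REF_HZ, coarse_cycles=2, phase_cycles=27)
out_hz = REF_HZ * DIVISION_RATIO

print(f"lock-in: {cycles} reference cycles = {t_lock * 1e6:.2f} us")
print(f"output clock: {out_hz / 1e6:.0f} MHz")
```

With these numbers the TDC-based lock-in takes 29 reference cycles, which is 1.45µs at a 20MHz reference.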

3.6 Summary

In this chapter, the binary search algorithm and the proposed TDC-based ADPLL have been presented. Because the novel 2-level flash TDC reduces the locking time of the TDC-based ADPLL to 29 input clock cycles, the design is well suited to fast lock-in applications. The 2-level architecture also reduces the hardware cost of the proposed TDC significantly. In addition, since the proposed ADPLLs are described entirely in HDL, they can be ported to different processes, which makes them well suited to system-level and SoC applications.


Chapter 4 All Digital Spread Spectrum Clock Generator Design

4.1 Introduction

As the operating frequency of electronic systems increases, electromagnetic interference (EMI) becomes a serious problem, especially in consumer electronics, microprocessor (µP) based systems, and data transmission circuits [26].

The radiated emissions of a system should be kept below an acceptable level to ensure the functionality and performance of the system and adjacent devices [26], [27]. Many approaches have been proposed to reduce EMI, such as shielding boxes, slew-rate control, and spread spectrum clock generators (SSCGs). Among these, the SSCG has a lower hardware cost than the other approaches. As a result, the SSCG has become the most popular EMI reduction technique for System-on-Chip (SoC) applications [6], [27]-[28].

Recently, different architectural solutions have been developed to implement SSCGs. In [28], [29], a triangular modulation scheme that modulates the control voltage of a voltage-controlled oscillator (VCO) is proposed to provide good EMI reduction. However, it requires a large loop filter capacitor to pass the modulated signal in the phase-locked loop (PLL), which increases the chip area or requires an off-chip capacitor. Modulation of the PLL loop divider is another important SSCG type; it uses a fractional-N PLL with a delta-sigma modulator to spread the output frequency by changing the divider ratio in the PLL [30], [31].

However, the fractional-N type SSCG not only needs a large loop capacitor to filter the quantization noise from the divider, but also raises stability issues in applications with a wide frequency spreading ratio, especially PC-related applications [31].

In contrast, the all-digital SSCG (ADSSCG) [32], [33] does not use any passive components and relies on digital design approaches, so it can be easily integrated into digital systems. However, the delay-line type ADSSCG [32] does not provide a programmable spreading ratio and needs an extra PLL for frequency multiplication. The triangular modulation ADSSCG [33] has poor phase tracking capability, which results in loss of lock and stability issues. Moreover, it uses a digitally controlled oscillator (DCO) with non-monotonic delay, which is not suitable for SSCG applications. Thus, in this chapter, a portable, low-power ADSSCG with a programmable spreading ratio and a monotonic DCO is presented.

The proposed ADSSCG employs a novel rescheduling division triangular modulation (RDTM) to enhance the phase tracking capability and provide a wide programmable spreading ratio. The proposed low-power DCO with an auto-adjustment algorithm reduces power consumption while keeping the delay characteristic monotonic.

This chapter is organized as follows. Section 4.2 describes the proposed architecture and the spread spectrum algorithm of the ADSSCG. Section 4.3 focuses on the low-power DCO design and the auto-adjustment algorithm for a monotonic delay characteristic. Section 4.4 presents the implementation and measurement results of the fabricated ADSSCG chip. Finally, Section 4.5 gives a brief summary.


4.2 The Proposed ADSSCG Design

4.2.1 ADSSCG Architecture Overview

Fig. 4.1 illustrates the architecture of the proposed ADSSCG. It consists of five major functional blocks: a phase/frequency detector (PFD), an ADSSCG controller, a DCO, and two frequency dividers. The ADSSCG controller consists of a modulation controller, a loop filter, and a DCO code generator (DCG). The ADSSCG can provide the clock signal with or without the spread-spectrum function, depending on the setting of the operation mode signal (MODE).

In the normal operation mode, the bang-bang PFD detects the phase and frequency difference between FIN_M and DCO_N. When the loop filter receives LEAD from the PFD, the DCG adds the current search step (S_N[15:0]) to the DCO control code, which decreases the output frequency of the DCO. Conversely, when the loop filter receives LAG from the PFD, the DCG subtracts the current search step from the DCO control code to increase the output frequency of the DCO. When the PFD output changes from LEAD to LAG or vice versa, the loop filter sends the code-loading signal (LOAD) to the DCG to load the baseline code (BASELINE CODE[17:0]), which is the DCO control code averaged by the loop filter.

Before the ADSSCG enters the spread spectrum operation mode, the baseline frequency is stored as the center frequency. In the spread spectrum operation mode, the modulation controller uses two spreading control signals (SEC_SEL[2:0] and STEP[2:0]) to generate the add/subtract signal (+/-_SS) and the spreading step (S_SS[15:0]) for the DCG, and the DCO control code is then modulated to spread the DCO output frequency evenly around the center frequency.
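The normal-mode code update can be illustrated with a small behavioral sketch. It models only the LEAD/LAG add/subtract rule on an 18-bit code. The saturation at the code limits, the halving of the search step on each polarity change, and the toy "DCO is too fast when the code is below a target" comparison are assumptions made for illustration, and the LOAD/baseline averaging is left out because the text does not describe how the average is formed.

```python
CODE_BITS = 18  # DCO CODE[17:0] in Fig. 4.1

def dcg_update(code, step, pfd_out):
    """Apply one normal-mode update: LEAD adds the step, LAG subtracts it."""
    if pfd_out == "LEAD":
        code += step      # longer delay, lower DCO frequency
    elif pfd_out == "LAG":
        code -= step      # shorter delay, higher DCO frequency
    else:
        raise ValueError("pfd_out must be 'LEAD' or 'LAG'")
    # Assumption: saturate at the ends of the 18-bit code range.
    return max(0, min((1 << CODE_BITS) - 1, code))

def simulate_lock(target_code, start_code=0, start_step=1 << 15, iterations=200):
    """Toy loop: the DCO is treated as too fast whenever code < target_code."""
    code, step, last = start_code, start_step, None
    history = []
    for _ in range(iterations):
        pfd_out = "LEAD" if code < target_code else "LAG"
        if last is not None and pfd_out != last:
            step = max(1, step // 2)  # assumption: halve the step on a polarity change
        code = dcg_update(code, step, pfd_out)
        history.append(code)
        last = pfd_out
    return history

history = simulate_lock(target_code=100_000)
print(history[-4:])  # the code ends up dithering next to the target
```

In this toy model the code settles into a one-LSB dither around the target, which is the usual steady state of a bang-bang loop.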


The system clock of the ADSSCG controller is FIN_M, whose operating frequency is limited by the closed-loop response time of the ADSSCG. This response time is determined by the response time of the DCO and the delay times of the ADSSCG controller and the frequency divider.

Therefore, the period of FIN_M should not be shorter than the shortest response time, to ensure the functionality and performance of the ADSSCG. In addition, because the frequency of DCO_N should equal that of FIN_M after the system locks, the frequency of FIN_M cannot be higher than the maximum frequency or lower than the minimum frequency of DCO_N. As a result, the frequency range of FIN_M is also limited by the DCO operating range and the divider ratio (N).
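The two constraints on FIN_M can be written as a simple check. This is a sketch under stated assumptions: the DCO range of 27MHz to 54MHz is the target range quoted later in the chapter, while the divider ratio N = 4 and the 50ns closed-loop response time are hypothetical values chosen only to exercise the function.

```python
def fin_m_range_hz(dco_min_hz, dco_max_hz, n):
    """Range of FIN_M that DCO_N = DCO / N can match once locked."""
    return dco_min_hz / n, dco_max_hz / n

def fin_m_is_valid(fin_m_hz, dco_min_hz, dco_max_hz, n, loop_response_s):
    """True if FIN_M is inside the DCO_N range and slower than the loop response."""
    lo, hi = fin_m_range_hz(dco_min_hz, dco_max_hz, n)
    period_ok = (1.0 / fin_m_hz) >= loop_response_s
    return lo <= fin_m_hz <= hi and period_ok

DCO_MIN_HZ, DCO_MAX_HZ = 27e6, 54e6   # target DCO range from the chapter
N = 4                                  # hypothetical feedback divider ratio
LOOP_RESPONSE_S = 50e-9                # hypothetical closed-loop response time

print(fin_m_range_hz(DCO_MIN_HZ, DCO_MAX_HZ, N))
print(fin_m_is_valid(10e6, DCO_MIN_HZ, DCO_MAX_HZ, N, LOOP_RESPONSE_S))
print(fin_m_is_valid(20e6, DCO_MIN_HZ, DCO_MAX_HZ, N, LOOP_RESPONSE_S))
```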

4.2.2 Spread Spectrum Algorithm

Since triangular modulation is easy to implement and performs well in reducing radiated emissions, it has become the major modulation method for SSCGs [6], [28]. In triangular modulation, the EMI attenuation depends on the frequency-spreading ratio and the center frequency, and it can be formulated as

Fig. 4.1: Architecture of the proposed ADSSCG. (Blocks: Pre-Divider (M[7:0]), PFD, ADSSCG Controller with DCO Code Generator (DCG), DCO, Feedback Divider (N[7:0]). Signals: FIN, FIN_M, DCO_N, LEAD, LAG, DCO CODE[17:0], BASELINE CODE[17:0], DCO OUTPUT.)

$$A_{dB} = I + J\log(SR/100) + K\log(F_C) \tag{4.1}$$

where $A_{dB}$ is the EMI attenuation, $SR$ is the frequency spreading ratio, $F_C$ is the center frequency, and $I$, $J$, $K$ are modulation parameters [27]. Based on (4.1), under the same center frequency, EMI can be reduced further by increasing the spreading ratio.

Fig. 4.2: (a) Conventional triangular modulation. (b) Division triangular modulation. (c) Rescheduling division triangular modulation. (Each panel spans the period range Tmin to Tmax.)

In addition, under the same spreading ratio, the higher center frequency has better EMI attenuation performance.
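As a short worked step, the two statements above follow from subtracting (4.1) at two operating points. The sketch below assumes that the parameters I, J, K and the logarithm base are the same at both points; the subscripts 1 and 2 are introduced here only to label the two points.

```latex
\begin{aligned}
A_{dB,2} - A_{dB,1}
  &= J\left[\log\frac{SR_2}{100} - \log\frac{SR_1}{100}\right]
   + K\left[\log F_{C,2} - \log F_{C,1}\right] \\
  &= J\log\frac{SR_2}{SR_1} + K\log\frac{F_{C,2}}{F_{C,1}}
\end{aligned}
```

At a fixed center frequency the second term vanishes, so the attenuation changes by $J\log(SR_2/SR_1)$. At a fixed spreading ratio the first term vanishes, so it changes by $K\log(F_{C,2}/F_{C,1})$. The constant $I$ and the factor 100 cancel in both cases.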

Fig. 4.2(a) illustrates the conventional triangular modulation implemented with a digital approach [33]. Since the output frequency is set by the DCO control code, the output clock frequency can be spread by tuning the DCO control code along a triangular profile within one modulation cycle. The conventional spread spectrum starts at the center frequency (period Tc), takes one-fourth of the modulation cycle to reach the minimum frequency (period Tmax), and then takes half of the modulation cycle to reach the maximum frequency (period Tmin). Finally, it returns to the center frequency in the last one-fourth of the modulation cycle.

Because the upper and lower halves of the triangle have the same area, as shown in Fig. 4.2(a), the mean frequency of the spread clock equals the center frequency, and the phase drift is zero at the end of each modulation cycle.
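The equal-area argument can be checked numerically on a discretized triangle. The sketch below assumes that the period offset is proportional to the DCO code offset and expresses offsets in units of one step; the 16-point discretization and the function name are illustrative choices rather than details of the design in [33].

```python
from itertools import accumulate

def triangular_offsets(quarter_steps):
    """Code offsets (in steps) for one conventional triangular modulation cycle."""
    up = list(range(1, quarter_steps + 1))                         # center -> Tmax
    down = list(range(quarter_steps - 1, -quarter_steps - 1, -1))  # Tmax -> Tmin
    back = list(range(-quarter_steps + 1, 1))                      # Tmin -> center
    return up + down + back

offsets = triangular_offsets(4)
drift = list(accumulate(offsets))  # running sum of period offsets = phase drift

print(offsets)
print("drift at end of cycle:", drift[-1])
print("largest drift inside the cycle:", max(drift))
```

The running sum returns to zero at the end of the cycle, which is the zero-phase-drift property, while it is nonzero inside the cycle.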

However, in the conventional triangular modulation, the ADSSCG controller can only perform phase and frequency maintenance based on the PFD output at the end of each modulation cycle. Hence, because of the frequency error between the reference clock and the output clock, reference clock jitter, and supply noise, the phase error accumulates within one modulation cycle, which can lead to loss of lock and stability problems.

Thus, in order to enhance the phase tracking ability, the division triangular modulation (DTM) is proposed, as shown in Fig. 4.2(b). DTM divides one modulation cycle into many sub-sections (for example, 16 sub-sections in Fig. 4.2(b)) and updates the DCO control code for phase tracking every 4 sub-sections. As a result, the ADSSCG controller can perform phase and frequency maintenance four times in one modulation cycle when the modulation cycle is divided into 16 sub-sections. Because DTM provides the frequency spreading function while keeping phase tracking, it is well suited to ADSSCGs in µP-based system applications. However, the disadvantage of DTM is that switching between sub-sections induces large DCO control code fluctuations (7S), as shown in Fig. 4.2(b), where S is the spreading step of the DCO control code in spreading modulation.

In order to reduce the peak-to-peak change of the DCO control code in DTM, the rescheduling DTM (RDTM) is proposed, as shown in Fig. 4.2(c). By reordering the sub-sections in DTM, the peak-to-peak change of the DCO control code can be reduced to 5S. As a result, the peak-to-peak cycle-to-cycle jitter is reduced while the period jitter stays the same. Compared with DTM, the reduction ratio of the peak-to-peak jitter achieved by RDTM depends on the number of sub-sections, and it can be formulated as

where JR is the jitter reduction ratio and COUNT is the number of sub-sections. For example, with 16 sub-sections the jitter reduction ratio is 29% ((7-5)/7), and with 32 sub-sections it is 40% ((15-9)/15). Although RDTM reschedules the order of the DTM sub-sections to reduce the peak cycle-to-cycle jitter, the average cycle-to-cycle jitter remains the same as in DTM.
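The two quoted percentages can be reproduced from the peak values given in the text. The sketch below takes the peak code changes (in units of S) as inputs rather than deriving them from COUNT, since only the two worked examples are available here; the function name and the lookup table are illustrative.

```python
def jitter_reduction_ratio(peak_dtm, peak_rdtm):
    """Relative reduction of the peak cycle-to-cycle jitter from DTM to RDTM."""
    return (peak_dtm - peak_rdtm) / peak_dtm

# Peak DCO code change in units of S, as quoted in the text: COUNT -> (DTM, RDTM)
PEAKS = {16: (7, 5), 32: (15, 9)}

for count, (dtm, rdtm) in PEAKS.items():
    jr = jitter_reduction_ratio(dtm, rdtm)
    print(f"COUNT={count}: JR = {round(100 * jr)}%")
```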

Besides, because the phase drifts in opposite directions in DTM and RDTM are equal, the equivalent phase drift is zero. As a result, neither scheme induces extra phase drift, and the mean frequency remains the same. The frequency spreading results of DTM and RDTM are the same as those of the conventional triangular modulation.

Table 4.1 summarizes the jitter and timing comparisons of DTM and RDTM with 16 sub-sections within one modulation cycle.

With two control signals, the spreading step (S) and the number of sub-sections (COUNT), the proposed RDTM can provide a flexible spreading ratio for different system requirements. The spreading step is the difference in DCO control code between two consecutive sub-sections. The number of sub-sections determines how many sub-sections make up one modulation cycle. COUNT and S are decoded from SEC_SEL and STEP by the modulation controller, respectively. Based on these definitions, the frequency-spreading ratio equation can be given as

Table 4.1: Jitter and Timing Comparisons of DTM and RDTM

                                  DTM    RDTM
Cycle-to-Cycle Jitter (S)         (1+1+1+2+1+3+1+4+
Peak Cycle-to-Cycle Jitter (S)    7      5

*: S times Period of Sub-Section

where SR is the spreading ratio, RES is the finest time resolution of the DCO, and T_C is the center period of the DCO output clock. As a result, the frequency-spreading ratio of the proposed ADSSCG can be specified easily by the control signals.

4.3 DCO Design

4.3.1 DCO Architecture

Because the digitally controlled oscillator (DCO) accounts for over 50% of the power consumption in all-digital clocking circuits, the proposed ADSSCG uses the low-power DCO structure described in Chapter 2 to reduce the overall power consumption [34]. To make the proposed ADSSCG highly portable, all of its components, including the DCO, are implemented with standard cells.

Fig. 4.3(a) (labels: coarse-tuning stage; 2nd fine-tune, 32 long-delay DCVs; 3rd fine-tune, 8 short-delay DCVs)

Fig. 4.3(a) illustrates the architecture of the proposed low-power DCO, which cascades one coarse-tuning stage and three fine-tuning stages to achieve a fine frequency resolution and a wide operation range. As the number of delay cells selected in the coarse-tuning stage increases, the propagation delay becomes longer and the operating frequency of the DCO becomes lower. The shortest delay path, which consists of one NAND gate, one path MUX of the coarse-tuning stage, and the fine-tuning stages at their minimum delay, determines the highest operation frequency of the DCO. There are 2^C different delay paths in the coarse-tuning stage, and only one path is selected by the 2^C-to-1 path selector MUX controlled by the C-bit DCO control code. The coarse-tuning delay cell uses a two-input AND gate, which can be disabled when the DCO operates at high frequency to save power.

To increase the frequency resolution of the DCO, three fine-tuning stages controlled by the F-bit DCO control code are added to the DCO design. The 1st fine-tuning stage is composed of X hysteresis delay cells (HDCs), each of which contains one inverter and one tri-state inverter, as shown in Fig. 4.3(b). When the tri-state inverter in an HDC is enabled, its output exhibits hysteresis, which increases the delay [34]. Different digitally controlled varactors (DCVs) are exploited in the 2nd and 3rd fine-tuning stages to further improve the overall resolution of the DCO, as shown in Fig. 4.3(b). The operating concept of a DCV is to control the gate capacitance of a logic gate through the state of its enable signal in order to adjust the delay time. The 2nd and 3rd fine-tuning stages employ Y long-delay DCV cells and Z short-delay DCV cells, respectively. Since the HDCs can replace many DCV cells to obtain a wider operation range, the number of delay cells connected to each driving buffer and the loading capacitance can be reduced, which saves both power consumption and gate count.

Table 4.2: Simulation Results of Delay of Tuning Stage

                                 Coarse-Tuning   1st Fine-Tuning   2nd Fine-Tuning   3rd Fine-Tuning
Controllable Delay Range (ps)    61812           308.45            121.58            7.73
Finest Delay Step (ps)           242.41          102.82            3.92              1.1

Based on an in-house µP-based system for liquid crystal display (LCD) controller applications [35], the requested operating frequency is from 27MHz to 54MHz. Thus, the design parameters of the proposed DCO are determined as follows: C=8, F=10, X=4, Y=32, and Z=8. Table 4.2 shows the controllable delay range and the finest delay step of each tuning stage in the proposed DCO under the typical case (typical corner, 1.8V, 25°C). It should be noted that the controllable delay range of each stage is larger than the finest delay step of the previous stage. As a result, the cascaded DCO structure does not have any dead zone larger than the LSB resolution of the DCO. Since the finest delay step of the 3rd fine-tuning stage determines the overall resolution, the proposed DCO achieves a resolution as fine as 1.1ps.
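The no-dead-zone condition can be checked directly against the Table 4.2 values. The sketch below simply compares each stage's controllable range with the finest step of the stage before it; the stage labels and function name are illustrative, and the numbers are the typical-case simulation results from the table.

```python
# (stage, controllable delay range in ps, finest delay step in ps), from Table 4.2
STAGES = [
    ("coarse", 61812.0, 242.41),
    ("fine1", 308.45, 102.82),
    ("fine2", 121.58, 3.92),
    ("fine3", 7.73, 1.1),
]

def coverage_margins(stages):
    """For each stage after the first, return (stage, covers_previous_step, margin_ps)."""
    results = []
    for (_, _, prev_step), (name, rng, _) in zip(stages, stages[1:]):
        results.append((name, rng > prev_step, rng - prev_step))
    return results

for name, ok, margin in coverage_margins(STAGES):
    print(f"{name}: covers previous step = {ok}, margin = {margin:.2f} ps")
```

Every margin is positive, so each finer stage can span one step of the stage before it. The same overlap means that a code change across a stage boundary does not automatically produce a monotonic delay change.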

4.3.2 Auto-Adjustment Algorithm for Monotonic DCO

As mentioned in the previous section, the DCO control code is changed to obtain different output periods in spread spectrum applications, so the monotonic characteristic of the DCO is very important. Because the controllable delay range of each stage must be larger than the finest delay step of the previous stage, a non-monotonic problem occurs when the DCO code switches at the boundary between tuning stages. To eliminate such non-ideal effects, an auto-adjustment algorithm for boundary code switching is proposed. Fig. 4.4 is the flowchart of the proposed algorithm. When the DCO code crosses the boundary between tuning stages, the ADSSCG controller automatically adds an extra compensation code to, or subtracts it from, the original code to reduce the delay difference caused by switching tuning stages, which eliminates the non-monotonic issue. According to the simulation results of the proposed DCO under different process-voltage-temperature (PVT) conditions, the extra compensation codes for crossing the coarse/1st fine, 1st/2nd fine, and 2nd/3rd fine-tuning stage boundaries are defined as 320, 48, and 4, respectively. For example, when the last four bits of the DCO code (one bit for the 2nd fine-tuning stage and the last three bits for the 3rd fine-tuning stage) change from (0111)₂ to (1000)₂, the delay should ideally increase by 1.1ps, but it decreases by 3.78ps instead (from 7.7ps to 3.92ps, the delay of one 2nd fine-tuning cell). With the auto-adjustment algorithm, the code is adjusted from (1000)₂ to (1100)₂. As a result, the delay increases by 0.62ps, and the DCO operates monotonically, as shown in Fig. 4.5.

Fig. 4.4: Flowchart of auto-adjustment algorithm.

Fig. 4.5: Comparison between original and adjusted timing. (Axes: DCO Control Code, 9777 to 9787; Delay (ps), 5050 to 5060.)
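The worked example at the 2nd/3rd fine-tuning boundary can be reproduced with a small delay model of the last four code bits. This is a sketch under stated assumptions: each stage is treated as perfectly linear, with the 3.92ps and 1.1ps steps from Table 4.2, and only the increasing-code direction is modeled because the text gives numbers for that case only. The function names are illustrative and are not taken from the flowchart in Fig. 4.4.

```python
FINE2_STEP_PS = 3.92        # finest step of the 2nd fine-tuning stage (Table 4.2)
FINE3_STEP_PS = 1.1         # finest step of the 3rd fine-tuning stage (Table 4.2)
COMPENSATION_2ND_3RD = 4    # compensation code for the 2nd/3rd boundary (from the text)

def low_nibble_delay_ps(code4):
    """Delay contributed by the last four code bits, assuming linear stages."""
    fine2_bit = (code4 >> 3) & 1   # one bit of the 2nd fine-tuning stage
    fine3_bits = code4 & 0b111     # three bits of the 3rd fine-tuning stage
    return fine2_bit * FINE2_STEP_PS + fine3_bits * FINE3_STEP_PS

def adjust_on_upward_crossing(old_code4, new_code4):
    """Add the compensation code when an increasing code crosses the boundary."""
    crossed = (old_code4 >> 3) != (new_code4 >> 3)
    if crossed and new_code4 > old_code4:
        return new_code4 + COMPENSATION_2ND_3RD
    return new_code4

old, new = 0b0111, 0b1000
adjusted = adjust_on_upward_crossing(old, new)

raw_change = low_nibble_delay_ps(new) - low_nibble_delay_ps(old)
adjusted_change = low_nibble_delay_ps(adjusted) - low_nibble_delay_ps(old)
print(f"adjusted code: {adjusted:04b}")
print(f"delay change without adjustment: {raw_change:+.2f} ps")
print(f"delay change with adjustment:    {adjusted_change:+.2f} ps")
```

Under these assumptions the unadjusted step changes the delay by -3.78ps and the adjusted step by +0.62ps, matching the figures quoted in the text.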

4.4 Experimental Results and Comparisons

Based on the requested operating frequency for in-house µP-based system and LCD controller applications [35], the proposed ADSSCG should generate an output clock ranging from 27MHz to 54MHz. The proposed ADSSCG is designed and implemented with a cell-based design flow, so the proposed architecture and spread spectrum algorithm are modeled in a hardware description language (HDL) and functionally verified using the NC-Verilog simulator. In addition, the DCO performance is verified at the transistor level with HSPICE. Because the proposed ADSSCG is implemented with standard cells, the physical layout is generated by an automatic placement and routing (APR) tool.

Fig. 4.6: Microphotograph of ADSSCG test chip.

Fig. 4.7: Measured spectrum at 54MHz (a) without frequency spreading and (b) with 1% frequency spreading.

Fig. 4.8: Measured spectrum at 27MHz (a) without frequency spreading and (b) with 10% frequency spreading.

A test chip has been fabricated in a 0.18µm 1P6M CMOS process with an area of 0.156mm²; the chip microphotograph is shown in Fig. 4.6. The ADSSCG output signal is measured with an Agilent E4440A spectrum analyzer at 1.8V/25°C. The input clock frequency ranges from 13.5MHz to 27MHz. The total current consumption is 0.69mA at 54MHz. Fig. 4.7 shows that the peak power is reduced by 9.5dB at 54MHz with a 1% spreading ratio, and Fig. 4.8 shows that the peak power is reduced by 15dB at 27MHz with a 10% spreading ratio. Figs. 4.7 and 4.8 thus show that EMI can be reduced at the maximum and minimum operation frequencies of the proposed design, respectively. Because RDTM is a kind of triangular modulation, some peaks appear in the spectrum [27]. For
