Chapter 1 Introduction

1.3 Dissertation Organization

This dissertation is organized as follows. Chapter 2 describes the architecture and circuit of the proposed high-resolution, ultra-low-power DCO. The proposed DCO and hysteresis delay cell (HDC) can be applied to the clock generators presented in the subsequent chapters. In Chapter 3, the general binary-search-based ADPLL is discussed, and the proposed TDC-based ADPLL for fast lock-in is presented. Chapter 4 focuses on the proposed ADSSCG, which employs a novel rescheduling division triangular modulation (RDTM) to enhance the phase tracking capability and provide a wide programmable spreading ratio; an auto-adjustment algorithm for a monotonic delay characteristic is also proposed. In Chapter 5, the proposed tunable phase shift scheme based on ADPLL for DDR controller applications is presented. Chapter 6 describes the proposed ADSMD, which employs a delay-matching structure and a high-resolution delay cell to achieve small static phase error, and an edge-trigger mirror delay cell to extend the input duty cycle range. Finally, conclusions and future work are given in Chapter 7.


Chapter 2

Low-Power Digitally Controlled Oscillator with Hysteresis Delay Cell

2.1 Introduction

The digitally controlled oscillator (DCO) and the digitally controlled delay line (DCDL) are the most important modules in the ADPLL/ADSSCG and the ADDLL/ADSMD, respectively. The delay cells are used to construct a ring oscillator in the ADPLL/ADSSCG and a delay line in the ADDLL/ADSMD. In this chapter, the high-performance, low-power delay cell is described first, followed by the DCO architecture built with the proposed delay cell.

Basically, the DCO dominates the major performance metrics of all-digital clock generators, such as power consumption and jitter, and hence is the most important component of such clocking circuits [1], [10]-[14]. In terms of power, the DCO accounts for over 50% of the power consumption of an all-digital clock generator [1]. For example, the DCO accounts for 59% of the power consumption of an all-digital phase-locked loop (ADPLL), as shown in Fig. 2.1. As a result, the power consumption of the DCO should be reduced further to lower the overall power dissipation and meet low-power demands in SoC designs. Besides, the resolution of the DCO has a large influence on the jitter performance and on the frequency or phase error of the output clock. Furthermore, if the DCO can provide a wide operating frequency range, it extends the output frequency range of the all-digital clock generator and thus broadens its applications.

Recently, different architectural solutions have been proposed to implement the DCO. The current-starved DCO [15] controls the supply current of the delay cell to obtain different delay values. Although it has high resolution, it needs a static current source, which dissipates static power. The LC-tank DCO [16] can also achieve high delay resolution; however, it needs an advanced process and intensive circuit layout effort. These approaches demand high complexity at the circuit level, resulting in a long design cycle and low portability.

In order to shorten the design cycle when the process or specification changes, many DCOs implemented with standard cells have been proposed to enhance portability [1], [11], [17], [18]. Driving capability modulation (DCM) changes the driving current of each delay cell by controlling the number of enabled tri-state buffers/inverters [1], [17]. The design concept of this approach is straightforward, but it performs poorly in linearity and power consumption, and its resolution is insufficient.

Fig. 2.1: Power profiling of ADPLL (DCO 59%, controller 31%, phase/frequency detector (PFD) 10%).

Or-and-inverter (OAI) cells have been proposed to enhance resolution through different input pattern combinations; however, the linearity problem remains unsolved [11]. Although the digitally controlled varactor (DCV) performs well in resolution and linearity [18], a small number of DCV cells cannot provide a wide operation range. As a result, many DCV cells are needed to maintain an acceptable operation range, which leads to large power consumption. A brief summary of the different DCO approaches is given in Table 2.1.

Thus, we propose a low-power, high-resolution, and wide-range DCO with high portability. Because our research targets general microprocessor-based (µP-based) systems and communication baseband processors, the operating frequency range of the proposed DCO should be easily extendable, and the maximum operating frequency of the DCO is not expected to exceed 1GHz. In addition, the power-saving target is an order-of-magnitude reduction relative to conventional works while keeping high delay resolution. However, because we aim at a cell-based DCO design, overcoming the limitations of standard cells to build such a low-power, high-resolution, and wide-range DCO is the main design challenge of this research.

Table 2.1: Comparisons of Different DCO Approaches

| Performance Indices | Driving capability modulation (DCM) [1], [17] | Or-and-inverter (OAI) cell [11] | Digitally controlled varactor (DCV) [18] |
|---|---|---|---|
| Resolution | Poor | High | High |
| Power | High | Medium | High |
| Linearity | Poor | Poor | Good |
| Operation Range | Wide | Narrow | Narrow |

This chapter is organized as follows. Section 2.2 describes the proposed hysteresis delay cell. Section 2.3 describes the architecture and circuit of the proposed DCO and explains how its power consumption is reduced. Section 2.4 analyzes the performance comparison results of the different DCO structures. In Section 2.5, the implementation and measurement results of the fabricated DCO chip are presented, together with an overall performance comparison with state-of-the-art DCOs. Finally, a brief summary is given in Section 2.6.

2.2 Hysteresis Delay Cell

Because a DCO/DCDL usually uses many delay cells to generate the desired clock output, designing a low-power delay cell is an important issue in all-digital clock generator design. The delay cell should provide a suitable and controllable delay with low power and hardware cost. The proposed hysteresis delay cell (HDC), which reduces the gate count and loading, is therefore well suited to all-digital clock generator applications. Fig. 2.2(a) illustrates the proposed HDCs used in the DCO, each of which contains one inverter (INV2) and one tri-state inverter (TINV). As the state of the TINV control signals (F1ON[0] ~ F1ON[P-1]) changes, different delays are obtained. The operating concept of the HDC is to control the driving current to obtain different propagation delays. When the TINV of an HDC is enabled, its output exhibits hysteresis during the transition, which produces different delay times in the delay chain.

Fig. 2.2: (a) Proposed HDC. (b) Equivalent circuit of HDC for analysis.

Fig. 2.2(b) illustrates the equivalent circuit of the HDC for analysis. The propagation delay Tp from N1 to N2 is a function of the loading capacitance CL and the equivalent resistances of the turn-on NMOS and PMOS in the driving inverter (INV1) [19]. In the general operating situation, CL remains constant, whereas the equivalent resistance of the turn-on MOS in INV1 varies with the saturation current and the drain-source voltage, as expressed by (2.2), where IDSAT is the saturation current of the transistor.

Fig. 2.3: Hysteresis phenomenon of HDC.

Fig. 2.4: The relation among input voltage of TINV, effective driving current, and INV1 delay (axes: controlled voltage of TINV (V); delay (ps)).

When TINV is enabled, its input (N3) does not follow the input of INV1 (N1) instantaneously, so TINV sinks the inverse current I2 and reduces the effective driving current from I1 to I3. This enlarges the delay of the delay chain. Fig. 2.3 shows the hysteresis phenomenon of the HDC, where the input signal transition is observed from SPICE simulation. In the beginning, N1 and N3 remain at high level and N2 is at low level. As N1 changes from high to low, N2 attempts to change from low to high. However, because N3 remains at high level for a while (delayed by INV2), TINV sinks the inverse current and slows down the pull-up of N2. Thus, (2.2) should be rewritten with the effective driving current, which changes from I1DSAT to I1DSAT - I2DSAT when TINV is enabled.
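The delay and resistance expressions referred to in this analysis are not reproduced in this text. As a point of reference only, the widely used first-order RC switching model of a CMOS inverter is sketched below. It is an assumption that the original equations take this form, and the symbols $R_{eqn}$, $R_{eqp}$, $R_{eq}'$, $V_{DD}$, and $\lambda$ (channel-length modulation) are introduced here for illustration.

```latex
% Textbook first-order model, assumed here for illustration only.
\begin{align}
  T_p &\approx 0.69\, C_L \,\frac{R_{eqn} + R_{eqp}}{2} \\
  R_{eq} &\approx \frac{3}{4}\,\frac{V_{DD}}{I_{DSAT}}
            \left(1 - \frac{7}{9}\,\lambda V_{DD}\right) \\
  % With TINV enabled, the driving current drops to I_{1DSAT} - I_{2DSAT}.
  R_{eq}' &\approx \frac{3}{4}\,\frac{V_{DD}}{I_{1DSAT} - I_{2DSAT}}
            \left(1 - \frac{7}{9}\,\lambda V_{DD}\right)
\end{align}
```

Under this model, $I_{1DSAT} - I_{2DSAT} < I_{1DSAT}$ gives $R_{eq}' > R_{eq}$, so $T_p$ grows when TINV is enabled, which is consistent with the larger delay described in the text.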

The relation among the input voltage of TINV, the effective driving current, and the INV1 delay is shown in Fig. 2.4. As the input voltage of TINV increases, the effective driving current of INV1 decreases, which enlarges the delay of the inverter chain. In addition, by using tri-state inverters of different driving capability from a given cell library, a set of HDC delay steps can be constructed for a specified DCO requirement.

2.3 The Proposed DCO Architecture

Fig. 2.5 illustrates the architecture of the proposed ultra-low-power DCO. Built from standard cells, it saves power while keeping resolution. To preserve both control-code resolution and operation range, the proposed DCO cascades the coarse-tuning and fine-tuning stages, which maintains control code-to-delay linearity and allows the operation range to be extended easily. Two low-power circuit design techniques are proposed here. First, the segmental delay line (SDL) in the coarse-tuning stage, whose delay cells are two-input AND gates, disables the switching of the redundant delay cells at the target operation frequency. Second, the hysteresis delay cell (HDC) is used in the fine-tuning stage to reduce the number of short-delay cells.

Fig. 2.5: Architecture of the proposed DCO.

2.3.1 Coarse-Tuning Stage

Fig. 2.6 shows the proposed segmental coarse-tuning stage, which is composed of 2^M-1 two-input AND gates that form an SDL, and a path-selection multiplexer. It provides 2^M different delay values by selecting among the delay paths formed by these 2^M-1 two-input AND gates. In conventional path-selection delay lines [11], [12], [18], the delay cell is composed of two inverters. When the delay line is required to provide a higher operation frequency, a shorter delay path is selected and the remaining delay cells are not used. However, these delay cells are not disabled and continue to switch. To reduce power consumption as the operating frequency changes, the corresponding enable signals (EN[2^M-2:0]) are set low to disable the redundant two-input AND gates.

Fig. 2.6: Proposed segmental coarse-tuning stage with SDL. (Signals shown: F_IN, EN[2^M-2:0], Coarse[M-1:0] into the decoder, RESET, DCO_OUT; paths P0 to P(2^M-1) into the path-selection MUX. In the example, EN[0] = 1, EN[1] = 1, EN[2] = 0, ..., EN[2^M-2] = 0, so the cells beyond the selected delay path are disabled.)
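The enable scheme of the segmental delay line can be sketched in a few lines of Python. This is a behavioral illustration only, not the decoder used in the design: the function names are hypothetical, and the mapping from the coarse code to the enable bits (path P[k] taps the line after k AND gates, so only gates 0 to k-1 stay enabled) is an assumption inferred from the example in Fig. 2.6.

```python
def sdl_enables(coarse_code: int, m: int) -> list[int]:
    """Return EN[0 .. 2**m - 2] for the selected path P[coarse_code].

    Assumed mapping (not stated explicitly in the text): path P[k] taps the
    line after k AND gates, so only gates 0..k-1 need to keep switching.
    """
    n_gates = 2 ** m - 1
    if not 0 <= coarse_code <= n_gates:  # 2**m selectable paths in total
        raise ValueError("coarse code out of range")
    return [1 if i < coarse_code else 0 for i in range(n_gates)]


def disabled_fraction(coarse_code: int, m: int) -> float:
    """Fraction of AND gates whose switching is suppressed."""
    enables = sdl_enables(coarse_code, m)
    return enables.count(0) / len(enables)


if __name__ == "__main__":
    # Path P2 with M = 5, matching the EN[0] = EN[1] = 1, EN[2] = 0 example.
    print(sdl_enables(2, 5)[:3])
    print(disabled_fraction(2, 5))
```

Shorter paths (higher output frequencies) disable a larger share of the gates, which is in line with the larger power saving reported at higher operation frequencies.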

2.3.2 Fine-Tuning Stage

Because the resolution of the coarse-tuning stage described above is not sufficient for typical DCO applications, a fine-tuning stage is added. To achieve better resolution and lower power consumption, the fine-tuning stage is divided into three sub-stages, as shown in Fig. 2.7. Note that the controllable range of each stage is larger than the delay step of the previous stage. As a result, the cascaded DCO structure has no dead zone larger than the LSB resolution of the DCO. The delay steps of the fine-tuning sub-stages differ: the delay cells of the 1st and 3rd stages have the largest and smallest delay steps, respectively. Therefore, the delay cell of the 3rd fine-tuning stage determines the LSB resolution of the DCO, and the controllable range of the 1st fine-tuning stage easily covers the delay step of the coarse-tuning stage. Since the proposed HDC provides a larger delay step than a DCV, the 1st fine-tuning stage employs P HDCs in place of many DCV cells, which saves power. Because of their better resolution, different DCVs are used in the 2nd and 3rd fine-tuning stages to improve the overall resolution of the DCO. A DCV adjusts the delay by controlling the gate capacitance of a logic gate through its input state [12], [18]. The 2nd and 3rd fine-tuning stages employ Q long-delay DCV cells (two-input NAND gates) and R short-delay DCV cells (tri-state inverters), respectively.

Fig. 2.7: Proposed fine-tuning stage with HDC and DCV.

To optimize both power consumption and resolution, a strategy for allocating the sub-stages of the proposed fine-tuning stage is introduced. First, to achieve a high operation frequency, P should be limited so that the total delay line of the fine-tuning stage does not become too long. A suitable HDC delay step can then be determined from P. Second, because the delay resolution is determined solely by the delay step of the DCV in the 3rd fine-tuning stage, a short-delay DCV that meets the resolution requirement is selected from the cell library. Once its delay step is fixed, R is chosen according to the required range of the 3rd fine-tuning stage and the loading capacitance. Finally, with the delay steps of the HDC and the short-delay DCV fixed, the delay step of the long-delay DCV and Q in the 2nd fine-tuning stage are determined. Note that Q can be reduced significantly by exploiting the HDC, which saves power. For example, if the required delay range is 260ps, 4 HDCs cover this range and 8 short-delay DCV cells provide the high resolution. In the final step, 32 long-delay DCV cells form the 2nd fine-tuning stage. As a result, the total power consumption and the resolution of the proposed fine-tuning stage are 40.28µW and 0.97ps, respectively, at 200MHz and 0.8V in a 0.13µm CMOS process.

2.4 DCO Performance Comparisons

2.4.1 Coarse-Tuning Stage Performance Comparisons

For performance comparison, we rebuilt the published approaches with an in-house 0.13µm CMOS standard cell library and compared them with our proposal. Because a DCO generally consists of coarse-tuning and fine-tuning stages, the performance comparisons are divided into two parts accordingly.

For the coarse-tuning stage, we reconstructed the conventional path-selection delay line with two-inverter delay cells for power comparison. For fair comparison, the conventional and the proposed segmental coarse-tuning stages have the same operation range. The simulated power consumption at different operation frequencies is shown in Fig. 2.8. Compared with the conventional approach, the proposed segmental coarse-tuning stage reduces the power consumption by 70% and 25% at 500MHz and 200MHz, respectively. Because the number of disabled redundant delay cells varies with the operation frequency, the segmental scheme achieves different power reduction ratios at different operation frequencies.

Fig. 2.8: Power comparisons of different coarse-tuning designs.

2.4.2 Fine-Tuning Stage Performance Comparisons

The fine-tuning stage determines many major performance indices of the DCO, such as LSB resolution, delay linearity, and power consumption. Therefore, the comparisons of the fine-tuning stage focus on these indices. In the cell-based design approach, many designs use DCM or DCV to construct the fine-tuning stage [1], [12], [17], [18]. For fair comparison, these designs are rebuilt with similar operation range, delay resolution, and number of control bits. To ensure correct functionality, the operation range of the fine-tuning stage in every comparison candidate should be larger than the minimum delay step of the two-input AND gate, which is 200ps in the in-house 0.13µm standard cell library. The fine-tuning stages rebuilt with the different design approaches are the DCM type (Approach I) [1], [17], the DCV type (Approach II) [18], and the combined DCM and DCV type (Approach III) [12]. Keeping the operation frequency range similar results in different numbers of delay cells in the different structures. For example, Approach I, Approach II, and Approach III use 256, 128, and 80 tri-state inverters, respectively. In contrast, the proposed structure needs only 12 tri-state inverters, 4 inverters, and 32 two-input NAND gates (based on the strategy described in Subsection 2.3.2, with P, Q, and R set to 4, 32, and 8, respectively).

Table 2.2: Performance Comparisons with Different Fine-Tuning Stages

| Design | Resolution (ps) | Total Power (µW) | Partial Power* (µW) | Gate Count | Range (ps) |
|---|---|---|---|---|---|
| Proposed | 0.97 | 40.28 | 36.31 | 48 | 261.34 |
| Approach I | 4.28 | 291.59 | - | 256 | 263.66 |
| Approach II | 1.07 | 233.61 | 228.77 | 128 | 266.9 |
| Approach III | 0.97 | 105.29 | 98.89 | 80 | 260.38 |

* Power consumption of long-delay stage
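The cell counts quoted for the proposed structure follow from P, Q, and R once each HDC is counted as one inverter plus one tri-state inverter (Section 2.2). The short sketch below reproduces that bookkeeping; the class and field names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineStage:
    p_hdc: int        # HDCs in the 1st fine-tuning stage
    q_long_dcv: int   # long-delay DCV cells (two-input NAND gates)
    r_short_dcv: int  # short-delay DCV cells (tri-state inverters)

    def cell_counts(self) -> dict[str, int]:
        # Each HDC contains one inverter (INV2) and one tri-state inverter (TINV).
        return {
            "tri_state_inverter": self.p_hdc + self.r_short_dcv,
            "inverter": self.p_hdc,
            "nand2": self.q_long_dcv,
        }

    def gate_count(self) -> int:
        return sum(self.cell_counts().values())


if __name__ == "__main__":
    stage = FineStage(p_hdc=4, q_long_dcv=32, r_short_dcv=8)
    print(stage.cell_counts())
    print(stage.gate_count())
```

With P = 4, Q = 32, and R = 8 this gives 12 tri-state inverters, 4 inverters, and 32 NAND gates, for a total of 48 gates, matching the gate count listed for the proposed design in Table 2.2.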

The performance comparisons, simulated at 200MHz and 0.8V in the typical corner, are summarized in Table 2.2. Note that all designs have similar LSB resolution except Approach I. In terms of power consumption and area, however, the proposed design shows a significant improvement. Since the proposed HDC can replace many DCV cells to obtain a wider operation range, the number of delay cells connected to each driving inverter, and hence the loading capacitance, is reduced, which saves both power and gate count. The power reduction ratios are 86.2%, 82.8%, and 61.7% compared with Approach I, Approach II, and Approach III, respectively. Fig. 2.9 also shows that our proposal offers high LSB resolution and low power compared with the other designs.

Fig. 2.9: Power and resolution comparisons of different fine-tuning designs.
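The reduction ratios follow directly from the Total Power column of Table 2.2. The sketch below recomputes them and also derives a power-per-picosecond figure for the long-delay stage from the Partial Power and Range columns. The dictionary layout and function names are illustrative choices, not part of the original work.

```python
# Values copied from Table 2.2:
# design -> (total power in uW, long-delay-stage power in uW or None, range in ps)
TABLE_2_2 = {
    "Proposed": (40.28, 36.31, 261.34),
    "Approach I": (291.59, None, 263.66),
    "Approach II": (233.61, 228.77, 266.9),
    "Approach III": (105.29, 98.89, 260.38),
}


def power_reduction_percent(baseline_uw: float, proposed_uw: float) -> float:
    """Relative power saving of the proposed design against a baseline."""
    return 100.0 * (baseline_uw - proposed_uw) / baseline_uw


def power_per_delay(partial_uw: float, range_ps: float) -> float:
    """Long-delay-stage power divided by the covered delay range (uW/ps)."""
    return partial_uw / range_ps


if __name__ == "__main__":
    proposed_total = TABLE_2_2["Proposed"][0]
    for name in ("Approach I", "Approach II", "Approach III"):
        saving = power_reduction_percent(TABLE_2_2[name][0], proposed_total)
        print(f"{name}: {saving:.1f}% less power")
    for name in ("Proposed", "Approach II"):
        _, partial, rng = TABLE_2_2[name]
        print(f"{name}: {power_per_delay(partial, rng):.2f} uW/ps")
```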

Except for Approach I, all comparison candidates employ a short-delay DCV cell as the finest delay cell; however, they use different types of long-delay stages. Thus, we focus on the power comparison of the long-delay stage in the different approaches. In contrast to Approach II, whose long-delay stage uses only long-delay DCV cells, our proposal exploits the HDC and hence needs fewer long-delay DCV cells. As a result, the power-to-delay ratios of the long-delay stage of our proposal and of Approach II are 0.14µW/ps (36.31µW/261.34ps) and 0.86µW/ps (228.77µW/266.9ps), respectively. This comparison shows that the HDC-based structure provides a better power-to-delay ratio than the pure DCV structure, implying that the HDC is more effective in saving power for a given delay.

Fig. 2.10: Microphotography and layout of DCO test chip.

Table 2.3: Measurement Results of Step/Range of Tuning Stage

| | Coarse-Tuning | 1st Fine-Tuning | 2nd Fine-Tuning | 3rd Fine-Tuning |
|---|---|---|---|---|
| Range (ps) | 3726.36 | 296.74 | 116.02 | 10.26 |
| Step (ps) | 120.21 | 98.91 | 3.74 | 1.47 |

2.5 Experimental Results and Comparisons

Based on the frequency range and resolution required by our application, the design parameters of the proposed DCO are set as follows: N=10, M=5, P=4, Q=32, and R=8. To verify the feasibility and performance of the proposed DCO in an advanced process, a test chip was fabricated in a 90nm 1P9M CMOS process; the microphotograph and layout of the DCO chip are shown in Fig. 2.10.

The DCO output signal is measured using LeCroy SDA4000A at 1V/25°C (supply of I/O pad is 2.5V) to test the performance. Due to the speed limitation of I/O pad, the DCO output frequency has to be divided by 2 when DCO operates at high frequency.

Table 2.3 shows the delay step and operation range of the tuning stages in the proposed DCO. The controllable range of each stage is larger than the step of the previous stage, and the average DCO resolution is 1.47ps. Fig. 2.11 compares the measurement results with post-layout simulation to illustrate the linearity of the proposed DCO. The rms and peak-to-peak phase jitter at 417MHz are 8.18ps and 49.05ps, respectively. Fig. 2.12 shows that the rms and peak-to-peak phase jitter are 8.24ps and 49.95ps, respectively, over 150,000 sweeps at 952MHz under a 1V supply with 60mV supply noise.

Fig. 2.11: Comparisons of measurement and post-layout simulation results.
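The cascading condition, that each stage's controllable range exceeds the delay step of the stage before it, can be checked mechanically against the measured values in Table 2.3. The following is a minimal sketch with hypothetical helper names; only the numbers are taken from the table.

```python
# (stage name, range in ps, step in ps), as measured in Table 2.3
STAGES = [
    ("coarse", 3726.36, 120.21),
    ("fine-1 (HDC)", 296.74, 98.91),
    ("fine-2 (long-delay DCV)", 116.02, 3.74),
    ("fine-3 (short-delay DCV)", 10.26, 1.47),
]


def cascade_margins(stages):
    """For each adjacent pair, return (stage name, its range minus the previous step)."""
    margins = []
    for (_, _, prev_step), (name, rng, _) in zip(stages, stages[1:]):
        margins.append((name, rng - prev_step))
    return margins


def has_no_dead_zone(stages) -> bool:
    """True when every stage's range covers the step of the stage before it."""
    return all(margin > 0 for _, margin in cascade_margins(stages))


if __name__ == "__main__":
    for name, margin in cascade_margins(STAGES):
        print(f"{name}: margin {margin:.2f} ps")
    print(has_no_dead_zone(STAGES))
```

All three margins are positive for the measured values, so the finest stage can always bridge one step of the stage before it.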

Table 2.4 compares the proposed DCO with state-of-the-art DCOs. The proposed DCO has the lowest power consumption among the listed designs. Furthermore, the proposed low-power solution does not induce any performance loss. Additionally, since the proposed DCO can be implemented with standard cells, it has good portability. As a result, the proposed DCO offers better resolution, operation range, linearity, and portability.

Fig. 2.12: Jitter histogram of DCO at 952MHz.

Table 2.4: DCO Performance Comparisons

| Performance Indices | Proposed DCO | JSSC'05 [15] | TCAS2'05 [18] | JSSC'04 [1] | JSSC'03 [11] |
|---|---|---|---|---|---|
| Process | 90nm CMOS | 0.18µm CMOS | 0.35µm CMOS | 0.35µm CMOS | 0.35µm CMOS |
| Supply Voltage (V) | 1 | 1.8 | 3.3 | 3 | 3.3 |
| DCO Control Word Length | 15 | 5 | 15 | 7 | 12 |
| Operation Range (MHz) | 191 ~ 952 | 413 ~ 485 | 18 ~ 214 | 152 ~ 366 | 45 ~ 510 |
| LSB Resolution (ps) | 1.47 | 2 | 1.55 | 10 ~ 150 | 5 |
| Power Consumption | 140µW (@200MHz) | 340µW (Static only) | 18mW (@200MHz) | 12mW (@366MHz) * | 50mW (@500MHz) * |
| Portability | Yes | No | Yes | Yes | Yes |

* Power consumption calculated from 50% of ADPLL [2].

2.6 Summary

In this chapter, we have proposed a hysteresis delay cell an ultra-low-power
