

CHAPTER 3 Hysteresis-Delay-Cell-Based Digitally Controlled Oscillator

3.3 Proposed HDC-Based Digitally Controlled Oscillator

Using the delay-tunable HDC proposed above, we design a 5 MHz low-power all-HDC-based DCO, as shown in Fig. 3.14. The proposed DCO is partitioned into two tuning stages. The 1st tuning stage, composed of HDC1 cells, extends the controllable range of the DCO. The 2nd tuning stage, a cascade of HDC2 cells, improves the delay resolution. Because the target frequency is 5 MHz, the total delay of the HDCs in the 2nd stage is kept below 200 ns (one 5 MHz clock period) under all PVT conditions. Furthermore, the controllable delay range of the 2nd tuning stage must cover the delay resolution of the 1st tuning stage to avoid false lock in PFTCG, ADPLL [5-8] or ADDLL [14] applications.

The delay resolution of the 1st tuning stage is the sum of the low-to-high propagation delay (TPLH) and the high-to-low propagation delay (TPHL) of HDC1. The architecture of HDC1 is illustrated in Fig. 3.15. Based on the Sarawi HDC, we add an extra transistor, mp4, as the header. The ENABLE signal isolates the redundant delay elements from the closed loop and thereby saves power. In general, the dynamic power Pdym of the 1st tuning stage is expressed as

Fig. 3.14. Architecture of the proposed HDC-based DCO.

Fig. 3.15. Delay element of the 1st tuning stage.

$P_{dym} = C_L V_{DD}^{2} f$    (3-29)

where CL is the overall loading capacitance and f is the circuit operating frequency.

When the redundant delay elements outside the closed loop in the 1st tuning stage of the DCO are not disabled, the power becomes

$P_{dym} = M C_{cell} V_{DD}^{2} \cdot \dfrac{1}{N T_D}$    (3-30)

where M is the total number of HDC1, N is the number of HDC1 in the closed loop, Ccell and TD are the capacitance and delay value of one HDC1, respectively. When the ENABLE signal turns off the redundant delay, the dynamic power is written as

$P_{dym} = N C_{cell} V_{DD}^{2} \cdot \dfrac{1}{N T_D} = \dfrac{C_{cell} V_{DD}^{2}}{T_D}$    (3-31)

With the redundant elements disabled, the dynamic power is N/M times the power with the elements left enabled. In other words, the power consumption with disabled redundant elements is independent of N, so the power of the 1st tuning stage stays fixed, as shown in (3-31), whatever the DCO operating frequency is. Consequently, the power and delay characteristics of HDC1 determine the overall power performance of the 1st tuning stage.
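The effect of the ENABLE gating can be checked numerically. The following is a minimal sketch, assuming that TD is the full-cycle delay (TPLH + TPHL) of one HDC1, so that a loop of N cells oscillates at f = 1/(N·TD). The cell capacitance, the total cell count M = 128 and all function names are hypothetical and chosen only for illustration.

```python
def dynamic_power(c_load, vdd, freq):
    """Dynamic power of a switching load: P = C * VDD^2 * f."""
    return c_load * vdd ** 2 * freq


def loop_frequency(n_loop, t_d):
    """Assumed oscillation frequency of a closed loop of n_loop cells.

    Assumption: t_d is the full-cycle delay (TPLH + TPHL) of one cell,
    so one oscillation period is n_loop * t_d.
    """
    return 1.0 / (n_loop * t_d)


def power_without_enable(m_total, n_loop, c_cell, t_d, vdd):
    """All m_total cells toggle, including those outside the loop."""
    return dynamic_power(m_total * c_cell, vdd, loop_frequency(n_loop, t_d))


def power_with_enable(n_loop, c_cell, t_d, vdd):
    """Only the n_loop cells inside the closed loop toggle."""
    return dynamic_power(n_loop * c_cell, vdd, loop_frequency(n_loop, t_d))


if __name__ == "__main__":
    C_CELL = 1e-15   # hypothetical capacitance of one HDC1 (F)
    T_D = 3.246e-9   # hypothetical per-cell delay (s)
    VDD = 1.0        # supply voltage (V)
    M = 128          # hypothetical total number of HDC1 cells
    for n in (32, 64, 128):
        gated = power_with_enable(n, C_CELL, T_D, VDD)
        ungated = power_without_enable(M, n, C_CELL, T_D, VDD)
        # gated stays constant; gated / ungated follows N / M
        print(n, gated, ungated, gated / ungated)
```

Under these assumptions the gated power does not change with N, while the ungated power grows as the loop gets shorter.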

The 2nd stage delay element HDC2 is the same as the delay-tunable HDC proposed in Section 3.2, which provides both delay resolution and delay offset. To cover the delay resolution of the 1st tuning stage, the 2nd tuning stage must have a sufficient controllable range. Thus, the number of elements in the 2nd tuning stage is increased to 64. Table 3.5 summarizes the control code length, controllable range and delay resolution of the 5 MHz all-HDC-based DCO.

For DCO applications in the hundred-MHz range, we can apply the same DCO architecture shown in Fig. 3.14. The HDC1 cells in the 1st tuning stage can be replaced with small-delay cells, such as AND logic-gate cells, to decrease the delay value and increase the operating frequency. The 2nd tuning stage still uses the cascaded delay-tunable HDCs to preserve the resolution. Table 3.6 shows the simulation results of the modified 200 MHz HDC-based DCO. The code length is 14 bits, with 6 bits in the 1st tuning stage and 8 bits in the 2nd tuning stage. The LSB delay resolution is still 0.78 ps.

Table 3.5. Controllable range and delay resolution of 5 MHz HDC-based DCO.

                      1st Tuning Stage    2nd Tuning Stage
Code Length (bits)    7                   13
Range (ns)            412.201             6.362
Resolution (ps)       3246                0.78

Table 3.6. Controllable range and delay resolution of 200 MHz HDC-based DCO.

                      1st Tuning Stage    2nd Tuning Stage
Code Length (bits)    6                   8
Range (ns)            10.09               0.199
Resolution (ps)       160                 0.78
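The coverage requirement between the two tuning stages can be checked directly against the values in Tables 3.5 and 3.6. The sketch below assumes uniformly spaced steps, so that a stage with b control bits spans (2^b - 1) LSB steps; the function names are hypothetical. Because the tabulated resolutions are rounded, the computed ranges only approximate the tabulated ranges.

```python
def stage_range_ns(code_bits, resolution_ps):
    """Approximate range of a uniformly stepped stage: (2^bits - 1) LSB steps."""
    return (2 ** code_bits - 1) * resolution_ps / 1000.0


def fine_stage_covers_coarse(fine_range_ns, coarse_resolution_ps):
    """True if the fine stage range spans at least one coarse-stage step."""
    return fine_range_ns * 1000.0 >= coarse_resolution_ps


if __name__ == "__main__":
    # (label, coarse bits, coarse resolution ps, fine bits, fine resolution ps,
    #  tabulated fine range ns) taken from Tables 3.5 and 3.6
    designs = [
        ("5 MHz DCO", 7, 3246.0, 13, 0.78, 6.362),
        ("200 MHz DCO", 6, 160.0, 8, 0.78, 0.199),
    ]
    for name, cb, cres, fb, fres, frange in designs:
        print(name,
              "coarse range ~", stage_range_ns(cb, cres), "ns,",
              "fine range ~", stage_range_ns(fb, fres), "ns,",
              "covers:", fine_stage_covers_coarse(frange, cres))
```

In both designs the tabulated 2nd-stage range exceeds one 1st-stage step, which is the condition stated for avoiding false lock.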

3.4 Simulation Result

To meet the low-power and high-delay-resolution requirements of WBAN applications [3], the proposed HDC-based DCO is implemented and verified in standard-process 90 nm high-threshold-voltage (SPHVT) CMOS technology at both 5 MHz and 200 MHz operating frequencies. Under a 1.0 V supply, the power consumption is 2.6 µW at 5 MHz and 14.3 µW at 200 MHz. The 200 MHz DCO consumes more power than the 5 MHz DCO because of the poor energy efficiency of the standard AND logic gates in its 1st tuning stage. The LSB resolution of both DCOs with the delay-tunable HDC is 0.78 ps. The 5 MHz DCO requires a 20-bit control word, and its operating frequency ranges from 1.9 MHz (526.3 ns) to 9.4 MHz (106.8 ns).

The other DCO, operating at 200 MHz, has a 14-bit codeword, and its operating range is from 69.8 MHz (14.3 ns) to 249.4 MHz (4.01 ns) with 0.78 ps delay resolution.
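The quoted periods can be cross-checked against the quoted frequencies and against the stage ranges in Tables 3.5 and 3.6. The sketch below assumes that the span between the slowest and fastest period is approximately the sum of the two stage ranges; that relation is an assumption about how the stages combine, and the helper names are hypothetical.

```python
def freq_mhz(period_ns):
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / period_ns


def period_span_ns(slow_period_ns, fast_period_ns):
    """Difference between the longest and shortest oscillation period."""
    return slow_period_ns - fast_period_ns


# Values quoted in the text and in Tables 3.5 / 3.6.
DCO_5MHZ = {"slow_ns": 526.3, "fast_ns": 106.8, "stage_ranges_ns": (412.201, 6.362)}
DCO_200MHZ = {"slow_ns": 14.3, "fast_ns": 4.01, "stage_ranges_ns": (10.09, 0.199)}

if __name__ == "__main__":
    for name, dco in (("5 MHz", DCO_5MHZ), ("200 MHz", DCO_200MHZ)):
        span = period_span_ns(dco["slow_ns"], dco["fast_ns"])
        total_range = sum(dco["stage_ranges_ns"])
        print(name,
              "f_min ~", freq_mhz(dco["slow_ns"]), "MHz,",
              "f_max ~", freq_mhz(dco["fast_ns"]), "MHz,",
              "period span", span, "ns vs stage ranges", total_range, "ns")
```

The small differences between the two spans are consistent with the rounding of the quoted periods.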

Note that it is easy to extend the controllable range of these DCOs by changing the HDC numbers or using other small delay cells in the 1st tuning stage. In order to fine-tune the delay resolution, the equivalent transconductance of delay tunable HDCs in the 2nd tuning stage can be easily controlled as well.

Table 3.7 lists the overall comparison to the state-of-the-art DCOs. The proposed DCO has the least power dissipation compared with other designs, and also achieves high delay resolution. As a result, the proposed HDC-based DCO indeed has the benefits of better resolution, operation range and delay linearity for low power applications.

Table 3.7. Performance comparison of DCO.

Process 90nm 90nm 90nm 0.18µm 0.13µm 0.13µm 0.35µm Supply

*Power consumption estimated from 50% of ADPLL

3.5 Summary

In this chapter, we introduce a low-power, small-area and high-delay-resolution DCO based on the HDC. Compared with standard cells, the proposed HDC not only offers low power and small area, but also achieves high delay resolution with good linearity. With the aid of the proposed HDC, the 5 MHz DCO has 0.78 ps LSB delay resolution and consumes only 2.6 µW under 1.0 V. The other proposed design, a 200 MHz DCO, provides 0.78 ps resolution and consumes 14.3 µW under a 1.0 V supply in standard-process 90 nm CMOS technology, which is the lowest power dissipation among the state-of-the-art DCOs compared.

As a result, this work enables 97.6 % power reduction and 99.6 % area reduction in comparison with the previous DCO in Chapter 2 under 1.0 V. In terms of the all-digital PFTCG, the overall power and area reduction are 73.0 % and 47.6 %, respectively.

Fig. 3.16. PFTCG comparison: (a) power, (b) area.

CHAPTER 4 PVT Tolerance Clock Generator

In general, the quartz crystal oscillator is the usual choice of reference clock source in communication systems. For WBAN applications, especially in WSN, the clock source must be low power, low cost and small in area. Although the quartz crystal oscillator provides good stability under different PVT variations, its milliwatt-level power consumption [26], large area and need for extra board-level component bonding are fatal disadvantages. The additional board components also make system integration difficult and increase the manufacturing cost. Silicon micro-electro-mechanical systems (MEMS) [12] have been proposed to replace the quartz crystal oscillator. However, the extra CMOS process steps, wafer-level packaging technologies and long manufacturing duration increase both the cost and the time to market (TTM). Furthermore, in both the quartz crystal oscillator and MEMS approaches, the frequency accuracy degrades as the operating time increases, and the generated clock cannot be calibrated by the system.

At the system level, the ring oscillator seems to be a better solution in terms of chip integration, power budget and area. A 7 MHz ring oscillator [13] has been proposed that adds a band-gap voltage regulator, temperature and process compensation circuits and a comparator. However, the operational amplifiers in the voltage regulator and the comparator consume large power. Moreover, as CMOS technology scales to more advanced generations, the PVT variations become worse, as shown in Fig. 1.4. The frequency of a 5 MHz ring oscillator varies by about 60 % under the worst-case PVT corners.

For low-power and integrable clock source applications such as WBAN, the design challenge is to withstand severe PVT variations. In the following sections, we describe a new methodology to generate a stable and low-power clock source under any PVT variations. The proposed design also has frequency tuning capability to fine-tune the clock frequency and avoid frequency drift over a long service life.

The design specification is 5 MHz, the same as the PFTCG reference clock source.

Fig. 4.1. Architecture of the proposed PVT tolerance clock generator.

4.1 Architecture

The proposed PVT tolerance clock generator is shown in Fig. 4.1. It is composed of three blocks: the PVT detector, the mapper and the clock oscillator. The PVT detector extracts the delay information under different PVT conditions. The mapper converts this information to a digital codeword for calibrating the PVT variations. The clock oscillator receives the digital codeword from the mapper and generates the clock at the target frequency.

The process parameters are provided by the standard chip testing procedure and stored in one-time programmable (OTP) devices to calibrate the process variation. The process calibration can be executed in the mapper, or in both the PVT detector and the mapper. The frequency tuning command is fed back from the system frequency recovery loop to fine-tune the clock frequency [2].

4.2 Circuit Designs

4.2.1 PVT Detector

The PVT detector senses the PVT conditions and transfers the response of delay information to a digital code. The delay information is extracted by delay cells which have different PVT sensitivities. Suppose there are two different delay cells in the PVT detector, namely the reference delay cells and variable delay cells. These two delay cells with different PVT sensitivities result in different delay variation rates under different PVT conditions. The PVT detector can observe the relative delay variation between the reference cells and variable cells by the delay ratio

$R(P,V,T) = \dfrac{D_{VAR}(P,V,T)}{D_{REF}(P,V,T)}$    (4-1)

where DVAR(P,V,T) is the delay of a variable cell and DREF(P,V,T) is the delay of a reference cell. Both delay values depend on the PVT conditions, so the delay ratio R(P,V,T) is a function of PVT as well.

Fig. 4.2 shows the delay ratio of the variable cell (ND4M0H_L) to the reference cell (BUFM8H) versus absolute delay under different PVT corners. ND4M0H_L is an ND4M0H cell loaded by another ND4M0H at its output. The ND4M0H and BUFM8H cells are both standard cells in UMC 90 nm SPHVT technology. Each cell delay is measured in response to a step input. In Fig. 4.2, the x-axis is the delay ratio R(P,V,T) and the y-axis is the absolute delay value. The three groups of data are the simulation results at the three process corners (FF, TT, SS). Each group covers the simulated voltage (0.9 V ~ 1.1 V) and temperature (0℃ ~ 125℃) variations.

Fig. 4.2. Delay ratio of ND4M0H_L to BUFM8H (x-axis: delay ratio R; y-axis: delay DREF in ns; data groups: SS, TT, FF).

The relation between the delay ratio and the absolute delay value of the reference cell can be modeled as a one-to-one mapping function under a fixed process variation. Thus, under a given process condition, the delay variation is approximated by a second-order curve, written as

$D_{Model} = a R^{2} + b R + c$    (4-2)

where a, b and c represent the process variation coefficients. These process variation parameters can be obtained from the chip testing procedure and stored. Then, the second-order modeling error is expressed as

) )

The simulated second-order modeling curves are shown in Fig. 4.3 (a). The modeling error is limited by the difference in PVT sensitivity between the reference cells and the variable cells, which appears as the ripple of the curves shown in Fig. 4.3 (b). The maximum modeling error is about 2.05 %, as shown in Fig. 4.4.
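The second-order model and its error can be illustrated with a short sketch. It assumes that the modeling error is the relative deviation of the modeled delay from the simulated reference-cell delay, expressed in percent. The coefficients and sample points below are hypothetical and are not taken from the simulations in this chapter.

```python
def model_delay(r, a, b, c):
    """Second-order delay model: D_Model = a*R^2 + b*R + c."""
    return a * r * r + b * r + c


def modeling_error_pct(r, d_ref, a, b, c):
    """Assumed error definition: |D_Model - D_REF| / D_REF in percent."""
    return abs(model_delay(r, a, b, c) - d_ref) / d_ref * 100.0


def max_modeling_error_pct(samples, a, b, c):
    """Largest modeling error over (delay ratio, reference delay) samples."""
    return max(modeling_error_pct(r, d, a, b, c) for r, d in samples)


if __name__ == "__main__":
    # Hypothetical coefficients for one process corner (delay in ns).
    A, B, C = 0.0, -3.0, 0.96
    # Hypothetical (R, D_REF) pairs across voltage and temperature.
    samples = [(0.22, 0.30), (0.25, 0.20), (0.28, 0.12)]
    print(max_modeling_error_pct(samples, A, B, C))
```

In practice the coefficients would come from the chip testing procedure, one set per process condition.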

Fig. 4.3. Second order modeling curve of delay value.

Fig. 4.4. Second order modeling error (EModel versus delay ratio for the SS, TT and FF corners; maximum modeling error = 2.0483 %).

For implementation, we have to partition the delay ratio into several intervals and map the digital codeword of delay ratio into the real delay value. In the i-th delay ratio region, we can map these delay ratios into a delay value DPartition,i.

c

where Ri,MIN is the minimum delay ratio in the i-th region and Ri,MAX is the maximum delay ratio in the i-th region. DModel,i,MAX and DModel,i,MIN are the maximum and minimum modeling delay in the i-th delay ratio region, respectively. RPartition,i is the corresponding delay ratio in the i-th region. The frequency error after partition is written as

Fig. 4.5. Architecture of the proposed PVT detector.

Fig. 4.5 describes the proposed architecture of the PVT detector. The PFD is the phase frequency detector, whose architecture is the same as in Fig. 2.4. Each PFD connects to a pair of delay lines, one composed of reference cells and one of variable cells, with different cell counts. At the beginning, a step input (ENABLE) triggers the PVT detector. The step then passes through the two types of delay lines. Each PFD detects whether one line of its pair leads or lags the other and generates the up and down signals. From these up and down signals, we can estimate the delay ratio of a single reference cell to a single variable cell.
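The lead/lag comparison can be sketched as a thermometer-style ratio estimator. The line lengths used here (82 reference cells per line and 292 to 375 variable cells) are the ones reported in Section 4.3; the behavioral PFD model, the ratio convention R = DVAR/DREF from (4-1) and the function names are assumptions made for illustration.

```python
REF_CELLS = 82                     # cells in every reference delay line
VAR_CELLS = list(range(292, 376))  # 84 variable delay lines, 292..375 cells


def pfd_outputs(d_var, d_ref):
    """Behavioral PFD model: True where the variable line lags the reference line."""
    t_ref = REF_CELLS * d_ref
    return [n * d_var > t_ref for n in VAR_CELLS]


def ratio_bounds(outputs):
    """Bound R = d_var / d_ref from the lead/lag pattern.

    A lagging line with n cells implies R > 82 / n,
    a leading (or tied) line implies R <= 82 / n.
    """
    lagging = [n for n, lag in zip(VAR_CELLS, outputs) if lag]
    leading = [n for n, lag in zip(VAR_CELLS, outputs) if not lag]
    lower = REF_CELLS / min(lagging) if lagging else 0.0
    upper = REF_CELLS / max(leading) if leading else float("inf")
    return lower, upper


if __name__ == "__main__":
    d_ref = 0.2            # hypothetical reference-cell delay (ns)
    d_var = 0.2495 * d_ref  # hypothetical variable-cell delay (ns)
    outs = pfd_outputs(d_var, d_ref)
    print(sum(outs), ratio_bounds(outs))
```

Each detector pair contributes one threshold at R = 82/n, so the 84 pairs resolve the ratio to one of the intervals between adjacent thresholds.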

The encoder transforms the up/down signals into a partitioned delay ratio and maps that delay ratio to the real delay time of the reference cell using the process variation parameters, as in (4-6). Fig. 4.6 shows the simulated partition curves of the PVT detector. Within a given delay ratio region, all delay ratio values are mapped to one fixed delay. Fig. 4.7 shows the partition error under different PVT conditions. The maximum partition error is about 2.03 %.
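The partition step can be sketched as a lookup from a delay ratio to its interval and then to one fixed delay. The interval edges follow the line lengths reported in Section 4.3 (82 reference cells against 292 to 375 variable cells). The rule used for the fixed delay, the midpoint of the second-order model evaluated at the two interval edges, is an assumption, as are the coefficients and function names.

```python
import bisect

REF_CELLS = 82
# One threshold per detector pair: R = 82 / n for n = 292..375 (84 edges).
EDGES = sorted(REF_CELLS / n for n in range(292, 376))


def model_delay(r, a, b, c):
    """Second-order delay model: D_Model = a*R^2 + b*R + c."""
    return a * r * r + b * r + c


def region_index(r):
    """Index i of the interval [EDGES[i], EDGES[i+1]) containing r (clamped)."""
    i = bisect.bisect_right(EDGES, r) - 1
    return min(max(i, 0), len(EDGES) - 2)


def partition_delay(i, a, b, c):
    """Fixed delay assigned to interval i.

    Assumption: midpoint of the model at the interval edges, which equals
    the midpoint of the maximum and minimum modeled delay when the model
    is monotonic inside the interval.
    """
    d_lo = model_delay(EDGES[i], a, b, c)
    d_hi = model_delay(EDGES[i + 1], a, b, c)
    return 0.5 * (d_lo + d_hi)


if __name__ == "__main__":
    A, B, C = 0.0, -3.0, 0.96  # hypothetical process coefficients (ns)
    i = region_index(0.2495)
    print(len(EDGES) - 1, i, partition_delay(i, A, B, C))
```

With 84 thresholds this yields 83 intervals, matching the interval count quoted in the simulation results.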

Fig. 4.6. Second order partition curve of delay value.

Fig. 4.7. Second order partition error (EPartition versus delay ratio for the SS, TT and FF corners; maximum partition error = 2.0258 %).

4.2.2 Mapper

The mapper converts the PVT information, that is, the absolute delay value of the reference cell, to a digital control code for the oscillator using the process variation parameters. In other words, the mapper collapses the three-dimensional variation in process, voltage and temperature into a one-dimensional oscillator codeword. By (4-2), the delay of an oscillator cell can be modeled as

c

where M is the mapping function of the mapper, and a′, b′ and c′ are the modified process variation coefficients. Then, the number of clock oscillator cells in the closed loop is expressed as

c

where f is the target frequency of the generated clock from clock oscillator. The oscillator control codeword can be regarded as clock oscillator cell numbers with an offset in c′′. Hence, we summarize (4-9) as

c

Thus, we can combine the encoder in PVT detector into the mapper. In other words, the mapper would transfer the delay ratio from PVT detector to oscillator codeword by modified process variation parameters as

$Codeword_{Mapping} = a'' R^{2} + b'' R + c'''$    (4-11)

The mapper also receives the frequency tuning command from system frequency recovery loop [2]. The frequency tuning command controls the oscillator codeword and fine-tunes the output clock frequency.

$Codeword_{Mapping} = a'' R^{2} + b'' R + c''' + d$    (4-12)

where d is the tuning step from frequency tuning command. The process parameters a′′, b′′ and c′′′ can be acquired and stored in the OTP devices. The error after mapper is written as

Fig. 4.8. Second order mapping curve of oscillator codeword.

Fig. 4.9. Second order mapping error (EMapping versus delay ratio; maximum mapping error = 1.919 %).

) )

Fig. 4.8 shows the simulated mapping curves. The oscillator cell, CKINVM8H, is also a standard cell in UMC 90 nm SPHVT. The mapping curves trend in the opposite direction to the partition curves shown in Fig. 4.6. A larger delay ratio implies less delay per oscillator cell, so more oscillator cells are needed for the same clock period. Fig. 4.9 depicts the mapping error under different PVT conditions. The mapping error depends on the partition error, the PVT sensitivity of the oscillator cell and the clock oscillator resolution. The maximum mapping error is about 1.92 %.
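The mapper behavior described in this section can be sketched as a second-order polynomial in the delay ratio plus a tuning offset. This is a sketch under stated assumptions: the polynomial form follows the second-order mapping curve and the coefficients a′′, b′′, c′′′ and tuning step d named in the text, while the rounding, the optional clamping to a maximum code and the numeric values are hypothetical additions.

```python
def mapper_codeword(r, a2, b2, c3, d=0, max_code=None):
    """Map a delay ratio r to an oscillator codeword.

    a2, b2, c3 stand for the modified process coefficients a'', b'', c''';
    d is the tuning step from the frequency tuning command.
    Rounding and clamping are illustrative assumptions.
    """
    code = round(a2 * r * r + b2 * r + c3) + d
    if max_code is not None:
        code = min(max(code, 0), max_code)
    return code


if __name__ == "__main__":
    # Hypothetical coefficients: codeword grows with the delay ratio.
    A2, B2, C3 = 0.0, 1000.0, 0.0
    for ratio in (0.22, 0.25, 0.28):
        print(ratio, mapper_codeword(ratio, A2, B2, C3),
              mapper_codeword(ratio, A2, B2, C3, d=3))
```

The tuning step d shifts the codeword without changing the stored process coefficients, which is what lets the system loop fine-tune the frequency after the one-time detection.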

4.2.3 Clock Oscillator

In the clock oscillator, the PVT sensitivity and the delay resolution affect the mapping error. The ripple in the mapping curve, as shown in Fig. 4.8(b), results from the PVT sensitivity of the oscillator cells. Lower PVT sensitivity increases the stability of the clock oscillator and reduces the mapping error in the mapper. High delay resolution is also required to improve the frequency accuracy of the clock oscillator.

The block diagram of the clock oscillator is shown in Fig. 4.10. It contains a delay line composed of oscillator cells. The path selectors are constructed from several tri-state buffers [5]. The oscillator encoder encodes the oscillator codeword (CodewordMapping) into the control signals (ON) of the tri-state buffers. The output of the tri-state buffers is divided by 8 to generate the target 5 MHz clock. Although these tri-state buffers introduce extra delay uncertainty into the clock oscillator under different PVT conditions, this uncertainty is negligible relative to the long delay line. In addition, the non-ideal effect can also be calibrated by the process parameters and the frequency tuning command, as in (4-12).
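The relation between the number of oscillator cells and the output clock can be sketched as follows. The sketch assumes an inverting ring whose period is 2·N·D for N cells of delay D, followed by the divide-by-8 stage described above. The ring-period relation, the cell delay value and the function names are assumptions for illustration, and the delay of the path selectors is ignored.

```python
def cells_for_target(f_out_hz, divider, cell_delay_s):
    """Number of cells needed so that the divided clock hits f_out_hz.

    Assumption: ring period = 2 * N * cell_delay (inverting ring),
    path-selector delay ignored.
    """
    t_osc = 1.0 / (f_out_hz * divider)  # period before the divider
    return round(t_osc / (2.0 * cell_delay_s))


def output_frequency(n_cells, cell_delay_s, divider):
    """Divided output frequency for a ring of n_cells cells."""
    return 1.0 / (2.0 * n_cells * cell_delay_s * divider)


if __name__ == "__main__":
    D_CELL = 25e-12  # hypothetical oscillator-cell delay (s)
    n = cells_for_target(5e6, 8, D_CELL)
    print(n, output_frequency(n, D_CELL, 8))
    # A faster corner (smaller cell delay) needs more cells for the same clock.
    print(cells_for_target(5e6, 8, 20e-12))
```

This reproduces the trend stated for the mapping curves: a smaller cell delay requires a larger cell count for the same clock period.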

Fig. 4.10. Architecture of the proposed clock oscillator.

4.3 Simulation Result

The simulation results of the proposed PVT tolerance clock generator are shown in the sections above. The PVT detector uses 84 pairs of delay detector circuits to estimate the delay ratio of a reference cell (ND4M0H_L) to a variable cell (BUFM8H). The 84 variable-cell delay lines contain 292 to 375 cells, one count per line, and every reference-cell delay line contains 82 cells. We separate the delay ratio into 83 intervals. The maximum partition error is 2.03 %. After the mapper and clock oscillator, the frequency error of the output clock is less than 1.92 % without the frequency tuning command. The frequency error can then be reduced to 20 ppm by the system frequency calibration loop [2].

The response time of the proposed PVT tolerance clock generator is less than 100 ns, set by the delay lines in the PVT detector. After system reset, the PVT detector is triggered once and then holds the detected PVT information until the system issues a frequency re-tracking command in response to a severe PVT change. The power consumption of the PVT detector is about 15.39 mW during the 100 ns response. Because the PVT detector operates only once, its power can easily be reduced by extending the response time.
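The one-shot nature of the detection can be put in energy terms. The sketch below multiplies the quoted detector power by the quoted response time and, as a purely hypothetical scenario, averages that energy over a re-detection interval; the interval value and function names are assumptions.

```python
def detection_energy_j(power_w, duration_s):
    """Energy of one detection: E = P * t."""
    return power_w * duration_s


def average_power_w(energy_j, interval_s):
    """Average power if one detection is repeated every interval_s seconds."""
    return energy_j / interval_s


if __name__ == "__main__":
    energy = detection_energy_j(15.39e-3, 100e-9)  # values quoted in the text
    print(energy)                                  # about 1.539 nJ
    # Hypothetical re-tracking once per millisecond.
    print(average_power_w(energy, 1e-3))
```

Even with periodic re-tracking at this hypothetical rate, the averaged detector power would be small compared with the steady-state oscillator power.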

After detection, the steady-state power consumption of the proposed PVT tolerance clock generator, which comes from the clock oscillator circuits, is 343 µW at 5 MHz under a 1.0 V supply.

4.4 Implementation

The proposed all-digital, cell-based 5 MHz PVT tolerance clock generator is implemented in UMC 90 nm technology. Fig. 4.11 summarizes the area distribution. The PVT detector, mapper and clock oscillator occupy 89.81 %, 3.66 % and 6.52 % of the overall area, respectively. The overall area is about 0.28 mm². Fig. 4.12 is the layout view of the proposed design. The non-blocked part in Fig. 4.12 is another design integrated with the proposed PVT tolerance clock generator.

Fig. 4.11. Area distribution of PVT tolerance clock generator.

Fig. 4.12. Layout of the proposed PVT tolerance clock generator.

Table 4.1. Performance comparison of clock sources.

Process 90nm With 90nm

Oscillator Pad

Quartz Crystal) 2.25 1.6

Max. Frequency Error (%)

1.9 / 0.002

(FCL+DPR[2]) 0.005 0.004 2.6

Frequency Tunable Yes No No No

Table 4.1 lists the comparison among different clock generators. The proposed PVT tolerance clock generator consumes less power and occupies less area than the quartz crystal oscillator [26], MEMS [12] and the ring-oscillator-based design with calibration circuits [13]. By means of the frequency tuning capability provided by the frequency calibration loop (FCL) and DFR [2], the proposed design achieves a maximum frequency error similar to that of the quartz crystal oscillator [26] and MEMS [12] approaches. As the service time increases, the frequency tuning command provides immediate frequency calibration to avoid frequency drift. We can always fine-tune the frequency of the clock generator to improve and maintain the clock frequency accuracy.

Moreover, these two approaches, the quartz crystal oscillator [26] and MEMS [12], incur extra manufacturing costs and difficulties in system integration. The proposed design is all-digital and can be integrated with the system in standard CMOS technology.

Although the ring-oscillator-based design [13] overcomes the PVT variations in 0.25 µm technology with calibration circuits, the design complexity of minimizing power in the presence of the band-gap voltage regulator is the major challenge in deep sub-micron CMOS processes.

4.5 Summary

For replacing the quartz crystal oscillator, we propose a new method to design a

