
Chapter 2 Design Challenges of DLL-Based Clock Generator


2.3.4 Project Design Concepts

The DLL-based clock generator used in the power management system needs a wide output frequency range and multiple multiplication factors. To operate over a wide frequency range, designers must overcome the lock state issue and the static phase error issue. To provide multiple multiplication factors, designers must decide which multiplication method to use and how many multiplication factors to support.

From the previous discussion, we know that the lock detector is not suitable for the VCDL last bit switching application. In this project, we use the startup circuit proposed in [7] to set the initial conditions of the system so that the DLL always locks into the correct lock state and its operating range is widened.

To produce more multiplication factors, we use an edge combiner based on the N/2-scale multiplication method. The edge combiner proposed in [17] is suitable for this project. According to the switching patterns, we design a controller for the blocks in the whole system. With the controller attached, this edge combiner can produce the output frequency Fout = (N/2) Fref. With different control patterns, N is programmable.
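The relation Fout = (N/2) Fref can be tabulated with a short sketch. The function name and the 400MHz example reference are illustrative choices, not values fixed by this section; N is taken as an integer from 1 to 8.

```python
def output_frequency_hz(f_ref_hz: float, n: int) -> float:
    """Return Fout = (N/2) * Fref for a programmable positive integer N."""
    if n < 1:
        raise ValueError("N must be a positive integer")
    return n * f_ref_hz / 2


# Example reference frequency (assumed for illustration): 400 MHz.
f_ref = 400e6
for n in range(1, 9):
    print(f"N = {n}: Fout = {output_frequency_hz(f_ref, n) / 1e6:.0f} MHz")
```

In this notation, N = 8 corresponds to multiplication by 4 and N = 1 to multiplication by 1/2.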

The static phase error in this project is decreased by the pulse reshaper circuit proposed in [14]. The pulse reshaper circuit changes the characteristic plot of the PD-CP to decrease the system static phase error. With the effect of the static phase error reduced, the clock generator can maintain the output signal performance even when the operating frequency range is wide.

In the next chapter, we will introduce the system architecture and each block in detail.

Chapter 3

Target Circuit/System Introduction

This chapter starts with an introduction to the system architecture of the DLL-based clock generator. After going through the system operation, we describe the details of every building block used in this project.

3.1 System Architecture

From Chapter 2, we know that the traditional architecture in Fig. 2.4 produces an undesired glitch when the VCDL last bit is switched. The undesired glitch confuses the PFD and lets the system lock in the harmonic lock state. To overcome this issue, we use the multi-PFD-CP pairs architecture proposed in [16]. The modified system architecture is shown in Fig. 3.1. Because the inputs of each PFD are fixed, switching between PFD-CP pairs achieves the last bit switching function and avoids the undesired glitch.

The operation of this architecture is as follows. The startup circuit sets the initial conditions of the system so that the DLL locks in the correct lock state and the operating frequency range is widened. The PFD compares the phase difference of its inputs and sends the up/down signals to the CP. The CP converts the up/down voltage signals into a current signal that charges or discharges the LF. The control voltage on the LF controls the delay of the VCDL. The controller switches the PFD-CP pairs and the edge combiner to vary the multiplication factor. The edge combiner uses the evenly spaced multi-phase signals to produce the multiplied output signal. The details of each block used in this architecture are discussed in the following sections.

Fig. 3.1 The project system architecture of DLL-based clock generator.

3.2 Startup Circuit

The startup circuit proposed in [7] is used to overcome the lock state issue and widen the DLL operating frequency range. The startup circuit is composed of two rising-edge-triggered DFFs, two NAND gates, and two inverters. It receives three input signals, STARTB, REF, and VCDL, and produces three output signals, SETUPB, OUT_REF, and OUT_VCDL. STARTB is the external signal that indicates when the system starts. REF is the external reference signal used in DLL operation. VCDL is the feedback signal from the voltage-controlled delay line. Initially, STARTB is set low to clear the outputs of the two DFFs. Therefore, SETUPB is low and activates the PMOS to pull the control voltage to VDD, as shown in Fig. 3.2. Because the VCDL delay time is inversely proportional to Vctrl, SETUPB initializes the VCDL delay to its minimum value.


Fig. 3.2 The architecture of startup circuit.

In the beginning, OUT_REF and OUT_VCDL are at the low level. When STARTB goes high, SETUPB follows it high. After two consecutive falling edges of VCDL trigger the DFFs, OUT_VCDL is activated and fed to the PFD to produce the down signal, which discharges the LF and increases the delay time of the delay stages. The delay increases until the DLL locks at a delay of one reference period. With the startup circuit attached, the delay stages work from the minimum delay up to a delay of one reference period, so the DLL does not fall into false lock even when 10TD,min < 0.5TREF. The startup circuit therefore also widens the operating frequency range of the DLL system. Fig. 3.3 shows the simulation result of the startup circuit; the waveforms of the signals are consistent with the description above.
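The lock sequence described above can be pictured with a very coarse behavioral sketch: the line delay starts at its minimum and only grows, so the first lock point it meets is one reference period rather than a multiple of it. All numbers (delay limits, step size) are hypothetical and in picoseconds; the real loop dynamics are set by the PFD, CP, and LF, not by a fixed step.

```python
def lock_from_min_delay(t_ref_ps: int, t_min_ps: int, t_max_ps: int, step_ps: int) -> int:
    """Grow the total line delay from its minimum until it first reaches t_ref_ps.

    This mimics the startup sequence only qualitatively: Vctrl starts at VDD
    (minimum delay) and the loop can only discharge the LF until lock.
    """
    delay = t_min_ps
    while delay < t_ref_ps and delay < t_max_ps:
        delay = min(delay + step_ps, t_max_ps)
    return delay


# Hypothetical line: 1.5 ns minimum delay, 10 ns maximum delay, 10 ps steps.
# The 10 ns maximum means a 2 x Tref delay (8.4 ns) is physically reachable,
# yet the monotonic search stops at 1 x Tref.
locked = lock_from_min_delay(t_ref_ps=4200, t_min_ps=1500, t_max_ps=10000, step_ps=10)
print(locked, "ps, i.e.", locked // 4200, "reference period(s)")
```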

Fig. 3.3 The simulation waveform of startup circuit.

3.3 PFD Circuit

The schematic of the PFD is shown in Fig. 3.4. Because of the PFD-CP pair switching function, the PFD must have an enable/disable function. This is accomplished by adding a NAND gate on the feedback path. When enable is high, a positive edge on either the REF signal or the VCDL signal triggers the corresponding UP or DN signal. The UP or DN signal is sent to the CP to produce the charge or discharge current to the LF. On the other hand, if the enable signal is low, the PFD does not produce any output signal, even if a positive edge of REF or VCDL arrives. The simulation waveform is shown in Fig. 3.5. Once both the UP and DN pulses are generated, they stay high simultaneously for a short time to reduce the dead zone of the PFD.
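A minimal behavioral sketch of the gated PFD described above is given here. It is an idealized state machine, not the transistor-level circuit of Fig. 3.4: the class name and the event-based interface are assumptions, and the short interval in which UP and DN overlap before the reset is represented only by the returned value.

```python
class GatedPFD:
    """Idealized PFD: REF edges raise UP, VCDL edges raise DN, both high resets."""

    def __init__(self) -> None:
        self.up = False
        self.dn = False

    def edge(self, source: str, enable: bool = True) -> tuple:
        """Apply one rising edge and return (UP, DN) as seen right after it.

        If both outputs are high they are reported once (the brief overlap
        that reduces the dead zone) and then cleared internally.
        """
        if not enable:
            # Disabled: outputs are held low and the edge is ignored.
            self.up = self.dn = False
            return (False, False)
        if source == "REF":
            self.up = True
        elif source == "VCDL":
            self.dn = True
        else:
            raise ValueError("source must be 'REF' or 'VCDL'")
        seen = (self.up, self.dn)
        if self.up and self.dn:
            self.up = self.dn = False  # reset through the feedback path
        return seen


pfd = GatedPFD()
print(pfd.edge("REF"))                 # REF leads: UP goes high
print(pfd.edge("VCDL"))                # VCDL arrives: both high, then reset
print(pfd.edge("REF", enable=False))   # disabled: no output
```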

Fig. 3.4 The schematic of PFD circuit.

Fig. 3.5 The simulation waveform of PFD circuit.

3.4 CP Circuit

The CP deposits charge onto or withdraws charge from the LF according to the phase difference determined by the PFD. This is accomplished by time-multiplexing the charge pump currents into or out of the LF. The schematic of the CP is shown in Fig. 3.6. The reference current is produced in the left path, and a current mirror copies it to the output node. All the switches use the same type of MOS transistor, PMOS. Because the project operates over a wide frequency range, the current-mode CP circuit is chosen for its suitability for high-speed operation.

The simulation combines the PFD and CP to obtain the characteristic plot. From Fig. 3.7, we can see that the PFD dead zone is less than 1ps.
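As a rough illustration of how the CP moves the control voltage, the sketch below applies the ideal relation ΔV = I·(t_up − t_dn)/C for a purely capacitive LF with matched currents. The current, capacitance, and pulse widths are hypothetical values chosen for the example, not design values from this project.

```python
def control_voltage_step(i_cp_a: float, t_up_s: float, t_dn_s: float, c_lf_f: float) -> float:
    """Ideal change of the LF voltage in one comparison cycle.

    Assumes equal UP and DN currents and a purely capacitive loop filter:
    a longer UP pulse charges the LF, a longer DN pulse discharges it.
    """
    return i_cp_a * (t_up_s - t_dn_s) / c_lf_f


# Hypothetical numbers: 100 uA pump current, 10 pF loop filter, 100 ps UP pulse.
dv = control_voltage_step(i_cp_a=100e-6, t_up_s=100e-12, t_dn_s=0.0, c_lf_f=10e-12)
print(f"{dv * 1e3:.3f} mV")
```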

Fig. 3.6 The schematic of CP circuit.

Fig. 3.7 The characteristic plot of PFD-CP pair (simulation).

3.5 Pulse Reshaper Circuit

Recall from Fig. 2.14 that a mismatch in the CP currents causes a system static phase error when the DLL is in the lock condition. This worsens the output clock signal performance. Unfortunately, when the DLL locks over a wide operating frequency range, the control voltage on the LF must vary over a wide voltage range. Therefore, the CP current mismatch is inevitable. In this project, we use the pulse reshaper circuit proposed in [14] to change the characteristic plot of the PFD-CP pair and reduce the system static phase error.

Fig. 3.8 shows the PFD with the pulse reshaper circuit. The pulse reshaper circuit is composed of two inverters and two AND gates. Tm denotes the masking window, whose delay is caused by the low slew rate of the inverter. The timing diagrams are illustrated in Fig. 3.9. While the DLL is still unlocked and the phase difference between REF and VCDL is larger than Tm, only one of the RUP and RDOWN signals is at logic high, as Fig. 3.9(c) shows. While the DLL is in the lock state, there is no phase difference between REF and VCDL, and both RUP and RDOWN are at logic high and go to logic low together. When the phase difference is less than Tm, the RUP or RDOWN pulse activated by the late clock has an increasing voltage value like a glitch, as in Fig. 3.9(b).

Fig. 3.8 The architecture of pulse reshaper circuit.

Fig. 3.9 The operation of pulse reshaper circuit: (a) REF−VCDL = 0, (b) REF−VCDL < Tm, (c) REF−VCDL > Tm.

The characteristic plot of the PD with a pulse reshaper attached shows the reduction of the system static phase error. The plot is shown in Fig. 3.10. Because the small high-gain range crosses the CP current mismatch line closer to the Y-axis, the static phase error is also reduced. This is because, when the phase difference between REF and VCDL is less than Tm, the lagging signal produces a glitch-like RUP/RDOWN signal, so the difference between the charge and discharge currents becomes larger. Therefore, when the slew of the inverter is almost linear, the voltage level of the reshaped pulse around the locking point (|REF−VCDL| < Tm) is inversely proportional to |REF−VCDL|, which increases the gain slope of the PD. The simulation result for the combination of PFD, pulse reshaper, and CP is shown in Fig. 3.11.
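The argument that a steeper characteristic crosses the mismatch line closer to the Y-axis can be made concrete with a piecewise-linear abstraction. This is a simplified model rather than the actual transfer curve of Fig. 3.10: the gain boost factor, the mismatch charge, and the pump current are hypothetical, and the real reshaped pulse is not exactly linear.

```python
def static_phase_error(q_mismatch_c: float, i_cp_a: float,
                       t_m_s: float = 0.0, boost: float = 1.0) -> float:
    """Phase error at which the PD-CP charge cancels a per-cycle mismatch charge.

    Model (an assumption for illustration): inside |dt| <= t_m_s the transfer
    slope is boost * i_cp_a, outside it is i_cp_a, and the curve is continuous.
    With t_m_s = 0 and boost = 1 this reduces to the plain PFD-CP case.
    """
    inner = q_mismatch_c / (i_cp_a * boost)
    if abs(inner) <= t_m_s:
        return inner  # crossing lies inside the high-gain window
    sign = 1.0 if q_mismatch_c > 0 else -1.0
    q_edge = i_cp_a * boost * t_m_s  # charge delivered at the window edge
    return sign * (t_m_s + (abs(q_mismatch_c) - q_edge) / i_cp_a)


# Hypothetical values: 100 uA pump, 10 fC mismatch charge per cycle.
plain = static_phase_error(10e-15, 100e-6)
reshaped = static_phase_error(10e-15, 100e-6, t_m_s=50e-12, boost=5.0)
print(f"without reshaper: {plain * 1e12:.0f} ps, with reshaper: {reshaped * 1e12:.0f} ps")
```

Under these assumed numbers the crossing point moves from about 100ps to about 20ps, which is the qualitative effect the characteristic plot describes.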

Fig. 3.10 The characteristic plot of PD with pulse reshaper circuit.

Fig. 3.11 The characteristic plot of the PFD, pulse reshaper, and CP (simulation).

3.6 Delay Cell

In this project, we use a current-starved inverter type delay cell to reduce the power consumption. The schematic is shown in Fig. 3.12. The delay cell comprises a single-ended inverter composed of M2 and M3, with series transistors M1 and M4 operating in the triode region. The delay time of the delay cell is determined by the equivalent resistance of M1 and M4, which is controlled by VC through the driving transistors M7 and M8. An additional inverter formed by M5 and M6 serves as an output buffer that compensates for the high-frequency attenuation introduced by the preceding delay core. Moreover, the circuit performs rail-to-rail operation, so it consumes no static power [16]. The number of delay stages determines the delay range of the delay line. According to our design goals, we choose an eight-stage delay line. We use SPICE simulation to obtain the delay range of the delay cell, which is shown in Fig. 3.13.
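For orientation, the per-stage delay required at lock follows from dividing the reference period by the number of stages in the loop. The sketch below evaluates this for the three reference periods simulated in Chapter 4, assuming all eight stages are inside the loop; the function name is illustrative.

```python
def stage_delay_ps(t_ref_ps: float, stages: int = 8) -> float:
    """Per-stage delay when the whole line spans exactly one reference period."""
    if stages < 1:
        raise ValueError("stages must be at least 1")
    return t_ref_ps / stages


for t_ref in (2500, 3000, 4200):  # reference periods in ps
    print(f"Tref = {t_ref} ps -> {stage_delay_ps(t_ref)} ps per stage")
```

Each delay cell therefore has to cover roughly 312.5ps to 525ps per stage for this range of reference periods.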

Fig. 3.12 The schematic of delay cell.

Fig. 3.13 The simulation delay range of delay cell.

3.7 Edge Combiner

For power management applications, the clock generator must provide many multiplication factors. From Chapter 2, we know that the multiplication factor is determined by the architecture of the edge combiner. The pulse-toggle edge combiner can provide N/2-scale multiplication, so it is suitable for this project.

The architecture of the edge combiner proposed in [17] is shown in Fig. 3.14. The Bi signals are the outputs of the delay line, and the Si signals are control signals that determine whether the ki signals are produced. When a Bi signal rises, one input of a NAND gate arrives earlier than the other, which comes after a three-inverter delay. Therefore, at the rising edge of the buffer output, the NAND gate generates a narrow negative pulse ki whose width corresponds to the three-inverter delay. The edge combiner uses a symmetric NAND gate and an inverter to form a symmetric AND gate. Three stages of symmetric AND gates combine all the ki signals to form the A signal. The A signal passes to the TPL circuit, which performs the divide-by-2 function and produces the multiplied output clock signal. Because of the pulse-toggle method, the edge combiner output has a 50% duty cycle. The ki signals are arranged in this pattern to increase the operating frequency of the edge combiner, because transition overlap of the ki signals may occur in the three stages of symmetric AND gates. Because we use an eight-stage delay line, the edge combiner can provide 1/2, 2/2, …, 8/2, a total of eight multiplication factors, as determined by the control signals. The simulation result is shown in Fig. 3.15.
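The pulse-toggle idea can be checked with an idealized timing sketch: if N evenly spaced pulses arrive per reference period and each pulse toggles the output, the output period is 2·Tref/N and the duty cycle is 50%. The sketch assumes ideal, evenly spaced pulses and an ideal toggle stage; it does not model the NAND pulse generators or the symmetric AND tree.

```python
def toggle_times_ps(t_ref_ps: float, n: int, periods: int = 4) -> list:
    """Times of the combined pulses: n evenly spaced pulses per reference period."""
    return [p * t_ref_ps + k * t_ref_ps / n for p in range(periods) for k in range(n)]


def output_period_and_duty(t_ref_ps: float, n: int) -> tuple:
    """Output period and duty cycle when every pulse toggles the output."""
    edges = toggle_times_ps(t_ref_ps, n)
    high_time = edges[1] - edges[0]   # output high between 1st and 2nd pulse
    period = edges[2] - edges[0]      # output rises again at the 3rd pulse
    return period, high_time / period


for n in (1, 2, 8):
    period, duty = output_period_and_duty(3000, n)
    print(f"N = {n}: period = {period} ps, duty = {duty:.0%}")
```

For a 3ns reference and N = 8 this gives a 750ps output period, that is, about 1.33GHz.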

Fig. 3.14 The architecture of edge combiner.

Fig. 3.15 The simulation result of edge combiner.

3.8 Controller

The system needs to control the switching of the multi-PFD-CP pairs and the edge combiner to determine the multiplication factor. The proposed control pattern is shown in Table 3.1. The pattern is chosen so that the controller can be built from the simplest logic gates. The number of 1s determines the value of N, which sets the multiplication factor. Because the delay line output at the last 1 in Si must be fed back to the PFD to be compared in phase with REF, the control signals of the multi-PFD-CP pairs can also be generated from this pattern.
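One way to read the control pattern is as a thermometer code, where the ones are contiguous from S1 and the position of the last 1 selects the feedback tap. The sketch below decodes such a pattern. The thermometer encoding, the one-hot PFD-CP enable vector, and all names are assumptions made for illustration; they are not taken from Table 3.1.

```python
def decode_control(s_bits: list) -> tuple:
    """Decode an assumed thermometer-coded S1..S8 pattern.

    Returns (N, feedback_tap, pfd_enable) where N is the number of ones,
    feedback_tap is the 1-based index of the last 1, and pfd_enable is a
    hypothetical one-hot vector selecting the PFD-CP pair at that tap.
    """
    n = sum(s_bits)
    if n == 0 or s_bits != [1] * n + [0] * (len(s_bits) - n):
        raise ValueError("expected contiguous ones starting at S1")
    pfd_enable = [1 if i == n - 1 else 0 for i in range(len(s_bits))]
    return n, n, pfd_enable


n, tap, enable = decode_control([1, 1, 1, 1, 0, 0, 0, 0])
print(f"N = {n} (multiplication factor {n}/2), feedback tap = {tap}, enable = {enable}")
```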

Table 3.1 Project control pattern.

INPUTS OUTPUTS

Each control signal function is shown below:


The simulation results are shown in Fig. 3.16.

Fig. 3.16 The simulation result of controller.

If we do not switch the last bit of the delay line stages, the feedback signal VCDL of the DLL does not change, so the DLL does not need to relock. The system can then change the multiplication factor in only one cycle time. The simulation result is shown in Fig. 3.17.

Fig. 3.17 The simulation result of the last bit unchanged case.

Chapter 4

System Simulation Results and Measurement Results

Each block of the project and its simulation results were described in the previous chapter. In this chapter, we first present the simulation results of the DLL-based clock generator, followed by the measurement settings and the measurement results of the project.

4.1 System Simulation Results

In this project, the whole system simulation of DLL-based clock generator includes lock transient simulation, pulse reshaper effect on the static phase error, and the output signal transient waveform and jitter performance.

DLL Lock Transient Simulation

To generate the output clock signal, the DLL loop must first lock in the correct lock condition, so that the output phases of the delay line are evenly spaced over one reference period. Then, according to the control signal, the edge combiner combines the multi-phase signals to produce the multiplied output signal. This project uses the startup circuit to set the initial conditions of the system: Vctrl starts from VDD (delay line at the minimum delay state) and falls to the voltage at which the delay time of the delay line is exactly one reference period. Fig. 4.1 shows the DLL locking at REF periods of 2.5ns, 3ns, and 4.2ns. For all three REF periods, the DLL loop locks.

(a) REF_2.5ns

(b) REF_3ns

(c) REF_4.2ns

Fig. 4.1 DLL lock in REF period 2.5ns, 3ns, and 4.2ns.

Pulse Reshaper Effect on the Static Phase Error

Fig. 4.2 shows the static phase error of the locked DLL for the PFD with and without the pulse reshaper circuit. Table 4.1 lists the static phase error in both cases.

(a) Without pulse reshaper

(b) With pulse reshaper

Fig. 4.2 Static phase error with and without pulse reshaper.

Table 4.1 Static phase error with and without pulse reshaper.

Tref Without pulse reshaper With pulse reshaper

2.5ns 183ps 23.8ps

3ns 74.3ps 6.4ps

4.2ns 3.2ps 1.6ps
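The improvement in Table 4.1 can be summarized as a reduction factor per reference period. The short sketch below only re-computes ratios from the table values; the variable names are illustrative.

```python
# Static phase error in ps from Table 4.1: Tref (ns) -> (without, with) reshaper.
table_4_1 = {2.5: (183.0, 23.8), 3.0: (74.3, 6.4), 4.2: (3.2, 1.6)}


def reduction_factor(without_ps: float, with_ps: float) -> float:
    """How many times smaller the static phase error is with the reshaper."""
    return without_ps / with_ps


factors = {t_ref: round(reduction_factor(*errs), 1) for t_ref, errs in table_4_1.items()}
for t_ref, factor in factors.items():
    print(f"Tref = {t_ref} ns: reduced by a factor of about {factor}")
```

The reduction is largest at the shorter reference periods, where the error without the reshaper is largest.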

Output Signal Transient Waveform

Fig. 4.3 shows the output waveform of the clock generator. The clock signal has a duty cycle of almost 50%, consistent with the earlier description of the edge combiner.

(a) REF_2.5ns, multiplied by 4, Fout = 1.6GHz

(b) REF_3ns, multiplied by 4, Fout = 1.33GHz

(c) REF_4.2ns, multiplied by 4, Fout = 952MHz

Fig. 4.3 Transient waveform of the output signal.

Output Signal Jitter Performance

Fig. 4.4 shows the eye diagram of the clock generator’s output signal, from which the jitter performance can be seen. Table 4.2 lists the jitter of the output signal.

(a) REF_2.5ns, multiplied by 4, Fout = 1.6GHz

(b) REF_3ns, multiplied by 4, Fout = 1.33GHz

(c) REF_4.2ns, multiplied by 4, Fout = 952MHz

Fig. 4.4 Jitter performance of the output signal.

Table 4.2 Jitter performance of output signal.

Tref Fout Jitter

2.5ns 1.6GHz 22ps

3ns 1.33GHz 23ps

4.2ns 952MHz 42ps
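To compare the three operating points on an equal footing, the jitter in Table 4.2 can be expressed as a percentage of the output period. The sketch below only re-computes this ratio from the table values; expressing jitter this way is a convenience for the reader, not a metric used in the original tables.

```python
# (Fout in Hz, jitter in ps) taken from Table 4.2.
table_4_2 = [(1.6e9, 22.0), (1.33e9, 23.0), (952e6, 42.0)]


def jitter_percent_of_period(f_out_hz: float, jitter_ps: float) -> float:
    """Jitter as a percentage of one output clock period."""
    period_ps = 1e12 / f_out_hz
    return 100.0 * jitter_ps / period_ps


for f_out, jitter in table_4_2:
    pct = jitter_percent_of_period(f_out, jitter)
    print(f"Fout = {f_out / 1e6:.0f} MHz: {jitter} ps is {pct:.1f}% of the period")
```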

Post-Layout Simulation

After the pre-layout simulation confirmed the function of the project, we drew the layout and taped out the chip. The post-layout simulation is shown in Fig. 4.5; it includes the lock transient, static phase error, output waveform, and output eye diagram.

(a) REF_3ns Vctrl transient.

(b) REF_3ns static phase error.

(c) REF_3ns, multiplied by 4, Fout = 1.33GHz

(d) REF_3ns, multiplied by 4, Fout = 1.33GHz

Fig. 4.5 Post-layout simulation of the DLL-based clock generator.

Layout and Performance Summary

The layout of the designed DLL-based clock generator is shown in Fig. 4.6. The whole chip occupies an area of 0.65×0.76 mm². The performance summary of the project is listed in Table 4.3. The jitter of the output signal is worse than in the pre-layout simulation because the load matching between delay stages was not handled carefully.

In terms of static phase error, the post-layout simulation still shows good performance.

Table 4.3 Performance summary.

Pre-sim Post-sim

Operating Frequency Range 200MHz~450MHz 200MHz~400MHz

Output Frequency Range 100MHz~1.8GHz 100MHz~1.6GHz

Static Phase Error

Power Dissipation 25.88mW @ 1.33GHz 26.36mW @ 1.33GHz

Layout Area N/A 0.65×0.76 mm²

Fig. 4.6 Layout of the DLL-based clock generator.

4.2 Measurement Settings

The clock generator receives the reference signal and generates the multiplied output signal. The jitter performance of the output signal is measured with the oscilloscope shown in Fig. 4.7(a). In a DLL-based clock generator, the reference pulse signal is critical; it is generated by the pulse generator shown in Fig. 4.7(b).

Fig. 4.7 Photographs of (a) the oscilloscope TDS7704B and (b) the pulse generator Anritsu MP1763C.

The prototype PCB is shown in Fig. 4.8. The chip is measured in a chip-on-PCB assembly, and the measurement environment is set up as in Fig. 4.9. The control signals are produced by switches on the PCB. An on-chip open-drain buffer delivers the output signal through the bias tee to the oscilloscope.

Fig. 4.8 Prototype PCB.

Fig. 4.9 Measurement setup.

4.3 Measurement Results

The wide-range, programmable DLL-based clock generator has been fabricated in a 0.18-μm CMOS technology. Fig. 4.10 is a photograph of the die, whose area is 0.65mm by 0.76mm. The measurement results are presented in the following.

Fig. 4.10 The photograph of the die.

The measured operating frequency range is from 200MHz to 400MHz; the results are shown in Fig. 4.11 to Fig. 4.13. The measurement results show that the designed clock generator can generate three multiplication factors: 1/2, 1, and 4. When operating in the multiply-by-4 mode, the maximum output frequency of the device is 1.2GHz, the peak-to-peak jitter is 128ps, and the power consumption is 63mW (total power). The summary of measurement results is shown in Table 4.4.

The other five multiplication factors cannot be produced. This is caused by a detection failure in the PFD. The PFD detects the rising edges of the reference signal and the feedback signal, but the enable function as designed only gates the feedback path of the PFD to disable its operation. Therefore, when the PFD is enabled after the rising edge of the reference signal or the feedback signal, one up or down pulse is missed. This makes the DLL loop fall into a false lock and produce an unexpected output signal.
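The missed-pulse effect can be illustrated with an idealized timing sketch. It assumes an edge-triggered PFD that simply ignores any edge arriving before the enable instant, and periodic REF and VCDL edges with the same period; the function name and all time values (in ps) are hypothetical.

```python
def first_pulse_after_enable(ref_edge_ps: int, vcdl_edge_ps: int,
                             enable_ps: int, t_ref_ps: int) -> tuple:
    """Return (signal, width_ps) of the first PFD pulse seen after enabling.

    Edges repeat every t_ref_ps. An edge that occurs before enable_ps is
    missed, so the PFD pairs the next edge it sees with the other input.
    """
    def next_edge(first: int) -> int:
        if first >= enable_ps:
            return first
        missed = -(-(enable_ps - first) // t_ref_ps)  # ceiling division
        return first + missed * t_ref_ps

    ref, vcdl = next_edge(ref_edge_ps), next_edge(vcdl_edge_ps)
    if ref <= vcdl:
        return ("UP", vcdl - ref)
    return ("DN", ref - vcdl)


# VCDL lags REF by 200 ps with a 3 ns reference period.
print(first_pulse_after_enable(0, 200, enable_ps=-100, t_ref_ps=3000))  # enabled in time
print(first_pulse_after_enable(0, 200, enable_ps=100, t_ref_ps=3000))   # REF edge missed
```

In the second call the REF edge falls before the enable instant, so the PFD reports a long DN pulse instead of a short UP pulse, which drives the loop in the wrong direction.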

Fig. 4.11 Waveform of the output signal, REF = 200MHz (a) multiplied by 1/2 and (b) multiplied by 1

Fig. 4.12 Waveform of the output signal, REF = 400MHz (a) multiplied by 1/2 and (b) multiplied by 1

Fig. 4.13 Waveform of the output signal, REF = 300MHz (a) multiplied by 1/2, (b) multiplied by 1 and (c) multiplied by 4

Table 4.4 Measurement summary.

Post-sim Measurement

Operating frequency range 200MHz ~ 400MHz 200MHz ~ 400MHz

Output frequency range 100MHz ~ 1.6GHz 100MHz ~ 1.2GHz

Peak-to-peak jitter

Peak-to-peak jitter
