Thesis Overview - A 10 Gb/s Clock and Data Recovery Circuit Using an Improved Current-Mode Logic Latch

Chapter 1 Introduction

1.5 Thesis Overview

This thesis comprises five chapters, of which this introduction is the first. Chapter 2 describes the basics of the simple CDR circuit and discusses the trade-offs among several design parameters of the system. Chapter 3 discusses the MOS current-mode logic (MCML) latch design for high-speed operation and the improved MCML, which reduces the output voltage fluctuation; it also details the transistor-level design of the simple CDR. Chapter 4 presents the VLSI implementation and several physical design strategies used to minimize noise coupling and facilitate testing. Finally, Chapter 5 summarizes the simple CDR circuit studied in this thesis.

Chapter 2 Clock and Data Recovery Architectures

The NRZ data stream received and amplified by an optical receiver suffers from both inter-symbol interference (ISI) and noise. For subsequent processing, timing information, e.g., a clock, must be extracted from the data so as to allow synchronous operations. Furthermore, the data must be retimed such that the jitter accumulated during transmission is removed. The task of clock extraction and data retiming is called “clock and data recovery” (CDR).

2.1 Principles of Operation

This chapter discusses the design issues related to the simple CDR architecture.

A common technique for designing an integrated CDR is to use a phase-locked loop (PLL) to generate the frequency of the received NRZ data while compensating for process and temperature variations[5][6]. PLL-based CDRs can be divided into two groups according to the type of phase detector (PD): the linear (proportional) phase detector[7] and the Bang-Bang phase detector. Bang-Bang CDR architectures have recently found wide use in high-speed applications. The most common Bang-Bang CDR is based on the Alexander phase detector[8]. In this work, the simple CDR contains the following major building blocks.

(1) Phase detector: A two-level output circuit that senses the phase difference between the input data and the recovered clock only on data transitions.

(2) Voltage-to-Current (V/I) converter: It converts the phase detector's digital output voltage into a current signal.

(3) Loop filter: It suppresses the high-frequency components of the PD output and presents the dc level to the oscillator.

(4) Voltage-controlled oscillator (VCO): A local clock generator that is aligned to the incoming NRZ data. The recovered clock from the VCO is used to sample the incoming NRZ data.

Figure 2-1 shows the block diagram of the proposed simple CDR[1][9]. First, the Bang-Bang phase detector compares the incoming NRZ data with the recovered clock.

Second, the V/I converter senses the PD output voltage and generates a current that charges or discharges the loop filter. The loop filter in turn provides the control voltage that steers the voltage-controlled oscillator to lock to the frequency of the incoming data.

Finally, once the loop is locked, the PD inherently retimes the data.

Figure 2-1 A simple CDR block diagram
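The locking behavior described above can be illustrated with a very small behavioral sketch. It is not the circuit of this thesis: it assumes a first-order loop in which every data transition moves the clock phase by a fixed step against the sign of the phase error, and it ignores the loop filter and VCO dynamics. The function names and the numbers (0.30 UI initial error, 0.02 UI step) are hypothetical.

```python
def sign(x):
    """Two-level (bang-bang) decision: +1, -1, or 0 when exactly aligned."""
    return (x > 0) - (x < 0)


def simulate_bang_bang_loop(initial_error, step, n_transitions):
    """Track the phase error (in UI) over successive data transitions.

    At each transition the PD reports only the sign of the error, and the
    loop shifts the clock phase by a fixed `step` in the opposite direction.
    """
    error = initial_error
    history = [error]
    for _ in range(n_transitions):
        error -= step * sign(error)
        history.append(error)
    return history


# Hypothetical numbers: 0.30 UI initial error, 0.02 UI correction per transition.
history = simulate_bang_bang_loop(initial_error=0.30, step=0.02, n_transitions=40)
print("final phase error (UI):", history[-1])
```

In this simplified model the error walks toward zero at a constant rate and then dithers within one step of the lock point, which is the characteristic limit-cycle behavior of a two-level phase detector.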

2.2 CDR Fundamentals

Generally, the task of a CDR architecture is to recover the phase and frequency information from the input by extracting the clock from the data transitions and to retime the input data stream. How can the simple CDR circuit provide these functions? In subsections 2.2.1 to 2.2.4, we discuss the simple CDR building blocks in detail.

2.2.1 Bang-Bang PD

The phase detector is important in determining the purity of the clock and data recovered from the received NRZ data. It must be able to handle random NRZ data and recover the clock associated with the data stream. Either a linear phase detector or a digital Bang-Bang phase detector is usually used.

A linear phase detector suffers from nonlinearity under non-uniform data patterns. In addition, it is difficult to design and is highly sensitive to mismatch. The Bang-Bang phase detector is less sensitive to data patterns. It is also simpler to design and provides better phase adjustment at high speed, at the cost of higher jitter.

In this work, we use the Alexander phase detector. The Alexander PD uses three data samples, S1 to S3, taken on three consecutive clock edges, to detect whether a data transition is present and whether the clock leads or lags the data.

Figure 2-2 illustrates the Alexander PD principle, and Figure 2-3 shows the circuit topology. The Alexander PD consists of four D flip-flops and two XOR gates. Flip-flop FF1 samples the incoming NRZ data stream on the rising edge of CLK, and flip-flop FF2 delays the result by one clock cycle. Flip-flop FF3 samples the incoming NRZ data stream on the falling edge of CLK, and flip-flop FF4 delays this sample by half a clock cycle.

Figure 2-2 The Alexander PD principle (panel shown: clock lead)

Figure 2-3 The Alexander phase detector

The Alexander PD uses these three consecutive samples, S1 to S3, in one data period to determine whether the clock leads or lags the data. If the clock leads, the last sample, S3, is unequal to the first two. Conversely, if the clock lags, the first sample, S1, is unequal to the last two. Under these conditions, S1 ⊕ S2 and S2 ⊕ S3 provide the clock lead-lag information:

(a) If S1 ⊕ S2 is low and S2 ⊕ S3 is high, the clock leads the input data.

(b) If S1 ⊕ S2 is high and S2 ⊕ S3 is low, the clock lags the input data.

(c) If S1 ⊕ S2 and S2 ⊕ S3 are equal, there is no data transition.

The samples S1 to S3 and the corresponding clock phase relative to the input data are shown in Table 2-1.

S1  S2  S3  Phase relationship
0   0   0   No transition
1   0   0   Clock lag
1   1   0   Clock lead
1   1   1   No transition
0   1   1   Clock lag
0   0   1   Clock lead

Table 2-1 Samples and the clock phase condition
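The decision rules (a) to (c) and Table 2-1 can be written as a short sketch. The function name and the local names `lag` and `lead` for the two XOR outputs are labels chosen here for illustration; the mapping itself follows the rules stated in the text.

```python
def alexander_decision(s1, s2, s3):
    """Classify three consecutive samples (each 0 or 1) per rules (a)-(c)."""
    lag = s1 ^ s2   # S1 xor S2: high when the first sample differs
    lead = s2 ^ s3  # S2 xor S3: high when the last sample differs
    if lead and not lag:
        return "Clock lead"
    if lag and not lead:
        return "Clock lag"
    # Both XOR outputs equal: rule (c), treated as no data transition.
    return "No transition"


# Reproduce the rows of Table 2-1.
for samples in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1), (0, 0, 1)]:
    print(samples, alexander_decision(*samples))
```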

Let us examine the waveforms at various points in the Alexander PD to gain more insight into its operation. As illustrated in Figure 2-4, the first rising edge of CLK makes FF1 sample a high data level and FF2 sample a low level, which is the prior output of FF1. On the falling edge of CLK, FF3 samples a low level on the input data. The second rising edge of CLK then accomplishes three tasks: it makes FF1 sample a low level on the input data, it produces a delayed version of the first sample at the output of FF2, and it makes FF4 reproduce the FF3 output. The values of S1, S2, and S3 are therefore valid for comparison at t = T1 and remain constant for one clock period. As a result, the XOR gates generate valid outputs simultaneously and determine the phase difference between the clock and the input data. If there is no data transition, the values of S1, S2, and S3 are the same and no action is taken.

Figure 2-4 The waveforms of Alexander PD when clock lags

Figure 2-5 Bang-Bang PD characteristic (V_out versus phase difference ∆θ)

From the discussion above, the PD's behavior is shown in Figure 2-5, where ∆θ is the phase difference between Din and the clock signal. If the clock lags (∆θ < 0), the average PD output, (lead - lag)_avg, is a large negative value. Conversely, if the clock leads (∆θ > 0), (lead - lag)_avg is a large positive value. The Alexander PD exhibits a very high gain in the vicinity of ∆θ = 0, so the CDR loop locks such that S2 coincides with the data zero crossings and S1 falls in the center of the data eye. S1 can therefore also serve as the retimed data.

2.2.2 Voltage-to-Current Converter

The Alexander PD outputs drive the voltage-to-current converter. The two output signals (lead and lag) are averaged in the current domain, and the result is applied to the loop filter. Because the high gain of the Alexander PD yields a small phase offset in the locked condition, this simple CDR circuit need not incorporate a charge pump.

In the absence of data transitions, the V/I converter generates a zero dc output, leaving the oscillator control undisturbed. As a result, for long data runs, the VCO frequency drifts only because of device electronic noise rather than because of a high or low level on the control line.

2.2.3 Loop Filter

The loop filter sits between the V/I converter and the voltage-controlled oscillator. The transfer function of the loop filter has a large influence on the properties of the CDR loop. Figure 2-6 shows the second-order low-pass filter[10][11]. It consists of a resistor Rp in series with a capacitor Cp, with a capacitor Cs in parallel with this series branch.

The capacitor Cs adds a higher-frequency pole that reduces the ripple on the VCO control line. The loop filter provides a pole at the origin, which gives infinite DC gain and hence zero static phase error, and a zero, which improves the phase margin and ensures the closed-loop stability of the CDR loop.
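As a numerical illustration of this network, the sketch below evaluates the impedance of Rp in series with Cp, shunted by Cs, and the resulting zero and pole frequencies. It assumes the filter is driven by an ideal current source; the component values are hypothetical and are not taken from this design.

```python
def loop_filter_impedance(omega, r_p, c_p, c_s):
    """Impedance at angular frequency omega of (Rp + 1/sCp) in parallel with 1/sCs."""
    s = 1j * omega
    z_series = r_p + 1.0 / (s * c_p)   # Rp in series with Cp
    z_shunt = 1.0 / (s * c_s)          # Cs across the series branch
    return z_series * z_shunt / (z_series + z_shunt)


def zero_and_pole(r_p, c_p, c_s):
    """Return (omega_z, omega_p) in rad/s for the network above."""
    omega_z = 1.0 / (r_p * c_p)
    omega_p = (c_p + c_s) / (r_p * c_p * c_s)
    return omega_z, omega_p


# Hypothetical values for illustration only.
R_P, C_P, C_S = 1e3, 100e-12, 5e-12
omega_z, omega_p = zero_and_pole(R_P, C_P, C_S)
print("zero (rad/s):", omega_z)
print("pole (rad/s):", omega_p)
print("|Z| at the zero frequency (ohm):", abs(loop_filter_impedance(omega_z, R_P, C_P, C_S)))
```

With Cp much larger than Cs the pole sits well above the zero, which is what leaves room for the zero to add phase margin before the pole takes it away.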

Figure 2-6 A second-order loop filter

The total transfer function of the loop filter is
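For reference, the impedance of the network in Figure 2-6 can be worked out from its description (Rp in series with Cp, with Cs across that branch). This is a sketch that assumes the V/I converter drives the filter as an ideal current source. Grouping the constant factor as K_h is an assumption made here so that the result lines up with the K_h that appears in the loop gain K = K_d·K_h·K_VCO of Section 2.3.

```latex
F(s) = \frac{V_c(s)}{I(s)}
     = \frac{1}{sC_s} \,\Big\|\, \left(R_p + \frac{1}{sC_p}\right)
     = \frac{1 + sR_pC_p}{s\,(C_p + C_s)\left(1 + sR_p\dfrac{C_pC_s}{C_p + C_s}\right)}
     = K_h\,\frac{s + \omega_z}{s\left(1 + \dfrac{s}{\omega_p}\right)},
\qquad
K_h = \frac{R_pC_p}{C_p + C_s},\quad
\omega_z = \frac{1}{R_pC_p},\quad
\omega_p = \frac{C_p + C_s}{R_pC_pC_s}
```

The pole at the origin and the zero ω_z are the two features the text attributes to this filter, and ω_p is the higher pole contributed by Cs.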

2.2.4 Voltage-Controlled Oscillator

The voltage-controlled oscillator generates an output waveform whose frequency is set by the control voltage, as shown in Figure 2-7(a). Figure 2-7(b) shows the VCO characteristic: the VCO frequency ω0 is a linear function of the control voltage Vc. The curve need not be linear, but a constant slope usually simplifies the design. The slope Kvco is the gain of the VCO. Gain and linearity are the most important properties for CDR systems. Some VCO specifications are introduced below[1][9][10]:

Tuning range: The tunable frequency range of the VCO must cover the entire frequency range required by the target application.

Tuning linearity: An ideal VCO has a constant gain, Kvco, over the required tuning range.

Power supply sensitivity: In SoC designs, the switching noise induced by digital circuits couples to the VDD of the VCO and disturbs its output waveform. This sensitivity must be as low as possible to reduce the VCO output jitter.

Phase stability: The ideal spectrum of the VCO output looks like a Dirac impulse. That is, the phase noise of the VCO must be as low as possible.

Figure 2-7 Illustration of the VCO: (a) model of the oscillator, (b) characteristic
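The linear VCO characteristic can be sketched as follows. The model assumes an ideal, perfectly linear VCO; the name `omega_fr` for the free-running frequency (the output frequency at Vc = 0) and the numeric values are hypothetical.

```python
import math


def vco_frequency(v_c, omega_fr, k_vco):
    """Output angular frequency of an ideal linear VCO: omega_fr + Kvco * Vc."""
    return omega_fr + k_vco * v_c


def vco_phase(v_c_samples, dt, omega_fr, k_vco):
    """Accumulated output phase (rad): the VCO integrates frequency over time."""
    phase = 0.0
    for v_c in v_c_samples:
        phase += vco_frequency(v_c, omega_fr, k_vco) * dt
    return phase


# Hypothetical values: 10 GHz free-running frequency, 500 MHz/V gain.
OMEGA_FR = 2 * math.pi * 10e9
K_VCO = 2 * math.pi * 500e6
print("frequency at Vc = 0.1 V (GHz):",
      vco_frequency(0.1, OMEGA_FR, K_VCO) / (2 * math.pi * 1e9))
```

Because phase is the integral of frequency, the VCO contributes the factor Kvco/s to the loop, which is why the oscillator appears as an integrator in the loop analysis.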

2.3 Analysis of Loop Performance

With the Bang-Bang PD characteristic, the clock falling edge must sample the zero-crossing points of the input NRZ data. Even for a slight phase error, the PD generates a large output, driving the loop toward lock. We now consider a more realistic Bang-Bang characteristic, in which the gain in the vicinity of ∆θ = 0 is finite. The finite slope arises from the metastability of the D flip-flops[1]: if a D flip-flop samples at the zero-crossing point of the input data, its output may not reach the full logic level within one bit period. When the Alexander PD locks, ∆θ approaches zero and the second sample, S2, falls in the vicinity of the data zero crossing, thereby driving FF3 and FF4 into metastability. In metastability, the XOR gates produce small differential outputs, yielding a small average output for the overall PD. Thus, the CDR loop can lock such that the XOR gates experience metastable inputs most of the time. For phase differences small enough to produce an unsaturated output, the PD characteristic is approximately linear[1], so we follow the analysis of a linear PLL-based CDR. The approximate model of the simple CDR with an Alexander Bang-Bang phase detector is shown in Figure 2-8, where Kd is the gain of the phase detector, Kvco is the gain of the VCO, and F(s) is the transfer function of the loop filter.

Figure 2-8 Model of the CDR

We first consider the open-loop response to obtain insight into the design of the CDR circuit. This response can be derived by breaking the loop at the feedback input of the phase detector. The output phase, θ_o(s), is related to the input phase, θ_i(s), by

$$\theta_o(s) = K_d \, F(s) \, \frac{K_{VCO}}{s} \, \theta_i(s)$$

The open-loop response, H(s), is then given by

$$H(s) = \frac{\theta_o(s)}{\theta_i(s)} = \frac{K_d \, K_{VCO} \, F(s)}{s} \qquad (2.3)$$

Using the loop filter of Figure 2-6, Eq. (2.3) becomes

$$H(s) = \frac{K\,(s + \omega_z)}{s^{2}\left(1 + \dfrac{s}{\omega_p}\right)} \qquad (2.4)$$

where K = K_d ⋅ K_h ⋅ K_VCO is the loop bandwidth of the CDR.

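Figure 2-9 shows the Bode plot of the transfer function. The phase of H(s) is -180° at ω = 0; the zero ωz introduces a phase shift of +90°, and the pole ωp introduces a phase shift of -90°. The phase margin can be described as follows:

$$PM = \tan^{-1}\!\left(\frac{\omega_c}{\omega_z}\right) - \tan^{-1}\!\left(\frac{\omega_c}{\omega_p}\right)$$

where ω_c denotes the unity-gain (crossover) frequency of H(s).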
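The phase margin can also be evaluated numerically. The sketch below assumes an open-loop response of the form H(s) = K(s + ω_z) / (s²(1 + s/ω_p)), which matches the phase behavior described above (-180° at ω = 0, +90° from the zero, -90° from the pole). The function names and the example values of K, ω_z, and ω_p are hypothetical.

```python
import math


def open_loop(omega, k, omega_z, omega_p):
    """Assumed open-loop response H(j*omega) with two poles at the origin."""
    s = 1j * omega
    return k * (s + omega_z) / (s ** 2 * (1 + s / omega_p))


def crossover_frequency(k, omega_z, omega_p):
    """Find omega_c where |H| = 1 by bisection on a logarithmic scale.

    |H| falls monotonically with frequency for this form, so a single
    crossing exists; the bracket is assumed wide enough to contain it.
    """
    lo, hi = omega_z * 1e-6, omega_p * 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if abs(open_loop(mid, k, omega_z, omega_p)) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def phase_margin_deg(k, omega_z, omega_p):
    """Phase margin in degrees: atan(wc/wz) - atan(wc/wp)."""
    omega_c = crossover_frequency(k, omega_z, omega_p)
    return math.degrees(math.atan(omega_c / omega_z) - math.atan(omega_c / omega_p))


# Hypothetical example: zero one decade below K, pole one decade above K.
print("phase margin (deg):", phase_margin_deg(1e7, 1e6, 1e8))
```

Placing the crossover between a low zero and a high pole keeps the first arctangent near 90° and the second near 0°, which is the situation the design aims for.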

Another way to approximate this parameter is to ignore the shunt capacitor Cs. Since Cp >> Cs, the zero, ω_z, lies far below the pole, ω_p. Hence, Eq. (2.4) can be rewritten as

$$H(s) \approx \frac{K\,(s + \omega_z)}{s^{2}}$$

For the CDR to be stable, the following condition should hold:

Considering the examples quoted above, this design guarantees a sufficient phase margin for the loop[10][12].

2.3.1 Approximate Frequency Response with Loop Filter

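In contrast to the approximate analysis above, another popular method of analyzing a CDR uses the closed-loop transfer function, written in Eq. (2.8), with the loop filter transfer function approximated as

$$F(s) \approx K_h\,\frac{s + \omega_z}{s}$$

$$\frac{\theta_o(s)}{\theta_i(s)} = \frac{2\xi\omega_n s + \omega_n^{2}}{s^{2} + 2\xi\omega_n s + \omega_n^{2}} \qquad (2.8)$$

where θ_i(s) is the input phase, and the damping factor ξ and natural frequency ω_n are

$$\xi = \frac{1}{2}\sqrt{\frac{K}{\omega_z}}$$

$$\omega_n = \sqrt{K \cdot \omega_z} \qquad (2.11)$$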

The damping factor and natural frequency characterize the closed-loop response. The closed-loop frequency response of the CDR for different values of the damping factor, normalized to the natural frequency, is shown in Figure 2-10. The figure shows that the CDR acts as a low-pass filter for phase noise, passing components at frequencies below ω_n. For small values of ξ, the curve is sharper than for large values of ξ. In the CDR design, the loop is designed to be overdamped (ξ > 1) to avoid jitter peaking. This also helps increase the phase margin of the open-loop transfer function.

Figure 2-10 The closed-loop frequency response of the CDR (magnitude in dB versus frequency in Hz)
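The effect of the damping factor on the closed-loop response can be explored with a short sketch. It assumes the standard second-order form with one zero, (2ξω_n·s + ω_n²) / (s² + 2ξω_n·s + ω_n²), which is what a loop with two integrations and one zero reduces to once Cs is ignored. The function names and the scan range are choices made here for illustration.

```python
import math


def closed_loop_gain(omega, zeta, omega_n):
    """Magnitude of the assumed second-order closed-loop response at omega."""
    s = 1j * omega
    num = 2 * zeta * omega_n * s + omega_n ** 2
    den = s ** 2 + 2 * zeta * omega_n * s + omega_n ** 2
    return abs(num / den)


def peaking_db(zeta, omega_n=1.0, points=2000):
    """Largest gain in dB over a log-spaced scan from 0.01*wn to 100*wn."""
    peak = max(
        closed_loop_gain(omega_n * 10 ** (-2 + 4 * i / points), zeta, omega_n)
        for i in range(points + 1)
    )
    return 20 * math.log10(peak)


for zeta in (0.3, 1.0, 3.0):
    print("damping factor", zeta, "-> peaking (dB):", peaking_db(zeta))
```

In this model the peak above 0 dB shrinks as the damping factor grows, which is the motivation for choosing an overdamped loop.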

Figure 2-11 shows the transient step response of the CDR for different values of the damping factor, with time normalized to 1/ω_n. The step response is generated by instantaneously advancing the phase of the input by one radian and observing the output for different damping levels in the time domain. For a damping factor larger than one, i.e., when the system is overdamped, the CDR output initially responds rapidly but takes a long time to reach the steady state. We can find that the rate of the initial

CDR design.

Figure 2-11 The closed-loop transient step response of a CDR (axes: S(t), Time (10 us))
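A phase-step experiment of this kind can be reproduced with a small time-domain sketch. It assumes the same second-order form with one zero, normalizes time to 1/ω_n, and integrates the state equations with a simple forward-Euler step. The function names and step size are hypothetical choices.

```python
def step_response(zeta, t_end=20.0, dt=1e-3):
    """Unit-step response of (2*zeta*s + 1) / (s^2 + 2*zeta*s + 1), wn = 1.

    State form: x1' = x2, x2' = u - x1 - 2*zeta*x2, output y = x1 + 2*zeta*x2.
    """
    x1 = x2 = 0.0
    response = []
    for _ in range(int(t_end / dt)):
        dx1 = x2
        dx2 = 1.0 - x1 - 2.0 * zeta * x2   # input u = 1 (one-radian phase step)
        x1 += dx1 * dt
        x2 += dx2 * dt
        response.append(x1 + 2.0 * zeta * x2)
    return response


def time_to_reach(response, level, dt=1e-3):
    """First normalized time at which the response reaches `level`, else None."""
    for i, y in enumerate(response):
        if y >= level:
            return (i + 1) * dt
    return None


for zeta in (0.5, 2.0):
    y = step_response(zeta, t_end=40.0)
    print("damping factor", zeta, "peak:", max(y), "time to 0.9:", time_to_reach(y, 0.9))
```

In this model a larger damping factor gives a faster initial rise and a smaller overshoot, followed by a slow tail governed by the low-frequency pole.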

Chapter 3 10 Gb/s Clock and Data Recovery Circuit Design

3.1 Introduction

This chapter discusses the circuit design in more detail at the transistor level. The design method can be applied to a CDR with an input data rate of 10 Gb/s. The circuit is designed in the TSMC 0.18 µm 1P6M CMOS process, and the CDR is simulated with HSPICE to obtain its detailed electrical behavior.

Moreover, process variations and temperature effects must be taken into account. The CDR must be simulated at high temperature and in the slow and fast process corners, in addition to the normal temperature and typical process. After that, post-layout simulation including the parasitic resistances and capacitances of the circuit must be performed.

3.2 Circuit Description

For the CDR circuit to handle high-frequency signals, it must have a fast switching speed. ICs operating at speeds greater than 10 Gb/s usually use GaAs MESFETs, GaAs HBTs, or Si BiCMOS transistors. The power consumption of these processes, however, is relatively large because their supply voltage is high and their

and highly integrated for low cost. CMOS transistors have the advantages of low power consumption and low cost, but they have still rarely been used in high-speed systems because their operation speed is too low.

3.2.1 High speed MCML Latch

Generally, a conventional CMOS inverter exhibits some drawbacks that prevent it from being widely used in high-speed, low-voltage circuits. First, a CMOS inverter is essentially a single-ended circuit. In the multi-gigahertz frequency range, short on-chip wires act as coupled transmission lines, and the electromagnetic coupling causes serious malfunctions, particularly in single-ended circuits. Second, the pMOS transistor in a static CMOS inverter severely limits the maximum operating frequency of the circuit. For the circuit to operate correctly in the 10 GHz range, we use MOS current-mode logic (MCML) in place of conventional CMOS logic. MCML circuits can operate with a lower signal voltage and a higher operating frequency at a lower supply voltage than static CMOS circuits. MCML has been used extensively to implement ultrahigh-speed buffers [13], [14], latches [14], multiplexers and demultiplexers [15], and frequency dividers [16].

Figure 3-1 shows the inverters of CMOS logic and conventional MCML. CMOS logic has the advantage of low power consumption, but its operation is relatively slow. For example, the maximum toggle frequency of a conventional 0.18 μm CMOS inverter is only about 3.5 GHz. The power consumption of CMOS logic is the product of the operation frequency and the energy consumed by charging and discharging per switching event. The power consumption of MCML, on the other hand, is set by the drain current of the current-source transistor MNb and is therefore nearly independent of the operation frequency. Because CMOS logic uses power only when charging and discharging, its power consumption is generally smaller than that of MCML. In the gigahertz frequency range, however, the power consumption of CMOS logic becomes larger than that of MCML, as shown in Figure 3-2 [15]. This means that MCML is more suitable for low-power operation in the gigahertz frequency range.

Figure 3-1 Inverter circuits of the (a) CMOS logic and (b) MCML

Figure 3-2 Power consumption of the MCML and CMOS logic
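The crossover between the two logic styles can be illustrated with first-order power models. The sketch assumes the textbook CMOS dynamic-power expression C·VDD²·f with one full charge and discharge per cycle, and a constant MCML power of VDD·Iss. The function names and the capacitance, supply, and current values are hypothetical and are not taken from Figure 3-2.

```python
def cmos_dynamic_power(c_load, v_dd, freq):
    """First-order CMOS switching power: energy per cycle (C*VDD^2) times frequency."""
    return c_load * v_dd ** 2 * freq


def mcml_static_power(v_dd, i_ss):
    """MCML power set by the tail current, independent of frequency."""
    return v_dd * i_ss


def power_crossover_frequency(c_load, v_dd, i_ss):
    """Frequency above which the CMOS model consumes more than the MCML model."""
    return i_ss / (c_load * v_dd)


# Hypothetical values for illustration only.
C_LOAD, V_DD, I_SS = 50e-15, 1.8, 200e-6
f_x = power_crossover_frequency(C_LOAD, V_DD, I_SS)
print("crossover frequency (GHz):", f_x / 1e9)
```

Setting the two expressions equal gives f = Iss / (C·VDD), so a heavier load or a smaller tail current moves the crossover to a lower frequency.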

An MCML latch consists of an input tracking stage, MN1 and MN2, utilized to

MN4, being employed to store the data. Figure 3-3 shows a CMOS CML latch circuit.

Figure 3-3 Circuit schematic of a CMOS CML latch

The track and latch modes are determined by the clock signal applied to a second differential pair, MN5 and MN6. When the signal CK is “high”, the tail current Iss flows entirely to the tracking stage, MN1 and MN2, thereby allowing Vout to track Vin. In the latch mode, the signal CK goes low and the tracking stage is disabled, whereas the latch pair is enabled, storing the logic state at the output.

To achieve the best performance in an MCML latch, complete current switching must take place, so that the current produced by the tail current source flows through the ON branch only. The single-ended output voltage swing of the latch is then R_D⋅I_SS, where R_D is the equivalent resistance of MP1 or MP2 operating in the linear region. The value of the load resistance depends on both the tail current and the required voltage swing.
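The relation between swing, tail current, and load resistance can be checked with a few lines. The sketch assumes complete current switching, a first-order RC settling at the output node, and a fixed load capacitance; in practice the capacitance changes with transistor size. The function names and values are hypothetical.

```python
def load_resistance(v_swing, i_ss):
    """Load resistance needed for a single-ended swing of R_D * I_SS."""
    return v_swing / i_ss


def output_time_constant(r_d, c_load):
    """First-order output time constant R_D * C_L (assumed settling model)."""
    return r_d * c_load


# Hypothetical values: 0.4 V swing, 50 fF load, tail current of 1 mA then 2 mA.
V_SWING, C_LOAD = 0.4, 50e-15
for i_ss in (1e-3, 2e-3):
    r_d = load_resistance(V_SWING, i_ss)
    tau = output_time_constant(r_d, C_LOAD)
    print("Iss (mA):", i_ss * 1e3, "R_D (ohm):", r_d, "tau (ps):", tau * 1e12)
```

Doubling the tail current at a fixed swing halves the load resistance and, in this model, the time constant, at the price of doubling the static power.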

Reducing the load resistance RD while increasing the tail current Iss is one way to shorten the transition time without changing the output voltage swing, but increasing the tail current also increases the power consumption. The sizes of the paired transistors (MN1-MN2 and MN3-MN4) increase with the tail current and also increase as the single-ended voltage swing is reduced. The sizes of the other transistors (MN5, MN6, and MNb) depend on the tail current only. For high-speed switching between MN1-MN2 and MN5-MN6, the transistors should be made as large as possible. On the other hand, increasing the transistor size increases the parasitic capacitance, which slows the operation of the circuit. We must get the best balance between the transistor size
