Technique of CDR - Background Study - Incremental Digitized Frequency Compensation for Spread-Spectrum Clock and Data Recovery Circuits

Chapter 2 Background Study

2.1 Technique of CDR

The basic method of CDR is shown in Figure 2.1. Noisy, asynchronous data is received from the channel, so a CDR is needed to recover the clock and resample the data.

The main function of a CDR is to synchronize and reconstruct the data and to reduce the accumulated jitter.

Figure 2.1: Basic method of CDR (blocks: serial data input, CDR circuit, recovered clock, decision circuit (D flip-flop), recovered data)

Generally speaking, the CDR has two basic architectures: the PLL based CDR and the oversampling based CDR, which build a CDR on different concepts. We discuss these two types in the following paragraphs.

PLL based CDR

Figure 2.2 shows the basic architecture of the PLL based CDR[2]. It differs from a traditional PLL in two ways: a retiming circuit implemented by a D flip-flop (DFF) is added, and random data instead of a reference clock is used as the input.

The PLL based CDR comprises a phase frequency detector (PFD), a charge pump (CP), a low-pass filter (LPF), a voltage-controlled oscillator (VCO), and a retiming circuit. The PFD detects the timing difference between the input data and the sampling clock. The CP and LPF convert this difference into the VCO control voltage and filter out high-frequency noise. Finally, according to the control voltage, the VCO adjusts the sampling clock until the sampling clock and the input data have no phase difference.

Figure 2.2: PLL based CDR

There is another, similar type of CDR architecture, called the DLL based CDR[2]. It replaces the VCO with a voltage-controlled delay line (VCDL). Unlike the VCO, the VCDL adjusts the phase rather than the frequency.


Oversampling Based CDR

Figure 2.3 shows the block diagram of the oversampling CDR[3]. A multi-phase clock generator produces a multi-phase clock, and the input data is sampled in parallel by a set of samplers, one per clock phase. The outputs of the parallel samplers are stored. The bit boundary detection block then locates the data boundary with a majority voter. Finally, according to the detected bit boundary, the data selector, implemented by a multiplexer, decides which sampled result is the recovered data, which is equivalent to choosing the optimal clock phase to sample the data.

Figure 2.3: Oversampling based CDR

Figure 2.4 is an example of the oversampling technique. In this example, the data is sampled at three phases in every bit time. Every pair of neighboring sampled results is exclusive-ORed to detect a data transition. The transitions are accumulated per phase position, and the position with the maximum count is taken as the bit boundary.

In this example, the maximum accumulated transition count is six, and it occurs between phase 1 and phase 2, so the transition edge lies between phase 1 and phase 2. Consequently, the best phase to sample the data is phase 3[4].
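The boundary search described above can be sketched in a few lines of Python. This is a minimal illustration, not the circuit of [3] or [4]: the function name `pick_sampling_phase`, the input stream, and the rule of taking the middle sample of a bit are assumptions made for the example.

```python
def pick_sampling_phase(samples, phases=3):
    """Locate the bit boundary in an oversampled stream and pick a sampling phase.

    samples: 0/1 values in time order, `phases` samples per bit time,
             starting with phase 1.
    Returns (counts, picked_phase); counts[i] is the number of transitions
    seen between phase i+1 and the next sample, and picked_phase is 1-based.
    """
    counts = [0] * phases
    for i in range(len(samples) - 1):
        # XOR of neighboring samples is 1 exactly where the data toggles.
        if samples[i] ^ samples[i + 1]:
            counts[i % phases] += 1
    boundary = counts.index(max(counts))  # majority vote, first maximum wins
    # Samples boundary+1 .. boundary+phases belong to one bit; take the middle one.
    picked = (boundary + 1 + (phases - 1) // 2) % phases
    return counts, picked + 1


# Hypothetical stream: bits 1,0,1,0,1,0 where phase 1 still sees the previous bit.
stream = [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
counts, phase = pick_sampling_phase(stream)
print(counts, phase)
```

For this stream all six transitions fall between phase 1 and phase 2, so phase 3 is selected, which matches the situation described for Figure 2.4.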


Figure 2.4: Timing diagram of the oversampling

Comparison

Two different types of CDR architecture were presented in the previous paragraphs. Table 2.1 compares some features of the PLL based CDRs and the oversampling based CDRs to show the advantages and drawbacks of each type. Generally speaking, PLL based CDRs take an analog approach, whereas oversampling based CDRs take a digital approach. Therefore, oversampling based CDRs are easy to redesign when the process technology changes, which is one of their important advantages.

[Figure 2.4 rows: input data; sampling phases P1, P2, P3 in every bit time; sampled value; indicated transition; accumulated transitions (6, 0, 0); transition edge judgment (P1-P2); phase picked (P3)]

                     PLL based CDR    Oversampling based CDR
Resolution           High             Low
Locking Time         Long             Short
Noise Immunity       Bad              Good
Hardware Overhead    Small            Large

Table 2.1: Comparison of PLL based CDR and oversampling based CDR

2.2 Basic of Spread Spectrum

Many techniques have been proposed to reduce EMI, and spread spectrum is one of them. Spread spectrum uses frequency modulation to distribute the signal power over a band of frequencies. This technique is illustrated in Figure 2.5[1].

Originally, the total power is concentrated at a certain frequency, which induces large EMI.

Spread spectrum reduces the peak energy while keeping the same total amount of energy. Only a small variation in frequency is needed to obtain several decibels of peak energy reduction. In short, spread spectrum is a popular, low-cost, and efficient technique to reduce EMI.

Figure 2.5: Comparison of non-Spread spectrum and spread spectrum

Figure 2.6 shows the Serial ATA II spread spectrum requirement for 3 Gbps transceiver systems[5]. The spread spectrum uses 5000 ppm down spreading with a 30~33 kHz triangular profile. According to this requirement, the lowest data rate is 2.985 Gbps.


Figure 2.6: Spread spectrum requirement for Serial-ATA II

Down-spreading frequency modulation ensures that the highest frequency never exceeds the original frequency, 3 Gbps. The Serial ATA specification defines a 30~33 kHz triangular modulation rate, so the frequency varies periodically with time.
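The numbers above can be checked with a short Python sketch. It only restates the arithmetic of the requirement; the function names are hypothetical, and starting the triangle at 0 ppm for t = 0 is an assumption made for the example.

```python
def ssc_rate_range(nominal_bps=3.0e9, spread_ppm=5000.0):
    """Return (lowest, highest) data rate for a down-spread profile."""
    lowest = nominal_bps * (1.0 - spread_ppm * 1e-6)
    return lowest, nominal_bps


def triangular_offset_ppm(t, mod_freq_hz=33e3, spread_ppm=5000.0):
    """Frequency offset in ppm at time t for a down-spread triangular profile.

    Assumption: the offset is 0 ppm at t = 0 and reaches -spread_ppm
    half a modulation period later.
    """
    phase = (t * mod_freq_hz) % 1.0          # position inside one modulation period
    tri = 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)
    return -spread_ppm * tri


lowest, highest = ssc_rate_range()
print(lowest / 1e9, highest / 1e9)           # data rates in Gbps
print(triangular_offset_ppm(0.25 / 33e3))    # a quarter period after t = 0
```

The lowest rate evaluates to 2.985 Gbps, and the offset a quarter period after the start is about -2500 ppm, halfway down the triangle.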


Chapter 3 Frequency Compensation Technique

Conventional CDRs, whether PLL based or oversampling based, have less than 1000 ppm of frequency tolerance. However, Serial ATA requires a 5000 ppm spread ratio, so a conventional CDR exhibits very large jitter when the input data has a high frequency offset.

Therefore, we propose a frequency compensation technique to enhance the tracking ability. The main contribution of this technique is the reduction of jitter at high frequency offset.

3.1 Frequency Compensation Methodology

In a traditional CDR, it is not easy to track a large frequency variation because the bandwidth is limited. In other words, a small bandwidth yields small jitter but weak tracking ability. Conversely, a CDR bandwidth that is too large gives good frequency tolerance but large jitter.

In order to resolve this trade-off between bandwidth and jitter, we design a frequency compensation loop. The major purpose of the frequency compensation loop is to increase the CDR bandwidth, because the frequency tolerance increases only when the bandwidth is extended.

The methodology of frequency compensation is shown in Figure 3.1. In Serial ATA, the spreading is a triangular waveform with a 33 kHz modulation frequency. To follow the frequency changes, we detect the amount of frequency variation within one frequency compensation period, denoted T_s in this thesis. In Figure 3.1, we detect the frequency increment in section A. The frequency compensation loop then provides the same amount of frequency to compensate this increment in section B, and so on. According to this methodology, the frequency is compensated once every T_s.

Figure 3.1: The methodology of frequency compensation

3.2 The Proposed CDR Architecture

Figure 3.2 shows the proposed CDR architecture with frequency compensation.

This architecture can be separated into two parts. The first part is a phase locked loop. It comprises a phase detector, a confidence counter, a phase control, and a phase selector. The second part is the frequency compensation loop. It comprises a pulse counter, a pulse accumulator, and a frequency error compensator (FEC).

Besides, we design a lock detector to control the confidence counter size.

Figure 3.2: Proposed CDR architecture

Phase locked loop

The phase locked loop can be simplified as shown in Figure 3.3. The phase detector (PD) is a bang-bang phase detector, whose transfer curve is shown in Figure 3.4[7]. The PD detects the phase relationship between the input data and the recovered clock. The PD output "LEAD" means that the input data phase appears earlier than the recovered clock, and "LAG" means the opposite.

Figure 3.3: Simplified phase locked loop architecture

Figure 3.4: Transfer curve of bang-bang phase detector

The confidence counter (CC) is functionally similar to a loop filter. The confidence counter size, denoted N in this thesis, decides the equivalent bandwidth. If the accumulated number of input signals (Lead/Lag/Hold) exceeds N, the confidence counter produces an output, Lead_ov or Lag_ov. Lead_ov and Lag_ov are the inputs of the phase control, which is designed with a coarse tune and a fine tune. The coarse tune selects two neighboring phases of the external clock, and the fine tune interpolates between them for a more precise phase. Finally, the phase selector adjusts the phase to track the input data phase.
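The filtering role of the confidence counter can be illustrated with a small behavioral model in Python. This is a sketch under stated assumptions, not the circuit of this thesis: the class name, the symmetric thresholds at +N and -N, and the reset to zero after each output are all assumptions made for the example.

```python
class ConfidenceCounter:
    """Behavioral up/down counter that filters phase detector decisions."""

    def __init__(self, size):
        self.size = size   # N, the confidence counter size
        self.acc = 0       # running sum of lead (+1) and lag (-1) inputs

    def step(self, pd):
        """Apply one PD decision: +1 for lead, -1 for lag, 0 for hold.

        Returns "Lead_ov", "Lag_ov", or None when no output is produced.
        """
        self.acc += pd
        if self.acc >= self.size:
            self.acc = 0   # assumed: restart counting after an output
            return "Lead_ov"
        if self.acc <= -self.size:
            self.acc = 0
            return "Lag_ov"
        return None


cc = ConfidenceCounter(size=3)
outputs = [cc.step(pd) for pd in [1, 1, -1, 1, 1, 0, -1, -1, -1]]
print(outputs)
```

Isolated wrong decisions (the single lag among the leads) are absorbed, and an output appears only after a net run of N decisions in one direction. A larger N therefore means fewer phase updates per unit time, which is the sense in which N sets the equivalent bandwidth.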

In order to obtain a more precise phase resolution, we use the interpolation technique in the phase selector. The advantage of high resolution is jitter suppression: when the system is locked and stable, the recovered clock stays locked between two adjacent phases, so the higher the phase resolution, the smaller the induced jitter.

Another advantage of this architecture is that it is implemented digitally.

Frequency Compensation loop

The frequency compensation loop comprises three components. Figure 3.5 shows the block diagram of the frequency compensation loop.


Figure 3.5: Frequency compensation loop

The first component is the pulse counter. It counts the difference between the number of lead pulses and the number of lag pulses. This value represents the frequency offset within a specific time, which is also the rate at which the number of compensated pulses is updated. We denote this frequency compensation period as Ts in this thesis. The second component, the pulse accumulator, accumulates the pulses in every Ts. The final component of the frequency compensation loop is the FEC. The FEC generates the same number of pulses as the value held in the pulse accumulator, and it spaces these pulses as uniformly as possible. Finally, these pulses are used as inputs of the phase control to adjust the phase in the phase selector. Every pulse adjusts the recovered clock by one resolution step.
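The three components can be sketched behaviorally in Python. This is only an illustration under assumptions: the function names are hypothetical, pulses are encoded as +1 (lead), -1 (lag), and 0 (none), and the uniform spacing uses a simple overflow accumulator, which is one common way to spread pulses evenly and not necessarily the method of the FEC circuit.

```python
def count_pulses(pulses):
    """Pulse counter: lead pulses minus lag pulses within one Ts."""
    return sum(pulses)


def spread_pulses(num_pulses, clocks_per_ts):
    """FEC sketch: emit |num_pulses| pulses spaced evenly over one Ts.

    Returns one entry per clock: +1 or -1 for a compensation pulse, else 0.
    """
    sign = 1 if num_pulses >= 0 else -1
    n = min(abs(num_pulses), clocks_per_ts)
    schedule, acc = [], 0
    for _ in range(clocks_per_ts):
        acc += n
        if acc >= clocks_per_ts:      # overflow marks the next evenly spaced slot
            acc -= clocks_per_ts
            schedule.append(sign)
        else:
            schedule.append(0)
    return schedule


# Two hypothetical Ts periods of 8 clocks each.
periods = [[1, 0, 0, 1, 0, 0, 0, 0], [0, 1, 0, 0, 0, -1, 1, 0]]
accumulated = 0                        # pulse accumulator
for pulses in periods:
    accumulated += count_pulses(pulses)
    print(accumulated, spread_pulses(accumulated, len(pulses)))
```

With 3 accumulated pulses and 8 clocks, the schedule is [0, 0, 1, 0, 0, 1, 0, 1], so the phase steps are distributed over the period instead of being applied in a burst.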

3.3 Confidence Counter Analysis

The first key point in this design is to decide N. Intuitively, a smaller N produces an output more quickly. That is, a smaller N corresponds to a larger equivalent bandwidth.

For the phase locked loop, we derive the closed-loop transfer function by the following steps:

1. Calculate the equivalent bandwidth for N = 6:


Figure 3.6: Modeling the confidence counter as a Markov chain[9]

Using the first passage time, we modify (3-2) to obtain (3-3). We then use (3-3) to calculate the combinations of P_ij^(n) for different n, and thereby establish the matrix (3-4).

Second, we use the expected value of the conditional probability to calculate the time at which the confidence counter output occurs.
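As a rough numerical companion to the first passage analysis, the Python sketch below computes the expected number of inputs before a counter produces an output. It is not the derivation of this thesis: it assumes a simple random walk that starts at 0, moves +1 with the lead probability p and -1 otherwise, ignores hold inputs, and stops at +N or -N. It solves the standard absorbing-chain equations E[i] = 1 + p E[i+1] + (1-p) E[i-1] by Gaussian elimination. The function name is hypothetical.

```python
def expected_inputs_until_output(n, p_lead):
    """Expected number of inputs before the walk first reaches +n or -n."""
    size = 2 * n - 1                      # transient states -(n-1) .. (n-1)
    # Augmented matrix of E[i] - p*E[i+1] - (1-p)*E[i-1] = 1, with E[+-n] = 0.
    a = [[0.0] * (size + 1) for _ in range(size)]
    for row in range(size):
        a[row][row] = 1.0
        if row + 1 < size:
            a[row][row + 1] = -p_lead
        if row - 1 >= 0:
            a[row][row - 1] = -(1.0 - p_lead)
        a[row][size] = 1.0
    # Forward elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    x = [0.0] * size
    for row in range(size - 1, -1, -1):
        rest = sum(a[row][c] * x[c] for c in range(row + 1, size))
        x[row] = (a[row][size] - rest) / a[row][row]
    return x[n - 1]                       # value for the starting state 0


for n in (2, 6, 32):
    print(n, expected_inputs_until_output(n, 0.5), expected_inputs_until_output(n, 0.9))
```

For p = 0.5 the result is N squared inputs, and it shrinks toward N as p approaches 1. Multiplying the expected count by the clock period gives an expected time per output, which illustrates why a larger N corresponds to a smaller equivalent bandwidth.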

In (3-5), p means the probability of lead. We assume that the input jitter follows the Gaussian distribution shown in Figure 3.7. The probability of lead can then be calculated as

$$p = \int f(x)\,dx$$

where f(x) is the Gaussian probability density of the input jitter.

Figure 3.7: Gaussian distribution profile

In (3-5), the parameter ζ represents the number enclosed by the dotted square in (3-4).

We can extend the matrix to a general form by the following derivation.

0= 1

6 i+1

(3-8) gives the expected value for the confidence counter to obtain an output. We can derive the equivalent bandwidth from (3-8):

$$\omega_{C\_C} = \frac{1}{E[\,\cdot\,]}\,\Delta\varphi \qquad (3\text{-}9)$$

where E[·] is the expected value in (3-8).

2. Rewrite the equivalent bandwidth in a general form:

Replacing the confidence counter size 6 in (3-8) by N, we obtain the equivalent bandwidth of the confidence counter as

N k k -1

According to (3-10), the relationship between confidence counter size N and equivalent bandwidth is plotted in Figure 3.8.

Confidence counter size N:  1       2       3       4      5      6      7      8      9      10
Bandwidth (MHz):            900.00  281.25  136.36  80.36  53.05  38.09  29.47  24.26  20.97  18.73

Figure 3.8: The relationship between N and equivalent bandwidth

From Figure 3.8, we conclude that the larger the confidence counter size, the smaller the equivalent bandwidth.

3. Phase selector transfer function:

Besides the confidence counter, there is another component in the phase locked loop: the phase selector, which adjusts the phase when the digital control code changes. We model the relationship between the digital code and the output phase in Figure 3.9 and linearize the relationship curve. The resulting transfer function of the phase selector, K_PI, is given in (3-11):

[Figure 3.9: Relationship between digital code and output phase Δφ (axis marks: code 1 at (2π/16)*1, code 2 at (2π/16)*2)]

$$K_{PI} = \frac{2\pi}{16\,L} \qquad (3\text{-}11)$$

where L is the number of interpolation steps.

4. Phase locked loop transfer function:

From (3-10) and (3-11), the phase locked loop transfer function is derived under these assumptions:

Assumption 1: The input jitter is a Gaussian distribution.

Assumption 2: The phase selector transfer curve is linearized.

Assumption 3: Only the input jitter within ±3σ is considered.

Assumption 4: Δφ is half the resolution.

Under these assumptions, the closed-loop transfer function of the phase locked loop can be represented as (3-12).

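$$H_{closed}(s) = \frac{K_{PI}}{(K_{PI} + 1) + \dfrac{s}{\omega_{c\_c}}} \qquad (3\text{-}12)$$

where

$$\omega_{c\_c} = 2\pi \times \left\{ \sum_{k=0}^{\infty} (N + 2k) \times \zeta_k^{*} \times \left[ Q\!\left( \frac{R/2}{J_i/6} \right) \right]^{N+k} \times \left[ 1 - Q\!\left( \frac{R/2}{J_i/6} \right) \right]^{k} \times T \right\}^{-1} \qquad (3\text{-}13)$$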

In (3-12) and (3-13), the phase locked loop transfer function depends on four parameters:

N is the confidence counter size.

R is the phase resolution.

J_i is the peak-to-peak input jitter.

T is the clock period.

Besides, P and ζ*_k are defined in (3-5) and (3-7).

Jitter tolerance is an important specification for CDRs. It is defined as the input jitter that a receiver must tolerate without violating the system's BER

3.11[5]. The allowed region lies above the jitter tolerance mask. As a result, we design our desired curve as the dotted line. In Figure 3.10, if we design the phase locked loop bandwidth to be 0.4 MHz, the desired curve satisfies the jitter tolerance requirement.

Figure 3.10: Jitter tolerance of Serial ATA II and desired curve

An approximate condition to avoid increasing the BER is[2]

$$\theta_{in} - \theta_{out} < 0.5\,\mathrm{UI} \;\Rightarrow\; \theta_{in}\,[1 - H(s)] < 0.5\,\mathrm{UI} \qquad (3\text{-}14)$$

We can therefore express the jitter tolerance as

$$G_{JT}(s) = \frac{0.5}{1 - H_{closed}(s)} \qquad (3\text{-}15)$$

The relation between the jitter tolerance and the closed-loop transfer function of the phase locked loop is shown in Figure 3.11. According to the closed-loop transfer function (3-12), we can calculate the equivalent bandwidth for different N. The result is shown in Figure 3.11. When N is 28, the equivalent bandwidth is 0.4 MHz.

Finally, to simplify the circuit design, we choose N = 32 as the confidence counter size.
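As a numerical illustration of (3-15), the Python sketch below evaluates the jitter tolerance over frequency. The closed-loop response used here is an assumption: a first-order low-pass with a 0.4 MHz bandwidth stands in for (3-12), so the numbers only show the shape of the curve, not the result of this design. The function names are hypothetical.

```python
import math


def h_closed(freq_hz, bandwidth_hz):
    """Assumed first-order low-pass closed-loop response H(j*2*pi*f)."""
    return 1.0 / (1.0 + 1j * freq_hz / bandwidth_hz)


def jitter_tolerance_ui(freq_hz, bandwidth_hz=0.4e6):
    """Jitter tolerance magnitude in UI: 0.5 / |1 - H_closed|."""
    return 0.5 / abs(1.0 - h_closed(freq_hz, bandwidth_hz))


for f in (4e3, 40e3, 0.4e6, 4e6, 40e6):
    print(f, round(jitter_tolerance_ui(f), 3))
```

Well below the loop bandwidth the tolerated jitter grows as the frequency falls, because the loop tracks slow jitter. Well above the bandwidth it flattens toward 0.5 UI, which is the general shape a jitter tolerance mask is designed around.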


Figure 3.11: The equivalent frequency response of C.C. by different size

In short, the confidence counter size in our design is 32. As a result, the equivalent bandwidth satisfies the Serial ATA II jitter tolerance requirement.

3.4 Frequency Compensation Period Determination

The second key point in our design is to determine the frequency compensation period. As shown in Figure 3.1, the spread spectrum has a triangular profile, so the frequency changes at a fixed rate. In Figure 3.12, we calculate the slope of the frequency offset.


Figure 3.12: Relationship between Ts and frequency offset

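According to the Serial ATA II specification[5], the modulation frequency is 33 kHz and the spread ratio is 5000 ppm. The frequency therefore moves by 5000 ppm in half a modulation period, which gives a slope of 5000 ppm / (1/2 × 1/33 kHz) = 330 ppm/µs. As a result, we can derive the equation

$$\Delta f = \mathrm{slope} \times T_s = 330 \times T_s \qquad (3\text{-}16)$$

where Δf is the frequency offset in ppm and T_s is in µs.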

The longer T_s is, the larger the frequency offset will be. However, the longer T_s is, the more precisely the equivalent frequency offset can be calculated. There is thus a trade-off between T_s and the frequency offset. In this design, we set the maximum error tolerance to two resolution steps in T_s. We represent this as

$$2 \times \frac{1}{32} = \Delta f \times T_s \times f_{clk} \;\Rightarrow\; 41.68\,\mathrm{ps} = \Delta f \times T_s \qquad (3\text{-}17)$$

In this design, the number of resolution steps is 32, and T_s multiplied by f_clk is the number of clock cycles in T_s. Thus (3-17) states that, at the maximum frequency offset, the maximum compensation error is 2 times the resolution.

From (3-16) and (3-17), we can find the optimal T_s and the equivalent frequency offset. Figure 3.13 shows the results from these two equations.


Figure 3.13: Determination of T_s

The optimal T_s is 0.355 µs. In order to simplify the circuit design, we choose T_s = 0.341 µs, which corresponds to 512 clock cycles. Therefore, in every T_s, the frequency offset is 112.53 ppm.
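The choice of T_s can be reproduced numerically by solving (3-16) and (3-17) together. The Python sketch below uses the values stated in the text (5000 ppm, 33 kHz, a 1.5 GHz clock, 32 resolution steps, an error budget of 2 steps). The function name is hypothetical, and rounding down to a power of two is an assumption about what "simplifying the circuit design" means here.

```python
import math


def optimal_ts(spread_ppm=5000.0, mod_freq_hz=33e3, f_clk_hz=1.5e9,
               steps=32, max_error_steps=2):
    """Solve slope * Ts = df and df * Ts = allowed drift for Ts (in seconds)."""
    # Fractional frequency change per second: 5000 ppm in half a modulation period.
    slope_per_s = spread_ppm * 1e-6 / (0.5 / mod_freq_hz)
    # Allowed timing drift in one Ts: two resolution steps of one clock period.
    allowed_drift_s = max_error_steps / (steps * f_clk_hz)
    return math.sqrt(allowed_drift_s / slope_per_s), slope_per_s


f_clk = 1.5e9
ts_opt, slope = optimal_ts(f_clk_hz=f_clk)
clocks_opt = round(ts_opt * f_clk)
clocks_pow2 = 2 ** int(math.log2(clocks_opt))   # assumed: largest power of two below
ts_chosen = clocks_pow2 / f_clk
offset_ppm = slope * ts_chosen * 1e6

print(ts_opt * 1e6, clocks_opt)        # optimal Ts in microseconds and in clocks
print(ts_chosen * 1e6, clocks_pow2, offset_ppm)
```

The optimum comes out at about 0.355 µs (533 clocks), and 512 clocks gives about 0.3413 µs and 112.6 ppm. The 112.53 ppm quoted in the text corresponds to the rounded value 330 × 0.341.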

T_s                     Δf (ppm)
0.355 µs (533 clocks)   117.15
0.341 µs (512 clocks)   112.53

Chapter 4 Implementation of Clock and Data Recovery

In this chapter, we describe the details of the circuit implementation. The proposed system block diagram is shown in Figure 4.1. We first run a behavioral simulation to verify the design functionally. In addition, a circuit-level simulation predicts the performance. Finally, the layout is shown and the test environment setup is proposed.

Figure 4.1: Proposed system block diagram (blocks: HRPD, variable C.C., pulse counter, pulse accumulator, FEC, phase control, phase selector, lock detector; inputs: 3 Gbps SSC data and a 1.5 GHz 8-phase clock; internal signals: Lead_p/Lag_p and Lead_f/Lag_f; output: recovered clock)

4.1 Building Blocks

There are eight blocks in the proposed CDR, as shown in Figure 4.1. We describe these blocks in turn.

Half Rate Phase Detector

The half rate phase detector (HRPD) detects whether each of two bit boundaries is lead, lag, or hold in every clock cycle. The HRPD needs a 4-phase clock source to sample the input bit stream. The timing considerations are shown in Figure 4.2.

Figure 4.2: HRPD timing diagram

In Figure 4.2, P0 and P2 detect the boundaries of the input data, and P1 and P3 sample the data. The circuit implementation is shown in Figure 4.3[10]. With this operation, there are two lead, lag, or hold decisions in every clock cycle, which is why the architecture is called half rate.

The output of the HRPD has 4 bits. Lead1 and Lag1 represent the relation between the input data and the recovered clock at the first bit boundary, and Lead2 and Lag2 do the same for the second bit boundary. For example, (Lead1,Lag1,Lead2,Lag2)=1000 means that the first boundary leads the clock and the second boundary has no transition.


Figure 4.3: Half Rate Phase Detector

Next, we need an encoder to transform the HRPD output into 2's complement format. Table 4.1 is the truth table of this transformation. If Lead1 and Lead2 are both 1, the output is 010 (+2). Conversely, if Lag1 and Lag2 are both 1, the output is 110 (-2). A positive number means lead and a negative number means lag. The Boolean function of the encoder is (4-1).

The encoder is implemented in static CMOS logic.
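The encoding can be illustrated arithmetically in Python. This sketch is consistent with the two examples given in the text (+2 as 010 and -2 as 110), but the arithmetic definition and the function name are assumptions; the actual truth table and Boolean function may treat other input combinations differently.

```python
def encode_hrpd(lead1, lag1, lead2, lag2):
    """Map the four HRPD output bits to a signed value and a 3-bit 2's complement code.

    Assumed rule: each lead counts +1 and each lag counts -1.
    """
    value = (lead1 + lead2) - (lag1 + lag2)
    code = format(value & 0b111, "03b")   # wrap negative values into 3 bits
    return value, code


print(encode_hrpd(1, 0, 1, 0))   # both boundaries lead
print(encode_hrpd(0, 1, 0, 1))   # both boundaries lag
print(encode_hrpd(1, 0, 0, 0))   # first boundary leads, second has no transition
```

The three calls return (2, '010'), (-2, '110'), and (1, '001'). A lead on one boundary and a lag on the other cancel to 000 under this assumed rule.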

Table 4.1: Encoder truth table

Variable-Sized Confidence Counter

In Chapter 3, we introduced the function of the confidence counter.

According to the analysis in Chapter 3, the confidence counter size decides the equivalent bandwidth. A smaller N is equivalent to a larger bandwidth and better tracking ability. In order to compensate the initial frequency offset, we choose N to be 2 initially. Then, N is increased to 8. Finally, N is fixed at the desired value of 32.

By adjusting the size of the confidence counter, we change the equivalent bandwidth. Therefore, it is called a variable-sized confidence counter. The circuit of the variable-sized confidence counter is shown in Figure 4.4.


Figure 4.4: Variable-sized confidence counter

In Figure 4.4, a 6-bit adder is designed, and SO[5:0] are its output bits. For N of 2, 8, and 32, we detect a change in SO1, SO3, and SO5, respectively. For example, when N is 2, we detect when SO1 changes its value. In Table 4.2, boldface marks the bit whose value changes. At that moment, the variable-sized confidence counter generates an output pulse. A lead input to the variable-sized confidence counter represents a positive value. When the accumulated value is +2, a lead output pulse is generated. Conversely, when the accumulated value is -3, a lag output pulse is generated. The decision is therefore asymmetric.

Table 4.2: 6-bit adder output in variable sized confidence counter (partial)

The adjustment of the confidence counter size is decided by the lock detector.

We will introduce the behavior and circuit of the lock detector later.

With a clock frequency of 1.5 GHz, the confidence counter must complete its function within 666.67 ps. Therefore, the 6-bit adder in Figure 4.4 is designed with as small a delay as possible. The resulting 6-bit adder is shown in Figure 4.5. It is a carry-look-ahead (CLA) structure[11]. In order to reduce the propagation delay, we implement its gates in pseudo-NMOS logic rather than static logic. From the simulation, the critical path of this 6-bit adder is 460 ps.

Figure 4.5: The structure of 6-bit adder


Phase Control

The phase control has two input sources. One is the phase locked loop control, and the other is the frequency compensation loop. Functionally, the phase
