
Chapter 3 Architecture of Data Recovery System

3.2.2 Delay Selecting

Three quarter steps oversampling does reduce the size of the PLL, but it introduces another problem. Three times oversampling uses only 21 phases to sample the data, whereas three quarter steps oversampling uses 28. Using more sampling phases makes the layout of the CDR more complex. Moreover, because three quarter steps oversampling uses more sampling clock phases than three times oversampling, the three quarter steps oversampling CDR needs more MUXs to select the sampling clock phases.

To overcome this problem, a new CDR architecture is presented in the following.

Besides oversampling the data stream with three different sampling clock phases, sampling three differently delayed copies of the data with the same sampling clock phase can also detect the occurrence of skew [9]. By using VCO cells as delay cells, the CDR can delay the input data stream by one phase, two phases, and three phases respectively. Using these three copies of the data stream, delayed by one, two, and three phases, as a detection window, the CDR can use one sampling clock to detect whether an edge occurs in this window, as shown in Fig. 3.18.

If an edge occurs in the detection window, the CDR changes the delay time up or down by one phase in the next clock period.

Fig. 3.19 shows the operation of delay selecting. If the edge occurs between the data delayed by one phase and the data delayed by two phases, the delay time of the input data stream is decreased by one phase in the next clock period. If the edge occurs between the data delayed by two phases and the data delayed by three phases, the delay time of the input data stream is increased by one phase in the next clock period. The delay time keeps changing until no edge occurs in the detection window.

Fig. 3.20 shows the architecture of the delay selecting CDR. To implement delay selecting, the CDR uses VCO cells as delay cells to create three data streams delayed by different phases. One of these three delayed data streams is selected, and the selected data stream is sent into the detection window. In the detection window the selected data stream is again delayed into three differently delayed data streams. The CDR then needs only seven sampling clock phases to sample these three data streams and detect whether any edge occurs. The detection window sends out “up” and “down” signals in every clock period. As in the phase selecting CDR, these “up” and “down” signals must pass a voter and a DLPF. After the voter and the DLPF, if a real “up” or “down” signal is sent out, the delay selector moves the selected data stream up or down, as shown in Fig. 3.21. If the skew exceeds one data step time, the information in the input signals is not enough to determine whether the skew is a lead skew or a lag skew. To avoid this undetectable skew, the LVDS specification limits the skew range between any two channels. Because the skew between any two channels is limited to one data step time, the delay selector needs only three selectable delays (one phase, two phases, and three phases) to cancel the skew.

By using delay selecting to implement three quarter steps oversampling, the CDR can reduce the number of sampling clock phases from 28 to 7 and also reduce the number of MUXs. As a result, the layout is simplified and its area is reduced.
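The phase counts quoted in this section can be checked with a short calculation. The sketch below assumes seven serialized data bits per clock period and reads the three schemes as needing 3, 4 (a quarter-step grid), and 1 sampling phases per bit respectively; this reading and the function name are illustrative assumptions, not taken from the original design files.

```python
# Seven serialized data bits per clock period (assumed from the text).
BITS_PER_CLOCK = 7

def sampling_phases(phases_per_bit: int) -> int:
    """Number of distinct sampling clock phases needed per clock period."""
    return BITS_PER_CLOCK * phases_per_bit

print(sampling_phases(3))  # three times oversampling
print(sampling_phases(4))  # three quarter steps oversampling (quarter-step grid, assumed)
print(sampling_phases(1))  # delay selecting: one clock phase per data bit
```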

Fig. 3.1 Typical FPD Link

Fig. 3.2 Timing relation between clock and serial data streams

Fig. 3.3 Operation timing in ideal case

Fig. 3.4 Operation timing in real cases

Fig. 3.5 Operation timing when jitters happen, d0 is double sampled and d1 is missed

Fig. 3.6 Operation timing of three times oversampling

Fig. 3.7 Operation timing of three times oversampling when skews happen

Fig. 3.8 (a) Lag, (b) lead and (c) lock states of three times oversampling

Fig. 3.9 Architecture of a traditional CDR

Fig. 3.10 Timing relation between sampled data

Fig. 3.11 Operation of phase detector when jitters happen

Fig. 3.12 Operation of phase selector

Fig. 3.13 Data sampling timing when jitters happen

Fig. 3.14 Typical architecture of a PLL

Fig. 3.15 Timing relation between VCO cells outputs

Fig. 3.16 Comparison of VCO cells number: (a) three times oversampling, (b) three quarter steps oversampling

Fig. 3.17 Operation of “three quarter steps oversampling”

Fig. 3.18 Comparison of eye diagram tolerance in lock state (panels: Three Times Oversampling, Three Quarter Steps Oversampling; figure labels: data 50 %, data 66.7 %)

Fig. 3.19 Operation of the detection window

Fig. 3.20 Architecture of a “delay selecting” CDR

Fig. 3.21 Operation of the “delay selecting” CDR

Chapter 4 Building Blocks of Delay Selecting CDR

Fig. 3.20 shows the architecture of the delay selecting CDR. The building blocks of the delay selecting CDR are presented in this chapter.

4.1 LVDS Input Buffer

Fig. 4.1 shows a traditional design of an LVDS receiver buffer. The differential input signal is detected by the Schmitt trigger (M1 ~ M6 and M7 ~ M10), which translates the detected signal into the full swing output Vout. However, in this receiver buffer the lower bound of the input common mode range is limited by M3 and M4. In the IEEE LVDS standard, the LVDS levels at the receiver input are specified by (4-1) and (4-2).

|Vidth| ≤ 100 mV (4-1)

0 mV ≤ Vi ≤ 2400 mV (4-2)

Thus, when the common mode voltage of the input signal is close to 0 mV, M1 and M2 in Fig. 4.1 enter the triode region in order to keep the Vgs of M4 and M5 above Vth. Because M1 and M2 operate in the triode region, the voltage gain of the design in Fig. 4.1 is significantly reduced.

To overcome the input common mode range problem of Fig. 4.1, a new LVDS receiver buffer has been reported [10]. Fig. 4.2 shows the new design. To solve the problem of the traditional design, the new design adds another buffer, the first stage buffer, in front of the Schmitt trigger. The voltage gain of the first stage buffer is almost insensitive to the input signal common mode voltage. Because the differential signal after the first stage is almost independent of the input common mode voltage, the Schmitt trigger can be implemented in NMOS type, which has a better frequency response than the PMOS type. However, in reference [10] the new receiver input buffer is implemented with 3.3 V devices. In this thesis the LVDS CDR is implemented in a 0.13 μm 1.2 V / 3.3 V CMOS process. In order to translate the input signal from a 3.3 V LVDS signal into a 1.2 V full swing signal, the receiver input buffer in this thesis is powered from the 1.2 V supply, Vddl, except that the current source supplying M1 and M2 is connected to the 3.3 V supply, Vddh, as shown in Fig. 4.3. Because of gate oxide reliability, the first stage buffer M1 ~ M6 must be designed with 3.3 V devices.

To receive a high speed input signal, the frequency response of the receiver input buffer is important. If the receiver must receive a 1.25 Gb/s signal, the bandwidth of the input buffer must be higher than 625 MHz. To make sure the bandwidth is high enough, the bias voltages Vb1 and Vb2 and the resistance values of R1 and R2 are important. Fig. 4.4 shows the simulated frequency response of the LVDS receiver input buffer designed in this thesis.
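The 625 MHz figure is half the bit rate: an alternating 1010... NRZ pattern at 1.25 Gb/s has a fundamental frequency of 625 MHz. A minimal sketch of this arithmetic, with a hypothetical function name:

```python
def min_bandwidth_hz(bit_rate_bps: float) -> float:
    """Fundamental frequency of an alternating 1010... NRZ pattern.

    One full period of the pattern spans two bits, so the frequency is
    half the bit rate. The buffer bandwidth should exceed this value.
    """
    return bit_rate_bps / 2.0

print(min_bandwidth_hz(1.25e9) / 1e6, "MHz")
```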

4.2 Delay Selector

After the input buffer, the input data signal has been translated from a 3.3 V LVDS signal into a full swing 1.2 V digital serial data stream. This serial data stream is sent into the delay selector.

Fig. 4.5 shows the architecture of the delay selector. The delay selector is composed of three delay cells and one three-to-one MUX. With these three delay cells, the input serial data stream is delayed into three data streams with different delays. One of these three delayed data streams is selected according to the operational code sent from the shift selector. Fig. 4.6 shows the circuit of the delay cell. The bias voltages Vbp and Vbn control the pull-up and pull-down currents of the delay cell, and the delay of the cell changes with these currents. To make sure each delay is equal to a quarter data step, the delay cell is designed with the same sizing as the VCO cell in the PLL, and the bias voltages Vbp and Vbn are also provided by the PLL.

The signals after the delay cells are differential and not full swing. To convert them into full swing single-ended signals, a differential-to-single-ended converter is needed.

Fig. 4.7 shows the differential-to-single-ended converter. Fig. 4.8 shows the simulation result of the delay cells and differential-to-single-ended converters. After the delay cells and the converters, the output signals, D0 ~ D2, are full swing data streams delayed by different phases. The delay selector selects one of these three data streams to cancel the skew between the input data stream and the clock. A three-to-one MUX is used to select the correct data stream. Fig. 4.9 shows the schematic diagram of the three-to-one MUX.

4.3 Detection Window and Data Sampler

After the delay selector, the selected data stream is sent into the detection window. The detection window is composed of three delay cells, which are the same as those used in the delay selector. In the detection window the selected data stream is delayed by different phases. After the detection window, these delayed data streams are sampled by different sampling clock phases in the data sampler, as shown in Fig. 4.10. By sampling these delayed data streams, the CDR obtains the logic state of the input data stream at different moments. Fig. 4.11 shows the timing relation between the sampled data streams, the sampling clock phases, and the sampling results.

4.4 Synchronizer and Control Logic

As shown in Fig. 4.11, the sampling results, D0 ~ D20, are asynchronous. To simplify the circuits after the sampler, a synchronizer is used in this CDR to synchronize these sampling results. The synchronized data streams are sent into the control logic. In the control logic a phase detector detects whether a skew occurs in the detection window. Fig. 4.12 shows the schematic diagram and the truth table of the phase detector. If the input of the phase detector is “001” or “110”, the phase detector assumes that a transition edge occurs in the data stream behind the center sampling clock phase in this detection window, and sends out a “down” signal to request that the delay time of the selected data stream lag by a quarter step time in the next clock period. Conversely, if the input of the phase detector is “100” or “011”, the phase detector assumes that a transition edge occurs in the data stream before the center sampling clock phase, and sends out an “up” signal to request that the delay time of the selected data stream lead by a quarter step time in the next clock period.
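The decision rule of the phase detector can be written as a small lookup. The sketch below is a behavioral model only: it encodes the four patterns quoted in the text and assumes that every other pattern produces no request, since the full truth table of Fig. 4.12 is not reproduced here. The names are hypothetical.

```python
# Patterns quoted in the text; the three characters are the three samples
# taken inside one detection window.
DOWN_PATTERNS = {"001", "110"}
UP_PATTERNS = {"100", "011"}

def phase_detect(samples: str) -> str:
    """Return 'up', 'down', or 'none' for one 3-sample detection window."""
    if len(samples) != 3 or set(samples) - {"0", "1"}:
        raise ValueError("expected a string of three bits, e.g. '011'")
    if samples in UP_PATTERNS:
        return "up"
    if samples in DOWN_PATTERNS:
        return "down"
    return "none"  # assumed: no edge detected, so no request is issued

for pattern in ("000", "001", "011", "100", "110", "111"):
    print(pattern, phase_detect(pattern))
```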

Besides skew between the data stream and the clock signal, jitter in the data stream also makes the phase detector send out “up” or “down” signals. As a result, a request sent out by the phase detector is not accepted immediately; a voter and a DLPF (digital low pass filter) are used to analyze these “up” and “down” signals and verify that a skew between the data stream and the clock signal really exists.

Because seven data bits are serialized into the input data stream during each clock period, the CDR needs seven phase detectors to detect whether a transition edge occurs in each data step. In other words, seven detection results are sent out in each clock period. If a skew really exists between the data stream and the clock, these seven detection results should either be the same or be empty (no edge occurs in that data step). However, jitter in the data stream produces some wrong detection results, as shown in Fig. 4.13. To avoid wrongly shifting the delay time, the voter is designed to make sure that the requests in one direction, “up” or “down”, outnumber the opposite requests, “down” or “up”, by at least two. Fig. 4.14 shows the schematic diagram of the voter for “up” signals. When up requests outnumber down requests, more NMOSs than PMOSs are turned on, the voltage at node N1 falls below the threshold voltage of the inverter, and the output “up” signal is pulled high. On the other hand, when up requests are fewer than down requests, fewer NMOSs than PMOSs are turned on, the voltage at node N1 rises above the threshold voltage of the inverter, and the output “up” signal is pulled low. An extra PMOS, Mpex, is added before the inverter. Mpex is always turned on to make sure the output “up” signal is not pulled high when the up requests are just one more than the down requests.
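A behavioral sketch of the voting rule may help. It counts the seven per-bit requests and asserts a direction only when that direction leads the opposite one by at least two, which is how the role of Mpex is read here. The analog circuit of Fig. 4.14 is not modeled, and the function name is hypothetical.

```python
def vote(requests: list[str]) -> str:
    """Combine per-bit phase detector results ('up', 'down', 'none').

    A direction wins only if it leads the opposite direction by at least
    two requests; a margin of one is treated as jitter and ignored.
    """
    ups = requests.count("up")
    downs = requests.count("down")
    if ups - downs >= 2:
        return "up"
    if downs - ups >= 2:
        return "down"
    return "none"

print(vote(["up", "up", "up", "none", "none", "none", "none"]))
print(vote(["up", "up", "down", "none", "none", "none", "none"]))
```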

After an “up” or “down” request passes the voter, the shift request is still not accepted immediately. To reject wrong requests induced by jitter more completely, the “up” or “down” request must pass another checking circuit, a DLPF (digital low pass filter). The DLPF is a finite state machine. Fig. 4.15 shows the state diagram of the DLPF. The output “UP” is pulled high only when the up request stays high for over three clock periods.
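One way to picture the DLPF is as a run-length counter. The sketch below is an assumption-laden model, not the state diagram of Fig. 4.15: it issues “UP” or “DOWN” after three consecutive identical requests, and it restarts counting after any interruption or after issuing an output. The class name, the threshold parameter, and the restart behavior are all hypothetical.

```python
class DLPF:
    """Digital low pass filter modeled as a consecutive-request counter."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold  # consecutive requests needed (assumed)
        self.last = "none"          # direction of the current run
        self.run = 0                # length of the current run

    def step(self, request: str) -> str:
        """Advance one clock period; request is 'up', 'down', or 'none'."""
        if request == "none":
            self.run, self.last = 0, "none"
            return "NONE"
        # Extend the run if the direction repeats, otherwise start a new run.
        self.run = self.run + 1 if request == self.last else 1
        self.last = request
        if self.run >= self.threshold:
            self.run, self.last = 0, "none"  # assumed: restart after output
            return request.upper()
        return "NONE"

dlpf = DLPF()
print([dlpf.step(r) for r in ["up", "up", "down", "up", "up", "up"]])
```

In this model a single opposite request, such as one caused by jitter, restarts the count, so only a persistent request reaches the shift selector.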

The shift request “UP” or “DOWN” from the DLPF is sent into the shift selector. Fig. 4.16 shows the schematic diagram of the shift selector. When the “UP” signal is high, the value of Sn in the next clock period becomes the value of Sn–1 in the current clock period. When the “DOWN” signal is high, the value of Sn in the next clock period becomes the value of Sn+1 in the current clock period. Fig. 4.17 shows the shift selector and the delay selector. S0, S1, and S2 are the output selecting signals of the shift selector, and they are always reset to 1, 0, and 0 at the beginning. Thus, exactly one of the selecting signals S0, S1, and S2 is “1” at any time. According to the shift request “UP” or “DOWN”, the “1” shifts up or down among S0, S1, and S2. The location of the “1” decides which delayed data stream in the delay selector is selected to be sampled. If an “UP” request is accepted, the delay time of the selected data stream is increased by one phase time in the next clock period. If a “DOWN” request is accepted, the delay time of the selected data stream is decreased by one phase time in the next clock period. The delay time keeps shifting until the sampling edge is located away from the data transition edge and no more “UP” or “DOWN” shift requests occur.
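The shift selector behaves like a one-hot shift register. The following sketch models the update rule described above (S[n] takes S[n-1] on “UP” and S[n+1] on “DOWN”). It assumes that the “1” holds its position at either end of the register rather than wrapping or disappearing; the text does not state the boundary behavior, so this is an assumption, and the function name is hypothetical.

```python
RESET_STATE = (1, 0, 0)  # S0, S1, S2 after reset, as stated in the text

def shift_select(state: tuple[int, ...], cmd: str) -> tuple[int, ...]:
    """Return the next one-hot selector state for cmd 'UP', 'DOWN', or 'NONE'."""
    s = list(state)
    if cmd == "UP" and s[-1] != 1:
        s = [0] + s[:-1]   # S[n] <- S[n-1]: the "1" moves toward S2
    elif cmd == "DOWN" and s[0] != 1:
        s = s[1:] + [0]    # S[n] <- S[n+1]: the "1" moves toward S0
    return tuple(s)        # at either end the state is held (assumed)

state = RESET_STATE
for cmd in ["UP", "UP", "UP", "DOWN"]:
    state = shift_select(state, cmd)
    print(cmd, state)
```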

4.5 Phase Lock Loop (PLL)

Fig. 4.18 shows the architecture of the designed PLL [12]. A PLL is basically an oscillator that can provide clocks with different phases. As the name phase lock loop suggests, one of the clocks provided by the PLL is locked to the input clock with the same phase and frequency. A negative feedback loop is used in the PLL, as shown in Fig. 4.18. The PLL is composed of a PFD (phase frequency detector), a charge pump, a loop filter, a VCO (voltage-controlled oscillator), and a differential-to-single-ended conversion circuit. The PFD detects the phase/frequency difference between the input clock and the output clock. The charge pump charges or discharges the capacitance of the loop filter according to the output signal of the PFD. The bias generator provides two proper bias voltages, Vbp and Vbn, which depend on the output voltage of the loop filter, Vctrl. Vbp and Vbn are sent into the VCO, and the output clocks of the VCO change their phase/frequency according to Vbp and Vbn until there is no phase difference between the input clock and the output clock of the VCO. The following subsections describe each circuit block of the designed PLL in detail.

4.5.1 Phase Frequency Detector

Fig. 4.19 shows the schematic diagram of the phase frequency detector. The PFD detects the phase error between the input clock and the feedback clock. Each TSPC-DFF pulls its output high when a rising edge occurs at its input. If the rising edge of the input clock occurs before the rising edge of the feedback clock, the “up” signal is pulled high until the rising edge of the feedback clock occurs and triggers the reset signal, which resets the two TSPC-DFFs. On the other hand, if the rising edge of the feedback clock occurs before the rising edge of the input clock, the “down” signal is pulled high until the rising edge of the input clock occurs and triggers the reset signal, which resets the two TSPC-DFFs. Thus, when there is a phase error between the input clock and the feedback clock, the PFD sends out an “up” or “down” pulse whose width depends on the phase difference between the two clocks. Ideally, the PFD should be able to distinguish any phase error between the input clock and the feedback clock. In practice, when the phase error is too small, the reset occurs so quickly that the following charge pump is not activated, which results in a dead zone, a range of undetectable phase difference. To eliminate the dead zone, a delay buffer is added in the reset path.
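The pulse widths produced by such a PFD can be illustrated with an idealized model. The sketch below assumes one pair of rising edges per comparison and an ideal reset delay that keeps both outputs high for the same extra time; the function and parameter names are hypothetical, and circuit-level effects are ignored.

```python
def pfd_pulses(t_in_ps: int, t_fb_ps: int, reset_delay_ps: int = 0) -> tuple[int, int]:
    """Return (up_width, down_width) in ps for one pair of rising edges.

    t_in_ps and t_fb_ps are the rising-edge times of the input clock and
    the feedback clock. The earlier edge raises its output; the later edge
    triggers the reset, which takes effect after reset_delay_ps.
    """
    lead = t_fb_ps - t_in_ps                 # > 0: input clock edge comes first
    up = max(lead, 0) + reset_delay_ps       # high from the input edge until reset
    down = max(-lead, 0) + reset_delay_ps    # high from the feedback edge until reset
    return up, down

print(pfd_pulses(0, 100))      # input clock leads by 100 ps
print(pfd_pulses(0, 5, 50))    # small phase error with a 50 ps reset delay
```

With a non-zero reset delay both outputs carry a minimum-width pulse even for a very small phase error, which is the purpose of the delay buffer in the reset path; the difference between the two widths still equals the phase error.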

4.5.2 Charge Pump and Loop Filter

Fig. 4.20 shows the schematic diagram of the charge pump [12]. The charge pump charges or discharges the loop filter to control the center frequency of the VCO according to the “up” and “down” signals from the PFD. A second-order on-chip loop filter is designed to suppress the reference [13]. As shown in Fig. 4.21, the loop filter is composed of a resistor R1, a capacitor C1, and a capacitor C2. The loop filter provides a pole at the origin, which gives infinite DC gain and therefore zero static phase error, and a zero in the open loop response, which improves the phase margin and ensures the overall stability of the loop. C2 provides higher-order roll-off to reduce the ripple noise and mitigate frequency jumps. However, C2 also makes the overall PLL system a third-order loop.
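The zero and the extra pole of such a filter can be estimated with a short calculation. The sketch below assumes the common passive topology in which R1 is in series with C1 and that branch is in parallel with C2; Fig. 4.21 is not reproduced here, so the topology is an assumption, and the component values are purely illustrative.

```python
import math

def loop_filter_corners(r1: float, c1: float, c2: float) -> tuple[float, float]:
    """Return (zero_hz, pole_hz) for R1 in series with C1, shunted by C2.

    Impedance: Z(s) = (1 + s*R1*C1) / (s*(C1 + C2)*(1 + s*R1*C1*C2/(C1 + C2))).
    The pole at the origin is implicit; the returned pole is the one added by C2.
    """
    f_zero = 1.0 / (2.0 * math.pi * r1 * c1)
    c_series = c1 * c2 / (c1 + c2)           # C1 and C2 in series
    f_pole = 1.0 / (2.0 * math.pi * r1 * c_series)
    return f_zero, f_pole

# Illustrative values only (not taken from the thesis).
f_zero, f_pole = loop_filter_corners(r1=10e3, c1=100e-12, c2=10e-12)
print(f"zero at {f_zero / 1e3:.1f} kHz, extra pole at {f_pole / 1e3:.1f} kHz")
```

The ratio of the pole to the zero is (C1 + C2) / C2, so a smaller C2 pushes the extra pole further above the zero and leaves more phase margin, at the cost of less ripple filtering.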
