Chapter 2 Clock and Data Recovery Architectures
2.3 Analysis of Loop Performance
2.3.1 Approximated Frequency Response with Loop Filter
In contrast to the approximate analysis above, the other popular method of analyzing a CDR uses the closed-loop transfer function, which is written in Eq. (2.8), with the loop filter transfer function approximated as
[Equation (2.11): relation among ωn, K, ωp and ωz; the equation body is not recoverable from the source text]
The damping factor ξ and the natural frequency ωn characterize the closed-loop response. Figure 2-10 shows the closed-loop frequency response of the CDR for different values of the damping factor, with the frequency axis normalized to the natural frequency. The figure shows that the CDR acts as a low-pass filter for phase noise, passing components at frequencies below ωn. For small values of ξ, the curve is sharper than for large values of ξ. In CDR design, the loop is made overdamped (ξ > 1) to avoid the jitter peaking effect. This also helps increase the phase margin of the open-loop transfer function.
Figure 2-10 The closed-loop frequency response of the CDR (vertical axis: magnitude in dB; horizontal axis: frequency in Hz)
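The dependence of the closed-loop response on the damping factor can be checked numerically. The sketch below is only an illustration: because Eq. (2.8) is not reproduced in this section, it assumes the common second-order form H(s) = (2ξωn·s + ωn²)/(s² + 2ξωn·s + ωn²), and the function names are hypothetical. With this assumed form, the peaking shrinks as ξ grows but does not vanish completely.

```python
import math

def closed_loop_mag_db(w_norm, xi):
    """Magnitude in dB at normalized frequency w_norm = w/wn.

    Assumed transfer function (not taken from the source):
    H(s) = (2*xi*wn*s + wn^2) / (s^2 + 2*xi*wn*s + wn^2)
    """
    s = 1j * w_norm
    h = (2 * xi * s + 1) / (s * s + 2 * xi * s + 1)
    return 20 * math.log10(abs(h))

def peaking_db(xi, points=2000):
    """Largest magnitude over a log sweep from 0.01*wn to 100*wn."""
    freqs = [10 ** (-2 + 4 * k / (points - 1)) for k in range(points)]
    return max(closed_loop_mag_db(w, xi) for w in freqs)

if __name__ == "__main__":
    for xi in (0.3, 0.7, 1.0, 3.0):
        print(f"xi = {xi}: peaking = {peaking_db(xi):.2f} dB")
```

Sweeping ξ in this way gives a quick feel for how much jitter peaking a given damping factor leaves before committing to loop filter values.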
Figure 2-11 shows the transient step response of the CDR for different values of the damping factor, with time normalized to 1/ωn. The step response is generated by instantaneously advancing the phase of the input by one radian and observing the output in the time domain for different damping levels. For a damping factor larger than one, i.e., an overdamped system, the CDR output initially responds rapidly but takes a long time to reach the steady state. We can find that the rate of the initial
CDR design.
Figure 2-11 The closed-loop transient step response of a CDR (vertical axis: S(t); horizontal axis: time, labeled in 10 µs)
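The step-response behavior described above can be reproduced with a short time-domain simulation. This is a minimal sketch under the same assumption as before, namely a second-order closed-loop form H(s) = (2ξωn·s + ωn²)/(s² + 2ξωn·s + ωn²), which is not taken from Eq. (2.8). Time is in units of 1/ωn, the integration is plain forward Euler, and all names are hypothetical.

```python
def step_response(xi, t_end=60.0, dt=1e-3):
    """Unit-step response of the assumed H(s), time in units of 1/wn.

    State-space form: x1' = x2, x2' = u - x1 - 2*xi*x2, y = x1 + 2*xi*x2.
    Returns (times, outputs) as two lists.
    """
    x1 = x2 = 0.0
    times, outputs = [], []
    steps = int(round(t_end / dt))
    for k in range(steps + 1):
        times.append(k * dt)
        outputs.append(x1 + 2 * xi * x2)
        dx1 = x2
        dx2 = 1.0 - x1 - 2 * xi * x2  # u = 1 (unit phase step)
        x1 += dt * dx1
        x2 += dt * dx2
    return times, outputs

def time_to_reach(level, xi):
    """First time at which the output reaches the given level, or None."""
    times, outputs = step_response(xi)
    for t, y in zip(times, outputs):
        if y >= level:
            return t
    return None

if __name__ == "__main__":
    for xi in (0.5, 1.0, 3.0):
        print(f"xi = {xi}: reaches 90% at t = {time_to_reach(0.9, xi):.2f} / wn")
```

In this model the overdamped case reaches 90% of the step sooner than the underdamped one, and then approaches the final value through a slow exponential tail.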
Chapter 3
10 Gb/s Clock and Data Recovery Circuit Design
3.1 Introduction
This chapter discusses the circuit design in more detail at the transistor level. The design method can be applied to a CDR with an input data rate of 10 Gb/s. The process and models used for the circuit design are those of the TSMC 0.18 µm 1P6M CMOS process. We simulate the CDR with HSPICE to obtain its detailed electrical behavior.
Moreover, process variations and temperature effects must be taken into account. In addition to the normal-temperature, typical-process case, we must simulate the CDR at high temperature and at the slow and fast process corners. After that, a post-layout simulation that includes the parasitic resistances and capacitances of the circuit must be performed.
3.2 Circuit Description
For the CDR circuit to handle high-frequency signals, the circuit must have a fast switching speed. ICs operating at speeds greater than 10 Gb/s usually use GaAs MESFETs, GaAs HBTs, or Si BiCMOS transistors. The power consumption of these processes, however, is relatively large because their supply voltage is high and their
and highly integrated for low cost. CMOS transistors have the advantages of low power consumption and low cost, but they have still rarely been used in high-speed systems because their operation speed is too low.
3.2.1 High speed MCML Latch
Generally, a conventional CMOS inverter exhibits some drawbacks that prevent it from being widely used in high-speed, low-voltage circuits. First, a CMOS inverter is essentially a single-ended circuit. In the multi-gigahertz frequency range, even short on-chip wires act as coupled transmission lines. The resulting electromagnetic coupling can cause serious malfunctions, particularly in single-ended circuits. Besides, the pMOS transistor in a static CMOS inverter severely limits the maximum operating frequency of the circuit. For the circuit to operate correctly in the 10 GHz range, we use MOS current-mode logic (MCML) in place of conventional CMOS logic. MCML circuits can operate with a lower signal voltage and a higher operating frequency at a lower supply voltage than static CMOS circuits. MCML has been used extensively to implement ultrahigh-speed buffers [13], [14], latches [14], multiplexers and demultiplexers [15], and frequency dividers [16].
Figure 3-1 shows inverters implemented in CMOS logic and in conventional MCML. CMOS logic has the advantage of low power consumption, but its operation is relatively slow. For example, the maximum toggle frequency of a conventional 0.18 μm CMOS inverter is only about 3.5 GHz. The power consumption of CMOS logic is the product of the operating frequency and the energy spent charging and discharging per switching event. On the other hand, the power consumption of the MCML is determined by the drain current of the current-source transistor MNb. Therefore, the power consumption of the MCML is nearly independent of the operating frequency. Because CMOS logic uses power only when charging and discharging, its power consumption is generally smaller than that of the MCML. However, in the gigahertz frequency range, the power consumption of CMOS logic becomes larger than that of the MCML, as shown in Figure 3-2 [15]. This means that the MCML is more suitable for low-power operation in the gigahertz frequency range.
Figure 3-1 Inverter circuits of the (a) CMOS logic and (b) MCML
Figure 3-2 Power consumption of the MCML and CMOS logic
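The crossover between CMOS and MCML power can be illustrated with first-order models. The sketch below assumes that CMOS dynamic power follows α·C·VDD²·f and that MCML power is simply VDD times the tail current. Short-circuit and leakage currents are ignored, and the component values are hypothetical rather than taken from Figure 3-2.

```python
def cmos_dynamic_power(c_load, vdd, freq, activity=1.0):
    """First-order CMOS switching power: activity * C * VDD^2 * f (watts)."""
    return activity * c_load * vdd ** 2 * freq

def mcml_static_power(vdd, i_tail):
    """MCML power: supply voltage times the constant tail current (watts)."""
    return vdd * i_tail

def crossover_frequency(c_load, vdd, i_tail, activity=1.0):
    """Frequency above which the CMOS gate burns more power than the MCML gate."""
    return i_tail / (activity * c_load * vdd)

if __name__ == "__main__":
    # Hypothetical values: 50 fF load, 1.8 V supply, 200 uA tail current.
    c_load, vdd, i_tail = 50e-15, 1.8, 200e-6
    f_x = crossover_frequency(c_load, vdd, i_tail)
    print(f"crossover near {f_x / 1e9:.2f} GHz")
    for f in (0.5e9, 2e9, 10e9):
        p_cmos = cmos_dynamic_power(c_load, vdd, f)
        p_mcml = mcml_static_power(vdd, i_tail)
        print(f"{f / 1e9:5.1f} GHz: CMOS {p_cmos * 1e6:7.1f} uW, MCML {p_mcml * 1e6:7.1f} uW")
```

With these assumed numbers the two curves cross at a few gigahertz, which matches the qualitative trend described for Figure 3-2.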
An MCML latch consists of an input tracking stage, MN1 and MN2, utilized to
MN4, being employed to store the data. Figure 3-3 shows a CMOS CML latch circuit.
Figure 3-3 Circuit schematic of a CMOS CML latch
The track and latch modes are determined by the clock signals applied to a second differential pair, MN5 and MN6. When the signal CK is “high”, the tail current Iss flows entirely to the tracking stage, MN1 and MN2, thereby allowing Vout to track Vin. In the latch mode, the signal CK goes low and the tracking stage is disabled, whereas the latch pair is enabled and stores the logic state at the output.
To achieve the best performance in an MCML latch, complete current switching must take place, so that the tail current flows through the ON branch only. The single-ended output voltage swing of the latch is then RD⋅ISS, where RD is the equivalent resistance of MP1 or MP2 operating in the linear region. The value of the load resistance depends on both the tail current and the required voltage swing.
Reducing the load resistance RD while increasing the tail current Iss is one way to shorten the transition time without changing the output voltage swing, but increasing the tail current also increases the power consumption. The differential-pair transistors (MN1-MN2 and MN3-MN4) must be made larger as the tail current increases, and also as the single-ended voltage swing is reduced. The sizes of the other transistors (MN5, MN6 and MNb) depend on the tail current only. For high-speed switching between MN1-MN2 and MN5-MN6, we would like the transistors to be as large as possible. On the other hand, increasing the transistor size increases the parasitic capacitance, which slows the operation of the circuit. We must therefore find the best balance between transistor size and parasitic capacitance for circuit performance.
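The trade-off among swing, tail current, speed, and power can be put into numbers with a first-order model. The following sketch assumes that the output settles with a single time constant RD·Cout and that the power is VDD·ISS. The output capacitance is held fixed, which ignores the growth of parasitic capacitance with transistor size noted above. All values and names are hypothetical.

```python
def load_resistance(swing, i_tail):
    """Load resistance needed for a single-ended swing of RD * ISS (ohms)."""
    return swing / i_tail

def latch_operating_point(swing, i_tail, c_out, vdd=1.8):
    """First-order figures for one latch: load resistance, RC time constant, power."""
    r_d = load_resistance(swing, i_tail)
    return {
        "r_d": r_d,              # ohms
        "tau": r_d * c_out,      # seconds, assuming a fixed output capacitance
        "power": vdd * i_tail,   # watts
    }

if __name__ == "__main__":
    # Hypothetical design point: 0.4 V swing, 50 fF output node capacitance.
    for i_tail in (1e-3, 2e-3):
        op = latch_operating_point(0.4, i_tail, 50e-15)
        print(f"ISS = {i_tail * 1e3:.0f} mA: RD = {op['r_d']:.0f} ohm, "
              f"tau = {op['tau'] * 1e12:.0f} ps, P = {op['power'] * 1e3:.1f} mW")
```

Doubling the tail current at constant swing halves both RD and the time constant in this model, at the cost of twice the power.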
Unfortunately, the transient output of the MCML latch exhibits serious level fluctuation, as shown in Figure 3-4 for a 10 Gb/s input data stream. The fluctuation is caused by the clock transitions. During each transition from the sampling mode (CK high) to the latching mode (CK low), the tail current of the cross-coupled pair must first recharge the capacitances of the cross-coupled pair as it starts drawing current from the output nodes and changing the logic state. Consequently, current spikes appear at the output nodes of the MCML latch, producing large output fluctuations that can cause operation failure in high-speed applications. In latch design, this output fluctuation becomes more serious as the voltage swing is lowered, and it is one of the barriers to high-speed gate design [17][18].
3.2.2 Improved High speed MCML Latch
As described in the previous section, the MCML latch has serious output fluctuation. It comes from the variation of the tail current density with the level changes of the clock and input data signals, even though the output nodes should keep their levels unchanged. This type of latch cannot avoid it. To reduce this problem, we propose a simple approach to improve the MCML latch. The proposed circuit is shown in Figure 3-5. We add an nMOS transistor, MN7, as a separate current source for the MN3-MN4 cross-coupled pair only. The added MN7 keeps the cross-coupled pair always on, so when the latch changes from the sample mode to the latch mode, the tail current Iss does not need to recharge the capacitances of the cross-coupled pair. This reduces the current spiking at the output nodes and hence the fluctuation. Because the cross-coupled pair is always on, the sample pair, MN1-MN2, must carry a larger current than ID to change the state of the cross-coupled pair and track the input data in the sample mode. Figure 3-6 shows the simulation results of the improved MCML latch with a 10 Gb/s input data stream.
Figure 3-5 The improved MCML latch
Figure 3-6 Voltage waveforms at the output of the improved MCML Latch
3.2.3 Alexander Phase Detector
The Alexander phase detector is composed of four D-type Flip-Flops and two XOR gates.
3.2.3.1 D-type Flip-Flop
As described in the previous chapter, an Alexander phase detector uses D Flip-Flops to sample the input data or to delay the sampling result by a full or half clock period. For correct operation in the 10 GHz range, we use the improved MCML architecture proposed in the last section to realize the high-speed D Flip-Flop. Figure 3-7 shows the Master-Slave D-type Flip-Flop. When the clock is “high”, the master latch holds the sampling result for half a clock period and the slave latch samples the master output; any change at the input nodes cannot influence the output. When the clock is “low”, the slave latch holds the sampled state for another half clock period. The result sampled at the rising clock edge is therefore held for one clock period. Because the CDR input data is a random binary sequence, we apply a 10 Gb/s 2^7-1 PRBS pattern to the D Flip-Flop, with a clock signal that is locked to the input pattern. The performance is shown in Figure 3-8(a) and (b). We plot eye diagrams of the D Flip-Flop output to compare the performance of the MCML latch and the improved MCML latch more clearly.
Table 3-1 shows the post-simulation results of the two types of D Flip-Flops.
Figure 3-7 The Master-Slave D Flip-Flop
Figure 3-8(a) The Eye diagram of the D Flip-Flop
Figure 3-8(b) The Eye diagram of the D Flip-Flop with improved MCML latch
Parameter             MCML D Flip-Flop    Improved MCML D Flip-Flop    Improved performance
Jitter (pk-pk)        3.1 ps              0.7 ps                       343%
Eye opening           368 mV              488 mV                       33%
Output fluctuation    137 mV              71 mV                        93%
Table 3-1 Post-simulation results of the two types of D Flip-Flops
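The test setup for the flip-flop, a 2^7-1 PRBS pattern clocked through a master-slave stage, can be mimicked at the behavioral level. The sketch below is purely logical and says nothing about jitter or eye opening. The generator polynomial x^7 + x^6 + 1 is an assumption, since the source does not state which PRBS7 polynomial was used, and the class and function names are hypothetical.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a 2^7-1 PRBS using the assumed polynomial x^7 + x^6 + 1."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits

class MasterSlaveDFF:
    """Behavioral master-slave flip-flop: master tracks while clk is low,
    slave copies the master while clk is high."""

    def __init__(self):
        self.master = 0
        self.slave = 0

    def update(self, d, clk):
        if clk:
            self.slave = self.master  # master holds, slave samples master
        else:
            self.master = d           # master tracks input, slave holds
        return self.slave

def clock_through(dff, bits):
    """Apply one bit per clock period (low phase, then high phase)."""
    out = []
    for d in bits:
        dff.update(d, clk=0)
        out.append(dff.update(d, clk=1))
    return out

if __name__ == "__main__":
    pattern = prbs7(254)
    print("period is 127:", pattern[:127] == pattern[127:])
    print("output follows input:", clock_through(MasterSlaveDFF(), pattern) == pattern)
```

A model like this is useful as a reference when checking that a transistor-level simulation produces the expected bit sequence.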
3.2.3.2 XOR gate
Figure 3-9 shows the common schematic of the MCML XOR gate [19]. The circuit has a structure similar to that of the latch. However, it has two symmetric differential pairs, MN1-MN2 and MN3-MN4, which are both driven by the same input signal pair, A and Ā. When A and B have the same logic level, either both high or both low, either MN1-MN5 or MN3-MN6 turns on. Consequently, the output node Q goes logically low. When the two input signals have different logic levels, one high and the other low, either MN2-MN5 or MN4-MN6 turns on, and the output node Q̄ goes logically low.
It is important to note that the XOR gates in Figure 2-3 must present a symmetric load to their two inputs. Otherwise, differences in propagation delay result in systematic phase offsets. To reduce the effect of load imbalance, each of the XOR gates is implemented as shown in Figure 3-10 [20]. The circuit does not use stacked stages, so it provides perfect symmetry between the two inputs. The output is single-ended, but the single-ended “early” and “late” signals produced by the two XOR gates in the phase detector are sensed with respect to each other, thus acting as a differential drive for the voltage-to-current converter. The XOR circuit operates as follows. We set Vref at the output common-mode level of the D Flip-Flop preceding the XOR gate. If the two inputs are identical, one of the tail currents flows through the transistor MP and the output voltage is low. If the two logical inputs are not equal, then one of the input transistors on the left and one of the input transistors on the right turn on, thus turning the transistor MP off, and the output voltage is high.
Figure 3-10 Symmetric XOR gate
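How the two XOR outputs form an early/late decision can be sketched at the logic level. The model below follows the usual description of an Alexander phase detector, which compares a data sample, the following edge sample, and the next data sample. The sample names and the sign convention (+1 for clock early, -1 for clock late) are assumptions, since Figure 2-3 is not reproduced in this section.

```python
def alexander_decision(prev_bit, edge_sample, curr_bit):
    """Return +1 (clock early), -1 (clock late), or 0 (no usable transition).

    prev_bit    : data sampled at the previous clock edge
    edge_sample : sample taken midway, nominally at the data transition
    curr_bit    : data sampled at the current clock edge
    """
    xor_a = prev_bit ^ edge_sample   # high when the edge sample differs from the old bit
    xor_b = edge_sample ^ curr_bit   # high when the edge sample differs from the new bit
    if xor_a == xor_b:
        return 0                     # no data transition, or an inconsistent sample set
    # Edge sample still equals the old bit: the clock sampled before the transition.
    return 1 if xor_b else -1

if __name__ == "__main__":
    for samples in [(0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 0, 1)]:
        print(samples, "->", alexander_decision(*samples))
```

Because the decision depends on the difference between the two XOR outputs, any delay mismatch between their input paths shifts the point where the outputs balance, which is why the symmetric XOR matters.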
Figure 3-11 shows the post-simulation transfer function of the Alexander phase detector built from the structures proposed above. It is obtained by taking the difference between the lead and lag signals at a 10 Gb/s input data rate. The PD output voltage crosses zero at a phase difference approximately 6.5 ps (0.13π) from the metastable point, indicating that the systematic offset between the data and the clock is very small.
Figure 3-11 The transfer function of the Alexander PD (vertical axis: PD output voltage in mV; horizontal axis: phase difference in units of π)
3.2.4 V/I converter and Loop filter
Figure 3-12 shows the V/I converter and the loop filter. The V/I converter compares the output voltages of the two XOR gates of the Alexander phase detector and converts the voltage difference into an output current. The output current charges or discharges the loop filter to tune the VCO control voltage. Unlike charge pumps, V/I converters need not switch after every phase comparison, so they do not suffer from the dead-zone issue. Without stacked switch transistors, the V/I converter can provide nearly rail-to-rail voltage swings on the oscillator control line [20][21].
Although the Bang-Bang CDR loop is in general a nonlinear time-variant system, it can be assumed linear if the phase error is small. The design of the loop filter is based on a linear time-invariant model of the loop and is performed in the continuous time domain.
Figure 3-12 V/I converter and loop filter
3.2.5 VCO
A voltage-controlled oscillator (VCO) is the most sensitive building block in a CDR as far as supply and substrate noise are concerned. Therefore, careful design is needed in order to reduce noise and frequency drift. Although a ring oscillator has a wide tuning range and integrates well with a digital CMOS process, it cannot achieve high-frequency operation at 10 GHz or higher. To achieve high-frequency operation, we use an LC-tank oscillator. The LC-tank oscillator offers excellent phase noise performance at low power consumption because of its relatively high quality factor. In the past, this high-speed component was realized in expensive technologies such as GaAs, SiGe or bipolar. Now the low cost of CMOS technology, owing to its maturity and high integration density, has pushed designers to realize low-noise VCOs for CMOS systems on chip [1][22].
As shown in Figure 3-13 [23], the complementary cross-coupled differential LC structure is used to realize a fully integrated, low-noise, low-power oscillator in the 10 GHz range under a 1.8 V supply voltage. Fully differential operation provides complementary outputs. Using both pMOS and nMOS cross-coupled pairs offers a higher transconductance for a given current, which results in lower power and faster switching of the cross-coupled differential pair [1][24].
Figure 3-13 The complementary cross-coupled differential LC VCO
The oscillation frequency of LC topologies is equal to $f_{osc} = \frac{1}{2\pi\sqrt{LC}}$, suggesting that only the inductor and capacitor values can be varied to tune the frequency, while other parameters such as bias currents and transistor transconductances have a negligible effect. Since it is difficult to vary the value of monolithic inductors, we simply change the tank capacitance to tune the oscillator. The tunable capacitor is called a “varactor”. In circuit design, a reverse-biased pn junction or a MOSFET can serve as a varactor. The MOSFET varactor suffers from a large source-drain resistance in the vicinity of minimum capacitance due to the low carrier concentration in the
varactor” resolves these difficulties. Figure 3-14 shows the structures of the pn-junction and accumulation-mode MOS varactors [1].
Figure 3-14 Structure of varactor (a) pn junction (b) accumulation-mode MOS
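The LC resonance relation f_osc = 1/(2π√(LC)) makes it easy to estimate the tank capacitance and the tuning range obtained from a varactor. The sketch below is only an illustration. The inductance, fixed capacitance, and varactor limits are hypothetical values, not taken from the design, and the 10.71 GHz target is used only as an example operating point.

```python
import math

def lc_frequency(inductance, capacitance):
    """Resonance frequency of an ideal LC tank: 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))

def capacitance_for(freq, inductance):
    """Total tank capacitance needed to resonate at freq with a given L."""
    return 1.0 / ((2.0 * math.pi * freq) ** 2 * inductance)

def tuning_range(inductance, c_fixed, c_var_min, c_var_max):
    """(f_min, f_max) when the varactor sweeps between its two limits.

    c_fixed lumps together the parasitic and load capacitances that do not tune.
    """
    f_min = lc_frequency(inductance, c_fixed + c_var_max)
    f_max = lc_frequency(inductance, c_fixed + c_var_min)
    return f_min, f_max

if __name__ == "__main__":
    l_tank = 0.5e-9  # hypothetical 0.5 nH inductor
    c_total = capacitance_for(10.71e9, l_tank)
    print(f"total C for 10.71 GHz: {c_total * 1e15:.0f} fF")
    f_lo, f_hi = tuning_range(l_tank, 0.3e-12, 0.1e-12, 0.25e-12)
    print(f"tuning range: {f_lo / 1e9:.2f} to {f_hi / 1e9:.2f} GHz")
```

Because the frequency depends on the square root of the total capacitance, the fixed parasitic capacitance dilutes the varactor ratio and narrows the achievable tuning range.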
While used in both bipolar and CMOS technologies, pn-junction varactors become less attractive in low-supply-voltage VCO design for two reasons. First, pn junctions suffer from a limited tuning range that trades off with nonlinearity in the C-V characteristic. The junction capacitance can be expressed as
$C_{var} = \frac{C_o}{(1 + V_R/\Phi_B)^m}$
where Co is the zero-bias value, VR the reverse-bias voltage, ΦB the built-in potential of the junction, and m a value typically between 0.3 and 0.4. At low supply voltages VR has a very limited range, yielding a small range for Cvar and hence for fosc. The capacitance varies slowly under reverse bias and sharply under forward bias, thereby introducing significant nonlinearity in the VCO characteristic. Second, at low supply voltages, it becomes increasingly difficult to select the oscillator common-mode level and signal swings so as to avoid forward biasing the diodes. The accumulation-mode MOS varactor does not exhibit the above shortcomings. Its C-V characteristic is illustrated in Figure 3-15. The MOS varactor should operate with
positive and negative biases so as to provide maximum dynamic range, Cmax/Cmin, of 2.5 to 3 with . This device comfortably tolerates both positive and negative voltages, allowing large VCO swings.
Figure 3-15 The C-V characteristic of accumulation-mode MOS varactor
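The limited tuning range of a pn-junction varactor at low supply voltage can be estimated from the standard junction capacitance expression C = Co/(1 + VR/ΦB)^m. The sketch below is only a rough illustration. The built-in potential of 0.7 V and the exponent of 0.35 are assumed values within the typical range, and the function names are hypothetical.

```python
def junction_capacitance(c_zero, v_reverse, phi_b=0.7, m=0.35):
    """Reverse-biased pn-junction capacitance: Co / (1 + VR/phi_B)^m.

    phi_b and m are assumed typical values, not extracted from a process.
    """
    return c_zero / (1.0 + v_reverse / phi_b) ** m

def capacitance_ratio(v_reverse_max, phi_b=0.7, m=0.35):
    """Cmax/Cmin when the reverse bias sweeps from 0 V to v_reverse_max."""
    c_max = junction_capacitance(1.0, 0.0, phi_b, m)
    c_min = junction_capacitance(1.0, v_reverse_max, phi_b, m)
    return c_max / c_min

if __name__ == "__main__":
    for v_max in (1.0, 1.8, 3.3):
        ratio = capacitance_ratio(v_max)
        print(f"VR up to {v_max} V: Cmax/Cmin = {ratio:.2f}, "
              f"ideal frequency ratio = {ratio ** 0.5:.2f}")
```

With a full 1.8 V of reverse bias, this model gives a capacitance ratio of about 1.6, well below the 2.5 to 3 quoted for the accumulation-mode MOS varactor.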
Monolithic inductors are typically realized as spiral structures. The mutual coupling between every two turns results in a relatively large inductance per unit area.
Figure 3-16(a) shows the symmetric spiral structure and Figure 3-16(b) shows the equivalent circuit model [25][1].
Figure 3-16 (a) Symmetric spiral inductor (b) circuit model of inductor
The definition of each parameter is listed below:
L : inductance
Rdc : metal series resistance
Cs : overlap capacitance between the spiral and the center-tap underpass
Cox : oxide capacitance between the spiral and substrate
Rsub : silicon substrate resistance
Csub : silicon substrate capacitance
For a given inductance, different combinations of line width, number of turns, and outer dimension can be used, leading to a large design space. However, the dc resistance of the inductor, Rdc, often constrains the choice of these values. In particular, the line must be sufficiently wide so that Rdc does not significantly limit the Q.
Nevertheless, increasing the line width W yields a greater area and a larger capacitance for the inductor, both of which increase the inductor loss. At high frequency, the series resistance of the wire is also strongly influenced by the skin effect. Interestingly, the resulting nonuniform current distribution also changes the inductance value, because the area and hence the magnetic flux enclosed by each turn change [25].
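Two quick estimates help when judging how the series resistance limits Q: the skin depth δ = √(ρ/(π·f·μ)) and the series-loss quality factor Q = ωL/R. The sketch below uses the bulk resistivity of aluminum as an assumed value, which will differ from the actual interconnect metal, and a hypothetical inductance and resistance. Substrate losses and the capacitances of the inductor model are ignored.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, H/m

def skin_depth(resistivity, freq, mu=MU_0):
    """Skin depth in metres: sqrt(rho / (pi * f * mu))."""
    return math.sqrt(resistivity / (math.pi * freq * mu))

def series_q(inductance, resistance, freq):
    """Quality factor set by series resistance alone: omega * L / R."""
    return 2.0 * math.pi * freq * inductance / resistance

if __name__ == "__main__":
    rho_al = 2.65e-8  # ohm*m, bulk aluminum (assumed value)
    delta = skin_depth(rho_al, 10e9)
    print(f"skin depth at 10 GHz: {delta * 1e6:.2f} um")
    # Hypothetical 0.5 nH inductor with 2 ohm effective series resistance.
    print(f"series-limited Q at 10 GHz: {series_q(0.5e-9, 2.0, 10e9):.1f}")
```

With the skin depth under a micrometre at 10 GHz, widening or thickening the line beyond a few skin depths gives diminishing returns in resistance while the capacitance keeps growing.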
At high frequency, the passive components are very important because they form the core of the VCO, the resonant tank. High quality factors and low parasitics are necessary for both the inductor and the varactor with respect to phase noise, tuning range and power consumption. Thus, accurate modeling, especially at frequencies above 10 GHz, may require electromagnetic field simulations. Because we have no experience in making on-chip passive components, we use the TSMC 0.18 μm RFIC 1P6M+ process, which provides the electrical characteristics of the spiral inductor and varactor for design reference. The operation speed follows the synchronous optical network (SONET) OC-192 standard at the forward error-correction (FEC) bit rate of 10.71 Gb/s [26]. The post-layout simulated transfer curve of the VCO is shown in Figure 3-17. Table 3-2 summarizes the simulation results.