
Chapter 4 Serial ATA Transmitter Circuit Design

4.2 Transmitter Design Issues

Fig. 4.1 Transceiver architecture

Fig. 4.1 shows the high-speed link transceiver architecture. The chip package interface, termination resistance and cable are also included in the figure. Cin is the internal pad parasitic capacitance and Cout is the loading of the PCB and package. L is the bonding wire between the chip and the PCB. The characteristic impedance of the cable equals Rterm and is denoted as R. In our circuit, we use typical values of Rterm=R=50Ω, Cin=0.5pF, Cout=1pF and L=2nH for simulation. The frequency response of the package is shown in Fig. 4.2; its f-3dB is 5GHz. In this dissertation, the RX and CDR are not included in the simulation. Fig. 4.3 shows the frequency response of the coaxial cable obtained from the HSPICE model and from measurement. We use RG233/U as the coaxial cable, whose characteristic impedance is 50Ω.

Fig. 4.2 Frequency response of package

Fig. 4.3 Frequency response of coaxial cable

4.3 Circuit Design and Simulation Results

This section describes the circuit design of the proposed transmitter and its simulation results. The simulation results include the package parasitics, with f-3dB = 5GHz.

4.3.1 Input Data

The input data comes from a PRBS or a K28.5 pattern. The PRBS generator is built on chip and generates a pseudo-random pattern automatically using the register and XOR circuit shown in Fig. 4.4. In order to implement ten parallel inputs, a 2^10-1 data pattern is generated with 10 registers and XOR circuits. We design the ten-bit PRBS encoder with each output running at 600MHz to obtain a system operating speed of 6Gbps. The control signal Vsela provides a pulse (logic one) to restart the circuit and generate the parallel data. A 600MHz clock is also needed to trigger the registers. The simulation result is shown in Fig. 4.5.


Fig. 4.4 10-bit PRBS encoder
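The register-and-XOR structure described above can be sketched in software as a 10-bit linear feedback shift register. This is only an illustrative model: the feedback taps (stages 10 and 7), the seed, and the function names are assumptions, since the text does not state which taps the on-chip encoder uses. Any maximal-length tap choice gives the same 2^10-1 = 1023-bit period.

```python
from typing import List

MASK_10BIT = 0x3FF


def lfsr_step(state: int) -> int:
    """Advance a 10-bit Fibonacci LFSR by one clock (assumed taps: stages 10 and 7)."""
    feedback = ((state >> 9) ^ (state >> 6)) & 1  # XOR of stage 10 and stage 7
    return ((state << 1) | feedback) & MASK_10BIT


def prbs10(seed: int = MASK_10BIT, nbits: int = 1023) -> List[int]:
    """Return nbits of the serial PRBS, taken from stage 10 of the register."""
    if not 0 < seed <= MASK_10BIT:
        raise ValueError("seed must be a non-zero 10-bit value")
    state, out = seed, []
    for _ in range(nbits):
        out.append((state >> 9) & 1)
        state = lfsr_step(state)
    return out


def lfsr_period(seed: int = MASK_10BIT) -> int:
    """Count clocks until the register returns to its seed state."""
    state, count = lfsr_step(seed), 1
    while state != seed:
        state = lfsr_step(state)
        count += 1
    return count


def to_parallel_words(bits: List[int], width: int = 10) -> List[List[int]]:
    """Group the serial stream into width-bit words (ten 600MHz lanes -> 6Gbps)."""
    return [bits[i:i + width] for i in range(0, len(bits) - width + 1, width)]


if __name__ == "__main__":
    stream = prbs10()
    print("period:", lfsr_period())
    print("ones in one period:", sum(stream))
    print("first parallel word:", to_parallel_words(stream)[0])
```

Grouping the serial stream into ten-bit words mirrors the ten parallel outputs Pin1 to Pin10, each updated once per 600MHz clock.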

The K28.5 is a worst-case regular pattern used to verify whether the circuit transmits the data correctly and whether the RX eye meets the SATA specification. The K28.5 pattern has two sequences (alternating K28.5+ and K28.5-): the positive disparity (0011111010) and the negative disparity (1100000101).

Together, these two sequences form the symbol stream 00111110101100000101. This stream contains five consecutive 1's and five consecutive 0's (the longest DC runs). It also contains an isolated 1 (010) and an isolated 0 (101), which are the highest-speed AC transitions. We implement this generator with an XOR circuit. With VDD/GND inputs and a 300MHz clock, we can generate the positive or negative disparity. The simulation result is shown in Fig. 4.6.
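The properties claimed for the K28.5 stream can be checked with a few lines of code. The sketch below is a software check only, not the XOR-based generator itself; the helper name `run_lengths` is hypothetical. It also shows that a 20-bit sequence sent at 6Gbps repeats at 6Gbps / 20 = 300MHz, the same figure as the clock mentioned in the text. Whether the on-chip generator uses the clock in exactly this way is an assumption.

```python
from itertools import groupby

K28_5_POS = "0011111010"  # positive disparity, as given in the text
K28_5_NEG = "1100000101"  # negative disparity, as given in the text


def run_lengths(bits: str):
    """Return (bit, length) for each run of identical consecutive bits."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


pattern = K28_5_POS + K28_5_NEG
runs = run_lengths(pattern)
longest_run = max(length for _, length in runs)
has_isolated_one = "010" in pattern
has_isolated_zero = "101" in pattern
repeat_rate_hz = 6e9 / len(pattern)  # 20 bits per repetition at 6Gbps

if __name__ == "__main__":
    print("pattern:", pattern)
    print("runs:", runs)
    print("longest run:", longest_run)
    print("isolated 1 / isolated 0 present:", has_isolated_one, has_isolated_zero)
    print("repetition rate (MHz):", repeat_rate_hz / 1e6)
```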


Fig. 4.5 Simulation waveform with 10-bit PRBS encoder

Fig. 4.6 Simulation waveform with K28.5 input pattern (Tx_out voltage (V) versus time (ns); bit sequence 00111110101100000101, shown twice)


Fig. 4.7 Simulation waveform with K28.5 and PRBS input patterns

We simulate both kinds of input pattern, as shown in Fig. 4.7, and observe the data at the transmitter output to verify that the K28.5 pattern is correctly sampled and transferred. The figure also shows the single-ended and differential serial data.

The first 9ns is the K28.5 pattern, after which we switch to the PRBS pattern. K28.5 is a fixed, repeating pattern, whereas the PRBS is a pseudo-random input pattern. We can select either of these two input patterns at any time.

4.3.2 Synchronizer

In order to sample each of the five parallel data bits, the data must be skewed to fit the sampling time margin of the PISO circuit. The synchronization circuit is therefore designed with registers driven by multi-phase clocks, as shown in Fig. 4.8. In our design, we use True Single-Phase Clock (TSPC) type DFFs as the registers to increase the speed when implementing the skewed data. The registers are triggered by the five 1.2GHz clock phases, and the phase offset is 166ps. The input data B1~B5 are sampled and stored in the registers, and then sequentially transferred into the PISO circuit with different clock phases. Considering the setup and hold times of the registers, we use clk1 to sample the data in all first-stage registers, and the second-stage registers sample the data with successive clocks. Thus, we choose clk3, clk4, clk5, clk1 and clk4 so that the registers can finish the operation within the correct period interval and the data can be sampled correctly, as shown in Fig. 3.3.

Fig. 4.8 5-bit data synchronizer
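The timing numbers in this subsection follow from simple arithmetic, sketched below. Five phases of a 1.2GHz clock are spaced one fifth of a period apart, which equals one bit time at 6Gbps. The mapping of B1~B5 to launch slots starting at t = 0 is a hypothetical simplification for illustration; the actual phase assignment is the one given in the text and in Fig. 4.8.

```python
DATA_RATE_BPS = 6e9   # serial data rate from the text
CLOCK_HZ = 1.2e9      # multi-phase clock frequency from the text
N_PHASES = 5          # number of clock phases from the text

bit_time_ps = 1e12 / DATA_RATE_BPS           # one unit interval
clock_period_ps = 1e12 / CLOCK_HZ            # one 1.2GHz clock period
phase_step_ps = clock_period_ps / N_PHASES   # spacing between adjacent phases

# Hypothetical slot assignment: bit B(i+1) leaves the PISO i phase steps
# after B1. This is an assumption, not the clk selection used on chip.
launch_times_ps = {f"B{i + 1}": i * phase_step_ps for i in range(N_PHASES)}

if __name__ == "__main__":
    print(f"bit time      : {bit_time_ps:.1f} ps")
    print(f"clock period  : {clock_period_ps:.1f} ps")
    print(f"phase spacing : {phase_step_ps:.1f} ps")
    for name, t in launch_times_ps.items():
        print(f"{name} launches at {t:.1f} ps")
```

The phase spacing evaluates to about 166.7ps, which the text rounds to 166ps, and half of it is the 83ps half-bit delay used later for pre-emphasis.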

4.3.3 PISO

The PISO architecture is shown in Fig. 4.9, and each logic driver circuit is described in Fig. 4.10. The PISO is composed of five pairs of differential paths. Each path has a PMOS that pre-charges the internal node to a high level before a logic high is sent out, and an active-load NMOS that enhances the circuit bandwidth. For example, we employ the difference between clock phase1 and phase2b to transmit one branch of data, as shown in our logic driver. However, the driver may suffer from the charge sharing effect shown in Fig. 4.10. The effect has two sources. The first one pulls down the output signal level when M1 is turned on. The other one will

turned on. To reduce these two effects, we add clk5 to control the turn-on of M2. In this way, M2 and M1 turn on simultaneously, so the two effects cancel each other and the serial data can be correctly transmitted into the CML driver. Fig. 4.11 shows the simulation results of the output driver without and with charge sharing compensation.

Fig. 4.9 PISO circuit

Fig. 4.10 Logic drive block of PISO

Fig. 4.11 Charge sharing effect [14] (a) without charge compensation (b) with charge compensation

4.3.4 Pre-emphasis and Data Driver

Fig. 4.12 shows our pre-emphasis architecture. Depending on the data rate, we can select half-bit-time or one-bit-time pre-emphasis with the CML MUX circuit. For the half-bit delay time of 83ps, we generate the delay by using two inverters. This pre-emphasis amount is fixed for different data rates, so we can guarantee that the transmitter circuit will not transmit data with excessive pre-emphasis. For the one-bit delay time of 166ps, we design a replica circuit that contains a PISO and a tap buffer. When the circuit operates at a high data rate and we want the best possible eye diagram at the receiver end, we use the CML MUX to select this mode to meet the SATA specification.

Fig. 4.12 Pre-emphasis and data driver architecture
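The difference between half-bit-time and one-bit-time pre-emphasis can be illustrated with an idealized two-tap model, in which the output is the main signal minus a scaled, delayed copy of itself. This is a behavioural sketch only: the tap weight `alpha`, the ideal ±1 signal levels, and the two-samples-per-bit resolution are assumptions for illustration and are not taken from the circuit.

```python
from typing import List

SAMPLES_PER_BIT = 2  # assumed resolution: one sample per half bit time


def pre_emphasis(bits: List[int], alpha: float, delay_samples: int) -> List[float]:
    """Ideal two-tap pre-emphasis: y[n] = x[n] - alpha * x[n - delay_samples]."""
    x = [1.0 if b else -1.0 for b in bits for _ in range(SAMPLES_PER_BIT)]
    y = []
    for n, level in enumerate(x):
        # Before the first delayed sample exists, assume the line held its level.
        delayed = x[n - delay_samples] if n >= delay_samples else level
        y.append(level - alpha * delayed)
    return y


bits = [0, 0, 1, 1, 1, 0]
alpha = 0.25  # hypothetical tap weight

y_half_bit = pre_emphasis(bits, alpha, delay_samples=1)  # half-bit-time delay
y_one_bit = pre_emphasis(bits, alpha, delay_samples=2)   # one-bit-time delay

if __name__ == "__main__":
    print("half-bit delay:", y_half_bit)
    print("one-bit delay :", y_one_bit)
```

In this model the boosted level after a transition lasts for half a bit with the half-bit delay and for a full bit with the one-bit delay, while repeated bits settle to the reduced level in both cases.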

FDT pre-emphasis: We implement this method by inserting two CML buffers in the pre-emphasis circuit. Each buffer has a 30ps delay time, as shown in Fig. 4.13 (a) (b).

In order to realize the tunable delay time mechanism, we use a different type of circuit, shown in Fig. 4.13 (a) [15]. By adding a negative resistance, which is a positive feedback circuit, to the output, this circuit can produce a range of input-to-output delays by changing the equivalent R to alter the RC delay. We control the value of the equivalent resistance with the voltages VB1 and VB2. The differential control voltage VB1 - VB2 divides the total bias current between the input differential pair and the negative resistance pair. If VB1 » VB2, the delay time is minimized, as in a typical CML buffer, and the output swing remains constant because the total bias current through the output resistors is fixed. The required tuning range of the tunable CML buffer depends on two things: the desired symbol length and the necessary duty-cycle range for pre-emphasis. We therefore design the tunable CML buffer to produce a 30~70ps delay range, so that the FDT pre-emphasis can be tuned from 80 to 120ps of delay time.

Fig. 4.13 FDT pre-emphasis detail circuit (a) Tunable CML buffer (b) CML buffer

OBT pre-emphasis: We implement this method by inserting a replica circuit consisting of a PISO and a data driver. The PISO circuit is described in the previous section, and the CML buffer is the same as in Fig. 4.13(b). OBT pre-emphasis can precisely generate a one-bit delay time at different data rates.

Either FDT or OBT pre-emphasis can be selected by the CML MUX. The circuit is shown in Fig. 4.14 (a) [22]. It resembles two CML buffers that share the same current source. We use Vb1 and Vb2 to select one of the two pre-emphasis modes. For example, when Vb1 is high and Vb2 is low, the FDT pre-emphasis is selected and the left path of the CML MUX is turned on to transmit data into the CML driver. At this moment, the other pre-emphasis circuit can be turned off to save power. Fig. 4.14 (b) is the CML circuit described in the previous chapter. We can control the CML current sources of the main driver and tap1 to enlarge the TX/RX eye diagram and keep the receiver signal amplitude constant to meet the SATA specification. The overall circuit simulation results are described in the next section.

Fig. 4.14 CML MUX and CML driver circuit (a) CML MUX (b) CML Driver

4.4 Simulation Results

Fig. 4.15 shows the output node names of the transmitter. The termination of the transmitter and the receiver is 50Ω. We use the W model in HSPICE to simulate the RG233/U cable, and we can modify the cable length for different cases. The following diagrams are the post-layout simulation results at a 6Gbps data rate.


Fig. 4.15 The simple diagram of TX and RX

We design the input voltage of the pre-driver and pre-emphasis to swing from 1.8V to 0.8V, as shown in Fig. 4.16. The jitter is 10ps with this large input swing. This input signal provides the CML data driver with a correct and large input swing to enhance the signal quality.


Fig. 4.16 The simulation results of Vin+/- and CML+/- node

Fig. 4.17 shows the simulation results at the RX_out node with a 1m cable. The jitter is about 10ps and the amplitude is 600mV.


Fig. 4.17 The eye diagram of receiver in RX_out node (cable length=1m)

4.4.1 FDT and OBT pre-emphasis comparison

In this section, we set the delay time of FDT pre-emphasis to half a bit time (83ps) for 6Gbps so that we can compare the two kinds of pre-emphasis. Fig. 4.18 shows the post-layout simulation waveforms with FDT and OBT pre-emphasis at the CML and ECML nodes. In Fig. 4.18 (a), the differential voltage of the ECML node is delayed by half a bit time after the CML node. Therefore, we have two kinds of pre-emphasis, half-bit-time and one-bit-time, at a 6Gbps data rate.

Fig. 4.18 The simulation results of CML and ECML node (a) FDT (b) OBT

Fig. 4.19 shows the post-layout simulation waveforms with FDT and OBT pre-emphasis at the RX node at a 6Gbps data rate with a 5m cable. FDT provides less pre-emphasis at this high data rate. Therefore, in this case we choose one-bit-time pre-emphasis for transmitting data. However, when the data rate drops to 3Gbps, as shown in Fig. 4.20, the OBT pre-emphasis causes a 230mV overshoot in the receiver eye diagram, and this overshoot enlarges the jitter in the receiver eye diagram. Both amplitudes are 630mV, and the jitters with FDT/OBT pre-emphasis are 18/16ps. Thus, at a 3Gbps data rate, we choose FDT pre-emphasis to improve the signal quality at the receiver end.

The comparison of these two pre-emphasis methods is summarized in Table 4.1. We compare the power, jitter and eye amplitude at different data rates.


Fig. 4.19 RX-eye with fixed and 1 bit delay time pre-emphasis under 6Gbps

Fig. 4.20 RX-eye with fixed and 1 bit delay time pre-emphasis under 3Gbps

Table 4.1 Pre-emphasis performance comparison (cable length = 5m)

Data rate                    6Gbps      3Gbps
Transmitter swing            700mV      700mV
Receiver swing               550mV      630mV
Receiver jitter (FDT/OBT)    27/14ps    18/16ps
Total power (with FDT)       105mW      90mW
Total power (with OBT)       115mW      100mW

The receiver jitter with FDT is always larger than with OBT. This is because of how the tunable mechanism is implemented. In order to investigate how different delay times affect the pre-emphasis amount, we have to use the inverter-train type pre-emphasis to implement the tunable mechanism. For a low-jitter FDT pre-emphasis, we can use the same methodology as the OBT type: a replica circuit of the PISO and data driver is inserted, and the data are sampled by a different phase to implement a fixed amount of pre-emphasis. The simulated RX eye is shown in Fig. 4.21. The jitter is reduced to 14ps and the swing is the same as with FDT pre-emphasis.

Fig. 4.21 RX-eye with modified half bit delay time pre-emphasis under 3Gbps

Fig. 4.22 shows the waveforms with and without pre-emphasis at the RX node with a 10m cable. The amplitude without pre-emphasis (160mV) is smaller than 200mV and does not meet the SATA eye mask. With our proposed pre-emphasis, the output signal amplitude is enlarged and meets the SATA specification. In this case, FDT/OBT pre-emphasis enlarges the magnitude by 300/410mV. Therefore, our proposed transmitter can transmit data over cables from 1m to 10m.


Fig. 4.22 The eye diagram of receiver (cable length=10m)

Table 4.2 Simulation results summary of the transmitter

Technology                              0.18 um 1P6M CMOS
Supply voltage                          1.8V
Clock rate                              1.2GHz (5 phase)
Data rate                               6 Gbps
Cable length                            1~10 m
Transmitter output swing                700mV
Receiver input swing                    550mV
Total power (without pre-emphasis)      72 mW
Total power (with FDT pre-emphasis)     105 mW
Total power (with OBT pre-emphasis)     115 mW

Table 4.3 shows the comparison between our design and other papers [12] [18] [20] [21]. Compared with the JSSC 2004 paper, our circuit consumes lower power while achieving a high data rate and a high output voltage swing. Compared with the other papers, our transmitter saves more power when operated at a high data rate and a high voltage swing.

Table 4.3 Comparisons of the performance between our design and other papers

Our Work

JSSC

In this chapter, we describe the detailed circuits of the proposed transmitter and present the simulation results. We use PRBS and K28.5 as the input patterns to test and verify the design against the SATA specification. The simulated TX and RX eye diagrams meet the specification. The OBT and FDT pre-emphasis are also discussed and simulated in this chapter. The coaxial cable length can range from 1 meter to 10 meters at data rates of 3Gbps and 6Gbps. In the next chapter, we describe the experimental considerations and show the measurement results of our proposed transmitter.

Chapter 5 Experimental Results

5.1 Layout and Experimental Setup

The transmitter chip is implemented in the TSMC 0.18um 1P6M CMOS process. Fig. 5.1 shows the layout of the transmitter. The area of the chip (including the bonding pads) is 1160 x 1280 um². The area of the transmitter (without PLL) is 630 x 320 um². Power supply noise is the major concern in the layout. Therefore, we separate the transmitter circuit into three parts: the PLL, the digital part of the TX, and the analog part of the TX.

The power lines of the three parts are independent. Double guard rings are placed around every block of the circuit to reduce the substrate noise. As much decoupling capacitance as possible is also placed to stabilize the power lines and reference voltages around the circuit. The clock phases routed from the PLL to the transmitter are the most sensitive metal lines in the layout. The spacing and length of each phase line are kept the same. We also insert buffers along the long metal lines to enhance the driving capability.

Fig. 5.1 shows the layout view of the transmitter with the major functional blocks: the transmitter with pre-emphasis, the PLL-based SSCG, and the adaptive termination. The total number of pads is 38 (including the transmitter, adaptive termination, SSCG and decoupling capacitance).


Fig. 5.1 Transmitter layout

The experimental setup for the transmitter is shown in Fig. 5.2. First, we use a Pulse Pattern Generator (Agilent 81130A 660MHz) to generate the reference clock of the PLL. Then, the PLL output clock signal is fed to a Digital Storage Oscilloscope (Tektronix TDS6124C 12GHz) to view the waveform. We can also use the PSA Series Spectrum Analyzer (Agilent E4440A 26GHz) to measure the spectrum of the PLL and the TX. The differential outputs of the TX and the eye diagram are measured with the Wide Bandwidth Oscilloscope (Agilent 86100B 20GHz). Thus, we can verify the accuracy of the input pattern.


Fig. 5.2 The experimental setup for the transmitter

There are two versions of our proposed transmitter. The two versions are compared in Table 5.1. We have received and tested the first version in our laboratory, so the measurement results in the next section are from the first version of our transmitter. Fig. 5.3 shows the micrograph of the first-version transmitter.

Table 5.1 Chip version

                        First version                    Second version
Transmitter             O                                O
Pre-emphasis            Fixed delay time pre-emphasis    Tunable delay time pre-emphasis
PLL                     O                                O
Adaptive termination    X                                O
SSCG                    X                                O

Fig. 5.3 Transmitter chip micrograph

5.2 Printed Circuit Board Setup

The printed circuit board (PCB) for measurement is shown in Fig. 5.4. The PCB stack type is FR4. It provides independent VDD and GND layers, which we separate into two parts for the analog and digital circuits respectively. In addition, we partition the power supply of the transmitter into three parts.

The digital part of the TX, the analog part of the TX, and the PLL are supplied by separate power sources, and we use regulators to reduce the supply noise and stabilize the voltage. We also insert several capacitors between the power supply and the circuit on the PCB to decouple both low-frequency and high-frequency noise. The high-frequency traces, such as the TX differential outputs, are made as short and symmetric as possible. The transmission trace is designed to be 44mil wide to match the 50Ω termination of the cable.

We also place the passive components and the termination close to the die to reduce parasitic effects and signal reflections.


Fig. 5.4 The PCB for testing

5.3 Experimental Results

In our proposed circuit, the transmitter needs a five-phase clock to sample and deliver data. Each phase of the PLL should have a frequency of 100MHz. Fig. 5.5 shows the measured standard deviation and peak-to-peak jitter of the PLL output signal at 100MHz, which are 3ps and 20ps respectively. Fig. 5.6 shows the timing diagram of the PLL output clock at 100MHz. The voltage swing of the PLL output is 1V.

Fig. 5.5 Jitter histograms of the PLL at 100MHz

Fig. 5.6 Timing diagram of the PLL at 100MHz

Fig. 5.7 and Fig. 5.8 show the measurement and simulation results of the K28.5 pattern at 6Gbps. The patterns in the two figures are the same. Fig. 5.9 and Fig. 5.10 show the measured differential output swing of the transmitter without and with pre-emphasis, operating at 6Gbps, at the receiver end. The input is a PRBS pattern. The waveform with pre-emphasis shows a clearly enhanced high-frequency component.

Fig. 5.7 Measurement result of K28.5 pattern


Fig. 5.8 Simulation result of K28.5 pattern

Fig. 5.9 PRBS waveform without pre-emphasis in 6Gbps

Fig. 5.10 PRBS waveform with pre-emphasis in 6Gbps

Fig. 5.11 and Fig. 5.12 show the measured eye diagrams without and with pre-emphasis of the transmitter operating at 6Gbps at the TX node. The test pattern is a 2^10-1 pseudo-random bit sequence (PRBS). The figures show that the eye

