Chapter 1 Introduction
1.3 Thesis Organization
The thesis organization is described as follows:
Chapter 2 introduces the Serial ATA specifications, including the physical plant block diagram, signal specifications and transmitter examples. We also compare the two driver types, CML and LVDS, in this chapter.
Chapter 3 presents the proposed transmitter architecture and explains the pre-emphasis method used to overcome the bandwidth limitation of the line as its length increases.
Chapter 4 discusses the detailed circuit design methodology of the transmitter. A novel pre-emphasis circuit is also proposed. The implementation of the functional blocks and the simulation results of the transmitter are also described in this chapter.
Chapter 5 introduces the measurement equipment used to measure the chip and presents the experimental results.
Chapter 6 concludes the research.
Chapter 2 Serial ATA
2.1 Introduction to Serial ATA Specifications
Serial ATA (SATA) [4], an evolutionary high-performance interface for storage devices that replaces Parallel ATA, is used to connect ATA and ATAPI devices. Serial ATA has many advantages, as follows:
• Point-to-point connection topology ensures a dedicated 1.5 Gbits/sec to each device
• Thinner, longer cables for easier routing
• Fewer interface signals require less board space and allow for simpler routing
• Better connector design for easier installation and better device reliability
• Hot-swap capability
• First-party DMA support
There are four layers in the Serial ATA architecture, the Application, Transport, Link, and Physical layers, as shown in Fig. 2.1. The Application layer is responsible for overall ATA command execution, including controlling Command Block Register accesses. The Transport layer is responsible for placing control information and data to be transferred between the host and device in a packet/frame, known as a Frame Information Structure (FIS). The Link layer is responsible for taking data from the constructed frames, encoding or decoding each byte using 8b/10b, and inserting control characters such that the 10-bit stream of data may be decoded correctly. The Physical layer is responsible for transmitting and receiving the encoded information as a serial data stream on the wire.
4 Application Layer
3 Transport Layer
2 Link Layer
1 Physical Layer
Fig. 2.1 Serial ATA Communications Layer Model
The target services of the physical layer in Serial ATA are listed below. At the transmitter end, the parallel input from the link layer (10, 20, 40 bits, or another width) is serialized for transmission. The transmitter then delivers a 1.5, 3 or 6 Gbps differential NRZ serial data stream at the specified voltage level, with 100 Ohm matched termination, through the cable to the receiver. At the receiver end, a differential NRZ serial stream is received whose data rate may deviate from the nominal data rate by ±350 ppm, plus +0/-500 ppm due to the spread spectrum profile. The receiver then extracts the data and clock from the serial stream and deserializes the data. The transceiver can optionally support power management modes and impedance calibration. Our research focuses on the Physical layer in order to design a SATA transmitter. More detailed information about the Physical layer is given in the next section.
2.2 Physical Layer of Serial ATA
The Serial ATA physical layer (PHY) uses low-voltage differential signaling to enable speeds from 1.5 Gb/s to 6.0 Gb/s. The PHY layer incorporates a serializer/deserializer, provides out-of-band (OOB) signaling, and handles power-on sequencing and speed negotiation. Transmit data are serialized from 10-bit characters, and receive data are deserialized to 10-bit characters. Device status feedback is provided to the link layer. The overall physical block diagram is shown in Fig. 2.2. We focus on the transmitter end.
Fig. 2.2 Physical Plant Overall Block Diagram [4]
The DATAIN[0:n] signals from the link layer and from 8b/10b coding are usually a parallel data stream in order to increase the bandwidth of the link. There are also several alternative fixed control pattern sources from the control block, mainly to provide the supporting circuitry that generates the patterns needed to implement the ALIGN primitive activity defined in the SATA specification [4]. The control block is a collection of logic circuitry that controls the overall functionality of the physical plant circuitry (Table 2.1).
Table 2.1 Signals in Control Block
PHYreset This input signal causes the physical layer to initialize to a known state and start generating the COMRESET OOB signal across the interface.
PHYready Signal indicating Physical has successfully established communications. The Physical is maintaining synchronization with the incoming signal to its receiver and is transmitting a valid signal on its transmitter.
Slumber Causes the physical layer to transition to the Slumber power management state.
Partial Causes the physical layer to transition to the Partial power management state.
NearAFELB Causes the physical layer to loop back the serial data stream from its transmitter to its receiver.
FarAFELB Causes the physical layer to loop back the serial data stream from its receiver to its transmitter.
SpdSel Causes the control logic to automatically negotiate for a usable interface speed or sets a particular interface speed.
SpdMode Output signal that reflects the current interface speed setting.
System Clock This input is the clock source for much of the control circuit and is the basis from which the transmitting interface speed is established.
After that, a Parallel In Serial Out (PISO) circuit is needed before the data are sent to the analog front end. The PISO circuit converts the parallel digital bits into a serial analog signal stream using the TX clock. Finally, the serial data in the analog front end are delivered by the high-speed differential driver through the channel to the receiver.
2.3 Transmitter Driver of Serial ATA
Fig. 2.3 shows two transmitter examples from the Serial ATA specification. Both Current Mode Logic (CML) and Low Voltage Differential Signaling (LVDS) circuits can implement the SATA transmitter. The figure also indicates how the transition to and from the idle state can be implemented. When the idle pin is high, the input switch MOS transistors of the CML or LVDS driver are turned off. Therefore, there is no data transition on the cable. The termination of both structures is 100 ohm in order to match the equivalent termination of the cable. The two structures are discussed in detail below.
2.3.1 Low Voltage Differential Signaling (LVDS)
LVDS is a high-speed, low-power general-purpose interface standard that solves the bottleneck problems while servicing a wide range of application areas. There are two industry standard specifications for LVDS: ANSI/TIA/EIA-644 [7] and IEEE 1596.3 SCI-LVDS [8]. They specify slightly different output ranges, but IEEE 1596.3 only addresses high data rates and does not address the low-power concern. ANSI/TIA/EIA-644 is the more common standard. It specifies a differential output voltage swing in the interval 250 mV to 450 mV. LVDS basically specifies circuits with 2.5 V or 3.3 V power supplies. To implement this standard in TSMC 0.18um technology for our S-ATA data driver, we have to modify some rules.
Fig. 2.3 (a) CML Structure (b) LVDS Structure [4]
The LVDS output structure is shown in Fig. 2.4. The LVDS driver and receiver are connected via a differential impedance medium. For the differential output to be terminated correctly, a 100 ohm resistor has to be connected between OUT+ and OUT- at both the receiver and the driver end. The driver consists of a current source that drives the differential pair lines. This current flows through the 100 ohm termination resistor, and its direction changes every time the output shifts from “high” to “low” or the opposite, according to the four MOS switches (M1 - M4). The direction of the output current is controlled by letting one side of the differential output stage either sink or source current, while the other side of the differential output stage does the opposite (push-pull). Assuming that the current source in the LVDS output stage is 8 mA, this structure yields a single-ended voltage swing of 400 mV and a differential voltage swing of 800 mV.
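The swing figures quoted for the 8 mA LVDS example can be checked with a short calculation. The following is only a minimal sketch, assuming that the 100 ohm terminations at the driver and receiver ends appear in parallel across the differential pair and that "swing" means peak-to-peak; the function name `lvds_swing` and both assumptions are illustrative and are not taken from the SATA or LVDS specifications.

```python
def lvds_swing(i_drive_a, r_term_driver_ohm=100.0, r_term_receiver_ohm=100.0):
    """Return (single-ended p-p swing, differential p-p swing) in volts.

    Assumption: the two differential terminations act in parallel, so the
    drive current sees their parallel combination.
    """
    r_eff = (r_term_driver_ohm * r_term_receiver_ohm) / (
        r_term_driver_ohm + r_term_receiver_ohm
    )
    v_od = i_drive_a * r_eff      # magnitude of (OUT+ minus OUT-)
    single_ended_pp = v_od        # each output moves +/- v_od/2 around Vocm
    differential_pp = 2.0 * v_od  # (OUT+ minus OUT-) toggles between +v_od and -v_od
    return single_ended_pp, differential_pp


se_pp, diff_pp = lvds_swing(8e-3)
print(f"single-ended swing: {se_pp * 1e3:.0f} mV")
print(f"differential swing: {diff_pp * 1e3:.0f} mV")
```

Under these assumptions the 8 mA current develops 400 mV across an effective 50 ohm load, which reproduces the 400 mV single-ended and 800 mV differential values stated in the text.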
Fig. 2.4 LVDS Driver and Receiver Structure
The LVDS input structure has the advantage of a wide input voltage range (0 V to 1.8 V) and a low differential input voltage threshold of 100 mV. The wide input voltage range combined with the low differential input voltage threshold allows for a ± 1 V difference in ground potential between the LVDS driver and the LVDS receiver.
2.3.2 Current Mode Logic (CML)
Fig. 2.5 shows the CML structure [9][10]. It has the advantage of not requiring any external termination resistors. The termination resistors are an integrated part of the input and output structure. The CML output stage consists of a differential pair, and the logic function is implemented by shifting the current between the two halves of the differential pair. Assume the current source is 16 mA. When one of the differential outputs (OUT+ or OUT-) is in the “low” state, 8 mA is drawn from the power supply Vcco and 8 mA is drawn from the power supply Vcci of the CML input stage to which the differential CML output is connected. The current is drawn equally from the two supplies because the input impedance of the CML input stage is 50 ohm, and thus equal to the resistor in the output stage. With this 16 mA current source, the single-ended output “high” voltage is Vcc, and the single-ended output “low” voltage is Vcc - 0.4 V. This yields a single-ended voltage swing of 400 mV and a differential voltage swing of 800 mV.
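The CML output levels described above follow from a simple current divider. The sketch below is a minimal illustration under stated assumptions: Vcco and Vcci are taken to be the same supply value, a 1.8 V supply is used purely as an example value, and the function name `cml_levels` is hypothetical.

```python
def cml_levels(i_tail_a, vcc_v, r_out_ohm=50.0, r_in_ohm=50.0):
    """Return the CML output levels and the current drawn from each supply.

    Assumption: the output resistor (to Vcco) and the receiver input
    termination (to Vcci) return to the same voltage and act in parallel
    on the side that carries the tail current.
    """
    r_eff = (r_out_ohm * r_in_ohm) / (r_out_ohm + r_in_ohm)
    # Current divider: the share through r_out is set by r_in, and vice versa.
    i_from_vcco = i_tail_a * r_in_ohm / (r_out_ohm + r_in_ohm)
    i_from_vcci = i_tail_a - i_from_vcco
    v_high = vcc_v                    # no current flows in the "high" side
    v_low = vcc_v - i_tail_a * r_eff  # full tail current flows in the "low" side
    single_ended_swing = v_high - v_low
    return {
        "i_from_vcco_a": i_from_vcco,
        "i_from_vcci_a": i_from_vcci,
        "v_high_v": v_high,
        "v_low_v": v_low,
        "single_ended_swing_v": single_ended_swing,
        "differential_pp_v": 2.0 * single_ended_swing,
    }


levels = cml_levels(i_tail_a=16e-3, vcc_v=1.8)
for name, value in levels.items():
    print(f"{name}: {value:.4f}")
```

With equal 50 ohm resistors the 16 mA tail current splits into 8 mA from each supply, and the low level sits 0.4 V below Vcc, matching the values in the text.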
Fig. 2.5 CML Driver and Receiver Structure
2.3.3 The Comparison between LVDS and CML
The Low Voltage Differential Signaling (LVDS) structure and the Current Mode Logic (CML) structure have been developed to provide high-speed and low-power interface applications, as shown in Fig. 2.6. The target of our research is to design a 6 Gbps transmitter for S-ATA III, and both structures can operate at high speed. Therefore, power, area and switch sensitivity are our major concerns. We discuss these two structures under the same differential output swing condition using a 0.18um 1.8 V CMOS process.
Fig. 2.6 Driver architecture (a) LVDS (b) CML
The LVDS architecture shown in Fig. 2.6(a) has a 2R differential termination between transmitter and receiver. The single-ended output voltage is either Vocm+RI or Vocm-RI, and the differential swing is 2RI. The CML architecture shown in Fig. 2.6(b) has an independent parallel termination R between transmitter and receiver, and a current 2I is steered into either the left or the right transistor. The single-ended output voltage is either VDD or VDD-2RI, and the differential swing is equal to 2RI. So the power dissipation of the CML driver is twice that of the LVDS driver. However, the data driver consists of a driver and a pre-driver. To meet the LVDS standard, the four input transistors must be designed very large. Thus, the pre-driver must also be enlarged to drive the PMOS and NMOS input transistors. This makes the total power of the LVDS driver plus pre-driver much larger than that of CML.
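The factor of two between the driver-stage powers follows directly from the expressions above. The sketch below is a first-order illustration only: it assumes the LVDS driver draws I and the CML driver draws 2I from the same VDD, and it ignores the pre-driver entirely. The values R = 50 ohm, VDD = 1.8 V and a 700 mV swing are example inputs, and the function name `driver_comparison` is hypothetical.

```python
def driver_comparison(vdd_v, r_ohm, diff_swing_v):
    """Compare first-order static driver power for equal differential swing 2RI.

    Assumptions: LVDS steers a current I through the 2R termination,
    CML steers 2I into one load R, and both run from the same VDD.
    """
    i_a = diff_swing_v / (2.0 * r_ohm)  # from swing = 2 * R * I
    return {
        "unit_current_a": i_a,
        "lvds_driver_power_w": vdd_v * i_a,
        "cml_driver_power_w": vdd_v * 2.0 * i_a,
    }


result = driver_comparison(vdd_v=1.8, r_ohm=50.0, diff_swing_v=0.7)
ratio = result["cml_driver_power_w"] / result["lvds_driver_power_w"]
print(f"I = {result['unit_current_a'] * 1e3:.1f} mA")
print(f"LVDS driver: {result['lvds_driver_power_w'] * 1e3:.1f} mW")
print(f"CML driver:  {result['cml_driver_power_w'] * 1e3:.1f} mW")
print(f"CML / LVDS ratio: {ratio:.1f}")
```

This estimate covers the output stage only; the pre-driver cost discussed in the text is what reverses the overall comparison.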
We use these two kinds of data driver to design a transmitter that fits the SATA specification, as shown in Fig. 2.7. Here, the output differential swing at the TX node is set to 700 mV in both cases. Thus, the current of the LVDS driver is half that of the CML driver, and the power consumption of the LVDS driver stage is less than that of the CML driver stage. The sizes of the LVDS and CML drivers are listed in Table 2.2. From this table, we can see that the LVDS driver is larger than the CML driver. Moreover, the voltage swing of LVDS is in the central range, so a level-shift circuit is needed to shift the voltage level so that it can drive the LVDS driver correctly. This causes more power consumption in the pre-driver stage. In addition, the pre-driver has to drive the large LVDS transistors, which also costs more power. Therefore, we use a four-stage pre-driver in the LVDS structure and a three-stage pre-driver in the CML structure.
Fig. 2.7 The transceiver structure (a) LVDS structure (b) CML structure
Table 2.3 shows the power comparison of the two drivers when the transmitter swing is 700 mV. From Table 2.3 we can see that the current source in the LVDS structure is half of that in the CML structure, so the CML driver consumes about twice as much power as the LVDS driver. However, the LVDS tap buffer consumes more power than the CML tap buffer. Therefore, the CML transmitter consumes less total power than the LVDS transmitter.
Table 2.2 The size of the two drivers
(um)   LVDS               CML
MN     (6/0.18), M=12     (6/0.18), M=12
MP     (6/0.18), M=46     -
Msn    (10/0.35), M=33    (10/0.35), M=33
Msp    (10/0.35), M=121   -
Table 2.3 The power comparison of the two drivers
Stage        LVDS      CML
Pre-driver   43.5mW    12.5mW
Driver       15.0mW    28.0mW
Total        58.5mW    40.5mW
TX swing     700mV
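The totals in Table 2.3 can be reproduced from the per-stage figures. The short sketch below only re-adds the numbers given in the table; the dictionary layout and variable names are illustrative.

```python
# Per-stage power in mW, copied from Table 2.3 (TX swing 700 mV).
power_mw = {
    "LVDS": {"pre_driver": 43.5, "driver": 15.0},
    "CML": {"pre_driver": 12.5, "driver": 28.0},
}

# Total power of each transmitter is the sum of its stages.
totals_mw = {name: sum(stages.values()) for name, stages in power_mw.items()}

# Absolute and relative saving of CML with respect to LVDS.
saving_mw = totals_mw["LVDS"] - totals_mw["CML"]
saving_percent = 100.0 * saving_mw / totals_mw["LVDS"]

for name, total in totals_mw.items():
    print(f"{name}: {total:.1f} mW")
print(f"CML saves {saving_mw:.1f} mW ({saving_percent:.1f} % of the LVDS total)")
```

The driver stage alone favours LVDS, but the pre-driver term dominates the LVDS total, which is why the overall comparison favours CML.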
The second design issue concerns the output voltage level. Because the differential output of LVDS is referenced neither to VDD nor to ground, it varies considerably with process variation and cable reflection. Therefore, we need a common-mode closed-loop circuit to control the Vocm voltage and to ensure the output swing level at both the TX output and the RX input [23]. In contrast, the CML architecture has no complicated common-mode range design issue, and the signal is simply delivered by two NMOS input switches.
The third design issue concerns the current source. In the LVDS structure, two current sources are connected to power and ground respectively to minimize the change of current and thus reduce noise. Since these two tail current sources are equal, the power dissipation from the voltage supply Vocm is ideally zero. In practice, there may be some mismatch between the top and bottom current sources. Thus, a replica circuit must be used to ensure that the two sources remain equal over process, voltage, and temperature (PVT). This costs extra power and enlarges the layout area. Even when the mismatch is eliminated, these two current sources still create a large voltage drop and limit the output voltage swing. In contrast, the performance of the CML structure is less sensitive to voltage drop or mismatch, and it can still provide a large output swing and a high data rate. Therefore, for high-speed serial link transmission systems operating at several Gbps, considering power dissipation and layout area, we choose the CML structure as our transmitter driver.
2.3.4 Summary
Finally, we choose CML technology as our data driver structure. CML is a basic signal driver technology that can be used to define the output characteristics of a transmitter and the input characteristics of a receiver in the protocol for a chip-to-chip interface between I/O peripherals. High speed, low power and low noise are our goals in designing the transmitter circuit. In the next chapter, we introduce our proposed transmitter architecture and explain the pre-emphasis method used to overcome the bandwidth limitation of the line as its length increases.
Chapter 3 Architecture of Serial ATA Transmitter
3.1 Introduction
A general serial link is shown in Fig. 3.1. There are three primary components in this architecture: the transmitter, the cable, and the receiver [16][17]. The input data are usually a parallel data stream in order to increase the bandwidth of the link. Therefore, we need a PISO (parallel in serial out) circuit to convert the digital bits into a differential bit stream. Then, the serial data are sent through the transmitter driver to the cable. At the receiver end, the signal from the cable is recovered to the original digital bits by amplifying and sampling it. The CDR (clock data recovery) circuit embedded in the receiving side adjusts the receiver clock based on the received data so that the data are sampled at the center of the data eye. Then, a SIPO (serial in parallel out) circuit converts the serial data back to N parallel bits. This dissertation focuses on the transmitter end and the cable. We introduce the detailed components in the following sections.
Fig. 3.1 Transceiver architecture
3.2 The Architecture of Serial ATA Transmitter
Fig. 3.2 shows the architecture of the SATA transmitter. This architecture is composed of the transmitter and a PLL. The circuit elements designed in this dissertation are in the dotted blocks. The transmitter starts with two kinds of parallel input data streams, K28.5 and PRBS (Pseudo Random Binary Sequence), selected by AMUX. With the K28.5 pattern we can test and verify whether the output data stream matches the input data. With the PRBS pattern we can measure the output data eye diagram. The input patterns are then delivered to BMUX, which selects the first five or the last five parallel data bits and passes them to the PISO cycle by cycle. In order to transmit data over a high-speed link, we use a 5-to-1 PISO circuit. This circuit converts the parallel data streams into a serial data stream by using a 1.2 GHz 5-phase clock. Finally, the CML driver transmits the 6 Gbps serial data into the cable. In addition, we can compensate for the large loss of the cable by using a tunable pre-emphasis circuit. According to the cable length and data rate, we can choose a suitable amount of pre-emphasis.
Section 3.3 describes the design of the K28.5 and PRBS encoders, the synchronization and PISO architecture, the CML and pre-driver architecture, and the design concept of the tunable pre-emphasis filter.
Fig. 3.2 Transmitter architecture
3.3 Functional Blocks of Transmitter
3.3.1 PRBS and K28.5
The PRBS circuit works as a data generator. Its function is to generate data that appear random over a long period. This means that the numbers of ones and zeros are almost the same over the period, which makes the power spectral density more evenly distributed. It generates all possible patterns except the all-zero pattern. The maximal sequence length is 2^N-1. In order to implement ten parallel inputs, a 2^10-1 data pattern is generated with 10 registers and XOR circuits. We design the ten-bit PRBS encoder with a 600 MHz data rate per bit to obtain a system operating speed of 6 Gbps. A control signal provides a pulse (logic one) to restart the circuit and generate the parallel data. A 600 MHz clock is also needed to trigger the registers.
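The behaviour of a 2^10-1 PRBS source can be illustrated with a software model of a 10-stage linear feedback shift register. This is only a sketch under an explicit assumption: the text does not state the feedback taps, so the model uses taps at stages 10 and 7, a commonly tabulated maximal-length choice, and it produces a serial stream that is then grouped into 10-bit words rather than reproducing the parallel encoder circuit itself. All names are illustrative.

```python
from itertools import islice

PRBS_ORDER = 10
PERIOD = 2**PRBS_ORDER - 1  # 1023 bits for a maximal-length sequence


def prbs10_bits(seed=0x3FF):
    """Yield PRBS bits from a 10-stage Fibonacci LFSR.

    Assumption: feedback is the XOR of stages 10 and 7 (taps not given
    in the text). The seed plays the role of the restart pulse.
    """
    state = seed & 0x3FF
    if state == 0:
        raise ValueError("the all-zero state would lock up the register")
    while True:
        out_bit = (state >> 9) & 1                # stage 10 (oldest bit)
        feedback = out_bit ^ ((state >> 6) & 1)   # XOR with stage 7
        yield out_bit
        state = ((state << 1) | feedback) & 0x3FF


def first_parallel_word(width=10):
    """Return the first width-bit word, as a parallel source would present it."""
    return tuple(islice(prbs10_bits(), width))


two_periods = list(islice(prbs10_bits(), 2 * PERIOD))
one_period = two_periods[:PERIOD]
ones = sum(one_period)
zeros = PERIOD - ones
line_rate_gbps = 10 * 600e6 / 1e9  # ten parallel bits at 600 MHz each

print(f"period: {PERIOD} bits, ones: {ones}, zeros: {zeros}")
print(f"first word: {first_parallel_word()}")
print(f"line rate: {line_rate_gbps:.1f} Gbps")
```

Over one period the model produces one more 1 than 0, because the all-zero state is excluded, and ten bits per 600 MHz clock cycle give the 6 Gbps line rate.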
Another input pattern is K28.5. This pattern is commonly specified for jitter measurement in Fibre Channel and Ethernet systems. We use this pattern to verify that the circuit transmits data correctly into the cable and that the output eye diagram meets the SATA specifications.
The K28.5 pattern has two sequences (alternating K28.5+ and K28.5-): the positive disparity sequence (0011111010) and the negative disparity sequence (1100000101). These two sequences form the 20-bit symbol 00111110101100000101. This long symbol contains five consecutive 1's and five consecutive 0's (the longest DC data). It also contains an isolated 1 (010) and an isolated 0 (101), which are the high-speed AC transitions.
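The properties listed for the K28.5 pattern can be checked directly on the bit strings given in the text. The following sketch is illustrative; the helper name `run_lengths` and the variable names are not from the thesis.

```python
from itertools import groupby

# The two sequences, labelled as in the text.
K28_5_POS = "0011111010"
K28_5_NEG = "1100000101"


def run_lengths(bits):
    """Return a list of (bit, run length) pairs for a bit string."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


pattern = K28_5_POS + K28_5_NEG
runs = run_lengths(pattern)

longest_run = max(length for _, length in runs)
is_complement = all(a != b for a, b in zip(K28_5_POS, K28_5_NEG))
dc_balanced = pattern.count("1") == pattern.count("0")
has_isolated_one = "010" in pattern
has_isolated_zero = "101" in pattern

print(f"pattern: {pattern}")
print(f"runs: {runs}")
print(f"longest run: {longest_run}")
print(f"complementary halves: {is_complement}, DC balanced: {dc_balanced}")
print(f"isolated 1: {has_isolated_one}, isolated 0: {has_isolated_zero}")
```

The two 10-bit sequences are bitwise complements, so the alternating 20-bit symbol is DC balanced while still containing both the longest runs and the shortest pulses.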
3.3.2 AMUX and BMUX
To increase the bandwidth of the circuit, we use two MUX circuits, as shown in Fig. 3.3. AMUX is composed of 10 sets of 2-to-1 MUXes that select the PRBS or K28.5 input pattern. For high-speed operation, BMUX divides the ten parallel input data into two groups: the 10-bit data are divided into the first 5 bits and the last 5 bits by a half-speed clock phase 1b. Then, these 5-bit data are transferred to the synchronizer circuit. The synchronizer circuit skews the parallel data by the different clock phases, and the PISO circuit samples each data bit into a serial stream that is transmitted to the CML data driver.
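The data path from pattern selection to the serial stream can be summarized with a small behavioural model. This is a sketch only: it ignores clock phases and synchronizer skew, it assumes the first bit of each group is transmitted first, and the function names simply mirror the block names in the text.

```python
def amux(prbs_word, k28_5_word, select_k28_5):
    """Model of AMUX: ten 2-to-1 selections between the two pattern sources."""
    return list(k28_5_word) if select_k28_5 else list(prbs_word)


def bmux(word10):
    """Model of BMUX: split a 10-bit word into its first and last 5 bits."""
    if len(word10) != 10:
        raise ValueError("BMUX expects a 10-bit word")
    return word10[:5], word10[5:]


def piso(word5):
    """Model of the 5-to-1 PISO: emit one bit per clock phase.

    Assumption: bits leave in index order, first bit first.
    """
    for bit in word5:
        yield bit


def serialize(word10):
    """Send the first group through the PISO, then the last group."""
    first_half, last_half = bmux(word10)
    return list(piso(first_half)) + list(piso(last_half))


k28_5_word = [0, 0, 1, 1, 1, 1, 1, 0, 1, 0]
prbs_word = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # placeholder word for illustration

serial_bits = serialize(amux(prbs_word, k28_5_word, select_k28_5=True))
bit_rate_gbps = 5 * 1.2e9 / 1e9         # five phases of a 1.2 GHz clock
word_rate_mhz = 5 * 1.2e9 / 10 / 1e6    # 10-bit words per second, in MHz

print(f"serial bits: {serial_bits}")
print(f"bit rate: {bit_rate_gbps:.1f} Gbps, word rate: {word_rate_mhz:.0f} MHz")
```

Splitting each 10-bit word into two 5-bit groups lets the PISO run from a 1.2 GHz 5-phase clock while the pattern sources run at 600 MHz.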
Fig. 3.3 Timing Diagram of the transmitter
3.3.3 Synchronizer
The parallel data from BMUX must be shifted to fit the sampling time of the PISO multiplexer. Therefore, we need a synchronizer circuit that skews the parallel input data so that the PISO circuit samples the parallel data one by one in the middle point