1.2Gbps Reduced Swing Differential Signaling (RSDS) Transceiver

National Chiao Tung University

Department of Electronics Engineering & Institute of Electronics, Master's Program

Master's Thesis

A 1.2Gbps RSDS Serial-link transceiver

Student: Chi-Yu Chiu

Advisor: Prof. Jiin-Chuan Wu

August 2005


A 1.2Gbps RSDS Serial-link transceiver

Student: Chi-Yu Chiu

Advisor: Prof. Jiin-Chuan Wu

A Thesis

Submitted to Department of Electronics Engineering & Institute of Electronics

College of Electrical Engineering and Computer Science

National Chiao Tung University

In Partial Fulfillment of the Requirements

for the Degree of

Master of Science

In

Electronic Engineering

Aug 2005

Hsin-Chu, Taiwan, Republic of China


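1.2Gbps Reduced Swing Differential Signaling (RSDS) Transceiver

Student: Chi-Yu Chiu

Advisor: Dr. Jiin-Chuan Wu

Master's Program, Department of Electronics Engineering & Institute of Electronics, National Chiao Tung University

Abstract

Advances in integrated-circuit process technology have raised the speed and volume required of chip-to-chip data transmission. The challenge is to reach high transmission speed without wasting area or power. Among today's mainstream high-speed serial links, Reduced Swing Differential Signaling (RSDS), which offers high speed, low power, and low noise interference, is a popular technique. This thesis studies a transceiver architecture that operates at 1.2Gbps in the RSDS signaling mode. It consists of a transmitter and a receiver and is simulated in the TSMC 0.35μm 2P4M CMOS process with a 3.3V supply. The transmitter uses a phase-locked loop to provide the clock and a multiplexer to convert parallel data into a serial output. The phase-locked loop has an input frequency of 75MHz and locks its output at 150MHz, providing eight clock phases to the multiplexer. The clock and data are pre-skewed before the 8-to-1 multiplexer, which produces an output data rate of 1.2Gbps. The transmitter consumes 134mW. The receiver uses a comparator with hysteresis to amplify the received signal to digital levels. A clock and data recovery circuit that runs at half the input data rate and tracks both frequency and phase then aligns the data with the clock, and a 1-to-8 demultiplexer finally converts the data back to parallel form. The receiver consumes 164mW.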


A 1.2Gbps RSDS Serial-link transceiver

Student: Chi-Yu Chiu Advisor: Prof. Jiin-Chuan Wu

Department of Electronics & Institute of Electronics

National Chiao-Tung University

Abstract

Advances in IC fabrication technology have raised the speed and volume required of inter-chip data transmission. The problem is how to achieve high-speed transmission without wasting area and power. Among today's mainstream high-speed serial links, RSDS technology is popular for its high speed, low power, and low EMI.

This thesis describes the design of a high-speed RSDS transmission interface operating at 1.2Gbps. The transceiver consists of a transmitter and a receiver and is simulated in a TSMC 0.35μm 2P4M process at a 3.3V supply voltage.

The transmitter uses a PLL to provide an 8-phase, 150MHz clock to the multiplexer, which converts the parallel data to serial; the PLL input frequency is 75MHz. The data and clock are pre-skewed to improve timing accuracy. With the 8-phase clock and the 8-to-1 multiplexer, the output data is transmitted at a 1.2Gbps data rate. The total power of the transmitter is 134mW.

The receiver uses a comparator with hysteresis to amplify the incoming data to full swing, and uses a clock and data recovery (CDR) circuit with phase and frequency detectors to lock the clock with better jitter performance. Finally, the 1-to-8 de-multiplexer converts the CDR output to 8 parallel data channels. The total power of the receiver is 164mW.


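Acknowledgements

First, I would like to thank my advisor, Prof. Jiin-Chuan Wu, for his careful guidance during my two years of master's research. His teaching, from professional knowledge to research attitude and ways of approaching problems, has benefited me greatly. I also thank Prof. 陳巍仁 and my seniors 藍正豐 and 張恆祥 for taking the time to serve on my oral defense committee and for their valuable comments. This research could not have been completed without the many senior students of Lab 307; thank you for your guidance over these two years. Thanks also to 阿瑞, 周政賢, and 權哲 for their teaching, from which I learned a great deal. I am grateful to my companions in Room 527, 鍵樺, 志朋, 傑忠, 峻帆, 靖驊, 弼嘉, 建樺, 阿信, 瑋銘, and 岱原, and especially to my fellow members of Prof. Wu's group, for the research discussions, the mutual encouragement, and the good times that brought so much fun and energy to a demanding graduate life. I also thank my parents and family for the care and effort they have put into raising me, for their support and encouragement when I was busy or discouraged, and for their advice on my direction in life. Finally, I thank my girlfriend 福真 for accompanying me through this hardest and most important stage of my studies; with you by my side I was able to persevere to the end. I dedicate this thesis to everyone who cares about me and to the friends I care about.

Chi-Yu Chiu
National Chiao Tung University
August 2005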


Abstract (Chinese)
Abstract (English)
List of Tables
List of Figures

Chapter 1 Introduction
1.1 Motivation
1.2 Introduction of RSDS
1.2.1 RSDS/LVDS
1.2.2 Applications RSDS/LVDS
1.2.3 The Trend of RSDS
1.3 Thesis Organization

Chapter 2 Background
2.1 RSDS Specification
2.2 Basic Serial Link
2.3 Noise Issue
2.3.1 Cross-talk
2.3.2 Reflection
2.3.3 Power Supply Noise
2.4 Signaling Circuits
2.5 Timing Recovery Architecture
2.5.1 PLL-based Architecture
2.5.2 Oversampling Phase-picking Architecture

Chapter 3 Phase-Locked-Loop
3.1 Introduction
3.2 Phase-Locked Loop Architecture
3.3 Circuit Implementation
3.3.1 Phase Frequency Detector (PFD)
3.3.2 Charge Pump
3.3.3 Voltage Control Oscillator (VCO)
3.3.4 Loop Filter
3.3.5 Divider
3.4 Fundamentals of PLL
3.4.1 PLL Linear Model
3.4.2 PLL Noise Analysis and Stability
3.5 Loop Parameters Consideration

Chapter 4 Transmitter
4.1 Architecture of Transmitter
4.2 Pseudo Random Bit Sequence (PRBS)
4.3.1 The Algorithm for Parallel to Serial
4.3.2 MUX Architecture
4.3.3 The 8:1 MUX with pre-skew circuit
4.4 Data driver
4.5 Simulation Result of Transmitter
4.5.1 Simulation Result of PLL
4.5.2 Architecture Comparison
4.5.3 Layout of transmitter

Chapter 5 Receiver
5.1 Architecture of Receiver
5.2 Slicer
5.3 Clock and Data Recovery
5.3.1 Introduction
5.3.2 Architecture of CDR
5.3.3 Half-rate Phase Detector
5.3.4 Half-rate Frequency Detector
5.4 Linearization Circuit
5.5 Parameters of CDR
5.6 De-Multiplexer
5.7 Receiver Simulation Result

Chapter 6 Conclusion and Future work
6.1 Conclusion
6.2 Future Work

LIST OF TABLES

Table 1-1 RSDS/LVDS comparison [1]
Table 1-2 RSDS/LVDS applications
Table 2-1 Electrical specification of RSDS transmitters and receiver
Table 2-2 Comparison between full-rate and half-rate timing recovery architectures
Table 3-1 Noise transfer function
Table 3-2 Parameter of the transmitter PLL
Table 4-1 The deductive logic of 3-levels multiplexer
Table 4-2 Algorithm Result of the First Level
Table 4-3 Algorithm Result of the Second Level
Table 4-3 Algorithm Result of the Third Level

LIST OF FIGURES

Figure 1-1 Block diagram of the LCD module [2]
Figure 2-1 RSDS swing level
Figure 2-2 Block diagram of the basic serial link
Figure 2-3 Cross-talk
Figure 2-4 Transmitter timing diagram with different transmitter architectures: (a) voltage-mode, (b) current mode, and (c) differential.
Figure 2-5 Timing recovery architecture (a) PLL-based (b) oversampling phase-picking
Figure 2-6 (a) Full-rate data and clock (b) Half-rate data and clock
Figure 3-1 Block diagram of a phase locked loop
Figure 3-2 tri-state diagram of the phase detector
Figure 3-3 reference signal comes after feed back signal
Figure 3-4 structure of PFD
Figure 3-5 Dynamic D Flip-Flop TSPC
Figure 3-6 PFD transfer characteristic curve
Figure 3-7 PFD transfer character curve with dead zone
Figure 3-8 Charge pump with charge injection effect
Figure 3-9 Schematic of charge pump
Figure 3-10 Schematic of the four stages VCO
Figure 3-11 Schematic of VCO delay cell with symmetric load elements
Figure 3-12 The symmetric load I-V curve
Figure 3-12 Replica-feedback current source bias circuit
Figure 3-14 Schematic of duty-cycle corrector and its timing diagram
Figure 3-15 2nd order passive loop filter
Figure 3-16 (a) TSPC asynchronous divided-by-two circuit (b) divider scheme
Figure 3-17 PLL linear model
Figure 3-18 PLL linear model with various equivalent noise sources
Figure 3-19 open loop PLL frequency response
Figure 3-20 Vctrl timing diagram
Figure 3-21 Kvco curve
Figure 4-1 Block diagram of the transmitter
Figure 4-2 PRBS delay cell circuit
Figure 4-3 Scheme of Pseudo Random Bit Sequence (PRBS)
Figure 4-4 Timing diagram of 8:1 multiplexer
Figure 4-5 Multi-phase Type MUX
Figure 4-6 Architecture of the 3-levels multiplexer
Figure 4-7 Scheme of 2:1 MUX Cell
Figure 4-8 Delay Match Buffer
Figure 4-9 Pre-skew Circuit
Figure 4-10 (a) RSDS transmitter data driver (b) Common mode feedback circuit
Figure 4-11 Simulation Environment
Figure 4-12 Eight-phase VCO clock
Figure 4-13 eye-diagram of VCO clock
Figure 4-14 eye-diagram of output (Multi-phase Type MUX)
Figure 4-15 eye-diagram of output (3-levels MUX without pre-skew)
Figure 4-16 eye-diagram of output (3-levels MUX with pre-skew)
Figure 4-18 Layout of transmitter
Figure 4-19 post simulation of transmitter
Figure 5-1 Block diagram of the receiver
Figure 5-2 schematic of slicer
Figure 5-3 Simulation of Hysteresis comparator
Figure 5-4 Frequency response of slicer
Figure 5-5 Half-rate CDR architecture
Figure 5-6 Half-rate phase detector
Figure 5-7 Timing scheme of the half-rate phase detector operation
Figure 5-8 perfect locked condition
Figure 5-9 Transfer characteristic of phase detector
Figure 5-10 Half-rate frequency detector
Figure 5-11 timing diagram of FD
Figure 5-12 Circular phase diagram
Figure 5-13 Up and down generator
Figure 5-14 Schematic of the linearization circuit
Figure 5-15 Transfer curve of the linear circuit
Figure 5-16 Transfer curve of VCO (Kvco=160MHz/V - TT)
Figure 5-17 Control voltage of VCO
Figure 5-18 Asynchronous tree-type 2:8 de-multiplexer
Figure 5-19 (a) 1:2 DEMUX (b) timing diagram
Figure 5-20 Illustration of 2:8 DEMUX paralleling the data (D0 is the example)
Figure 5-21 Waveforms of received data and output of the slicer
Figure 5-22 Control voltage of VCO when CDR is in lock state
Figure 5-23 Input data and retimed clock while CDR in the lock state
Figure 5-25 Serial input data and 8-parallel outputs of receiver
Figure 5-26 The variation of control voltage when the phase of input data is

Chapter 1 Introduction

1.1 Motivation

Recently, advances in IC fabrication technology have led to a great growth in the integration level of digital ICs. For the best performance, all high-speed components of a system should be integrated into a single die. However, some technological obstacles prevent the implementation of a System-On-a-Chip (SOC). Therefore, high-speed links will be the key to connecting different modules and chips. While improving the I/O speed, we also need to keep the circuit area small and the power consumption low, so that a single chip integrating the transmitter, receiver, and protocol control still performs well [1].

A basic data link consists of a transmitter, a receiver, and a channel. The transmitter converts the incoming digital data to analog levels and sends it as a serial data stream over the channel to the receiver. A high level and a low level are the logical values of the analog system, just as 0 and 1 are the logical values of the digital system. To detect the logic level of the analog waveform from the channel, the waveform must be amplified at the front of the receiver. A timing recovery circuit in the receiver extracts the required clock from the incoming data. Finally, the receiver converts the serial data back to parallel data.


The goal of this thesis is to design a CMOS serial-link transceiver based on the RSDS interface that meets the specification for delay, cost, data mapping, power consumption, and logic threshold variation. RSDS stands for Reduced Swing Differential Signaling. It is a way to transmit data with a very low differential swing (200mV) over two printed circuit board (PCB) traces or a balanced cable. The following section describes RSDS in more detail.

1.2 Introduction of RSDS

1.2.1 RSDS/LVDS

Reduced Swing Differential Signaling, like its predecessor LVDS (Low Voltage Differential Signaling), originated from LCD manufacturers' need for an on-glass interface with high speed, reduced interconnect, low power, and low EMI. Table 1-1 summarizes the differences between RSDS and LVDS.

Characteristic               RSDS                     LVDS
VOD, output voltage swing    +/- 200mV                +/- 350mV
RTERM, termination           100Ω                     100Ω
IOD, output drive current    2mA                      3.5mA
Data MUX                     2:1                      7:1
Content                      RGB Data                 RGB Data and control
Application                  Intra-system interface   System-system interface

Table 1-1 RSDS/LVDS comparison [2]
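The swing and drive-current rows of Table 1-1 are tied together by Ohm's law across the termination resistor. The short sketch below only illustrates that arithmetic; the function name is hypothetical and it assumes the full drive current flows through the 100Ω termination.

```python
def differential_swing_mv(drive_current_ma: float, termination_ohm: float) -> float:
    """Differential swing in mV, assuming V_OD = I_OD * R_TERM (mA * ohm = mV)."""
    return drive_current_ma * termination_ohm


# Values taken from Table 1-1.
rsds_swing_mv = differential_swing_mv(2.0, 100.0)   # RSDS: 2 mA into 100 ohm
lvds_swing_mv = differential_swing_mv(3.5, 100.0)   # LVDS: 3.5 mA into 100 ohm
print(rsds_swing_mv, lvds_swing_mv)
```

Under this assumption the lower RSDS swing follows directly from its lower drive current, since both interfaces use the same termination.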


1.2.2 Applications RSDS/LVDS

Because of the benefits of their low signal swing, RSDS and LVDS are widely used standards for flat panel interfaces. Table 1-2 lists some applications based on the RSDS/LVDS interface.

PC/Computing                 Telecom/Datacom          Consumer
Flat panel displays          Switches                 Home/office
Monitor link                 Add/drop multiplexers    Set-top boxes
Printer engine link          Box-to-Box
System clustering            Routers                  Game displays/controls
SCI processor interconnect   Hubs                     In-flight entertainment

Table 1-2 RSDS/LVDS applications

1.2.3 The Trend of RSDS

The tendency of the TFT industry toward higher resolution displays requires a new low noise digital interface. The open RSDS technology offers us an industry leading technology platform. Combining the TFT display-related technology with a low power consumption, low noise interface like RSDS will accelerate developing new TFT driver families to achieve next-generation, high-performance TFT LCD modules.

Fig 1-1 illustrates a typical application block diagram of the LCD module. The RSDS bus is located between the panel Timing Controller (TCON) and the Column Drivers. This bus is typically nine pairs wide plus clock and uses a multidrop bus configuration.

Figure 1-1 Block diagram of the LCD module [3]

1.2.4 The Benefits of RSDS

With RSDS technology, designers are able to reduce the size of circuit boards and the bus interconnect, and eliminate discrete components typically used in TFT LCD modules. An XGA (eXtended Graphics Array) panel timing controller combined with a partner's RSDS-enabled XGA column driver forms a powerful solution to reduce size, weight, and cost.

The use of RSDS Technology also enables several other key features and benefits in these new display designs. Substantial power savings, critical in battery-operated and mobile applications can be realized without sacrificing performance and resolution. Significantly reduced EMI-radiated (electro-magnetic interference) noise can be achieved, lowering production costs by eliminating EMI shielding.[3]

1.3 Thesis Organization

This thesis is organized into six chapters, and the first one is the introduction of the RSDS interface. Chapter 2 introduces the specification and background of RSDS interface transmission and shows the basic design of a serial link. Chapter 3 describes the concept and architecture of the Phase-Locked Loop (PLL). Chapter 4 discusses the transmitter architecture. High-speed parallel-to-serial data conversion is achieved by a time-division multiplexer toggled by a low-jitter, 8-phase phase-locked loop. The transmitter simulation result is shown at the end of that chapter. Chapter 5 presents the building blocks of the receiver. The clock and data recovery circuit is introduced and an architecture with improved jitter performance is proposed. The design of the frequency acquisition part is also introduced. The simulated performance of the whole link (including transmitter, cable, and receiver) is shown at the end of that chapter. Chapter 6 concludes the thesis and outlines future work.


Chapter 2 Background

Chapter 2 describes the details of the RSDS specification, some terminology and concepts for the transmission environment, some basic design architectures, and some considerations for performance enhancement.

2.1 RSDS Specification [5]

Reduced Swing Differential Signaling (RSDS) is a signaling standard that defines the output characteristics of a transmitter and the inputs of a receiver, along with the protocol for a chip-to-chip interface between flat panel timing controllers and column drivers. RSDS, a differential interface with a nominal signal swing of 200mV, tends to be used in display applications. It retains many of the benefits of the LVDS interface, which is commonly used between the host and the panel as a high-bandwidth, robust digital interface. RSDS provides many benefits to these applications, which include:

• Reduced bus width – enables smaller thinner column driver boards

• Low power dissipation – extends system run time

• Low EMI generation – eliminates EMI suppression components and shielding

• High noise rejection – maintains signal image

• High throughput – enables high resolution displays


Fig 2-1 shows the RSDS transmitter output swing level for the single-ended and differential signals. The RSDS waveform has a low signal swing of 200mV. Table 2-1 presents the electrical specification for a transmitter (TX) and a receiver (RX).

Figure 2-1 RSDS swing level

TX/RX  Parameter  Definition                   Condition                       MIN   TYP     MAX   Units
TX     VOD        Differential output voltage  RL=100Ω                         100   200     600   |mV|
TX     VOS        Offset voltage               --                              0.5   1.2     1.5   V
TX     IRSDS      RSDS driver current          --                              1     2       6     mA
TX     TR/TF      Transition                   20% to 80%, VOD=200mV, CL=5pF   --    500     --    ps
TX     --         RSDS clock duty cycle        --                              45    50      55    %
RX     VTH        Differential threshold       --                                    +/-100        mV
RX     VCM        Input common mode voltage    --                              0.3           1.5   V
RX     IL         RSDS RX input leakage        --                              -10           10    μA

Table 2-1 Electrical specification of RSDS transmitters and receiver

2.2 Basic Serial Link

As shown in Fig 2-2, the common components of a basic serial link are the transmitter, the channel, and the receiver. To increase the bandwidth of the link, the data are usually in parallel form before being sent by the transmitter. The transmitter converts the digital information to analog levels on the transmission medium, and the driver makes the analog signal differential. The medium on which the signal travels, e.g. coaxial cable or twisted pair, is commonly called the communication channel. The receiver at the end of the channel recovers the original digital information from the incoming signal by amplifying and sampling it. A termination resistor that matches the impedance of the channel minimizes signal reflection. The clock and data recovery circuit at the receiver adjusts the receiver clock based on the received data so that the sampling point sits at the center of the data eye. Finally, the serial-to-parallel interface converts the serial data back to N-bit parallel data.

Figure 2-2 Block diagram of the basic serial link
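The parallel-to-serial and serial-to-parallel blocks of Fig 2-2 can be modeled behaviorally in a few lines. This is only a software sketch of the data ordering, not the circuit used in this thesis; the function names and the LSB-first bit order are assumptions made for illustration.

```python
def serialize(words, n=8):
    """Flatten N-bit parallel words into one serial bit list (LSB first, assumed)."""
    return [(word >> i) & 1 for word in words for i in range(n)]


def deserialize(bits, n=8):
    """Regroup a serial bit list into N-bit words, dropping any incomplete tail."""
    words = []
    for start in range(0, len(bits) - n + 1, n):
        word = 0
        for i, bit in enumerate(bits[start:start + n]):
            word |= bit << i          # same LSB-first order as serialize()
        words.append(word)
    return words


tx_words = [0xA5, 0x3C]
serial_stream = serialize(tx_words)        # 16 bits on the "channel"
rx_words = deserialize(serial_stream)      # recovered parallel words
print(rx_words == tx_words)
```

The round trip only works when both ends agree on the bit order and the word boundary, which is why the receiver needs timing recovery before the serial-to-parallel conversion.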

The performance of a link is mainly characterized by its data bandwidth. Another important parameter of link performance is the bit error rate (BER), the fraction of transmitted bits that are received in error. A link's maximum data rate is specified at a specific BER to guarantee the robustness of the overall system. BER is important not only because errors reduce the effective bandwidth of a system, but also because in many systems error correction techniques can prohibitively increase the system cost. The errors are caused by noise from each part of the system. The intrinsic sources of noise are the random fluctuations due to thermal noise and shot noise in the passive and active system components. In VLSI applications, other non-fundamental sources of noise also limit the performance of the link. These noise sources include coupling from other channels, mutual inductance, switching activity from other circuits integrated with the link circuit, and reflections induced by link imperfections. These types of noise typically have a non-white frequency spectrum and exhibit strong data correlation. Moreover, the noise power is often proportional to the power of the signals. Therefore, there are two main issues in designing a high-speed serial link interface circuit: signaling and clocking. [6]
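To give a feel for what a BER figure means at the 1.2Gbps data rate of this thesis, the sketch below converts a BER into an average time between errors. It assumes BER is counted as errors per transmitted bit and that errors are spread evenly in time; the BER value of 1e-12 is a hypothetical example, not a result from this work.

```python
def mean_time_between_errors_s(bit_rate_bps: float, ber: float) -> float:
    """Average seconds between bit errors, assuming BER = errors per transmitted bit."""
    errors_per_second = bit_rate_bps * ber
    return 1.0 / errors_per_second


# Hypothetical example: 1.2 Gbps link with an assumed BER of 1e-12.
interval_s = mean_time_between_errors_s(1.2e9, 1e-12)
print(f"about one error every {interval_s:.0f} s")
```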

2.3 Noise Issue

When selecting a particular signaling or clocking scheme, the primary goal is to transmit data between system components with the maximum bandwidth while keeping the associated cost low. These costs include the power consumption and the area occupied by the signaling and synchronization circuits, as well as the cost of the required external components. Unfortunately, noise in the digital system makes this objective difficult to achieve. Noise affects the amplitude and timing of the transmitted signal and thus impairs correct reception. These noise sources are either proportional to, or independent of, the original transmitted signal amplitude. Independent noise can be overcome simply by increasing the amplitude of the transmitted signals. Proportional noise is harder to deal with: it can only be minimized or eliminated by carefully designing the signaling circuit and the transmission environment. The most critical proportional noise sources are cross-talk, reflection, and self-induced power supply noise. This section discusses these noise sources and the methods commonly used to deal with them.

2.3.1 Cross-talk [7]

The problem of cross-talk and how to deal with it is becoming more important as system performance and board densities increase. Cable-to-cable coupling gives rise to cross-talk through the distributed capacitive coupling and the distributed inductive coupling between two signal lines. When the cross-talk is measured on an undriven sense line next to a driven line (both terminated in their characteristic impedance), the near end cross-talk and far end cross-talk have quite distinct features, as shown in Fig 2-3. Note that the near end component reduces to zero at the far end and vice versa. At any point in between, the cross-talk is a fractional sum of the near and far end cross-talk waveforms, as shown in the figure. Note also that the far end cross-talk can have either polarity, whereas the near end cross-talk always has the same polarity as the signal causing it.

The amplitude of the noise generated on the undriven sense line is directly related to the edge rates of the signal on the driven line. The amplitude is also directly related to the proximity of the two lines. This is factored into the coupling constants KNE and KFE by terms that include the distributed capacitance per unit length, and the

causes “noise” voltages to appear when adjacent signal paths switch.

Figure 2-3 Cross-talk

Several useful observations that apply to a general case can be made:

• The cross-talk always scales with the signal amplitude VI.

• Absolute cross-talk amplitude is proportional to the slew rate VI / tr, not just 1 / tr.

• Far end cross-talk width is always tr.

• For tr < 2TL, where tr is the transition time of the signal on the driven line and TL is the propagation or bus delay down the line, the near end cross-talk amplitude VNE expressed as a fraction of signal VI is KNE, which is a function of physical layout only.

• The higher the value of tr (slower transition times), the lower the percentage of cross-talk (relative to signal amplitude).
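The observations above can be turned into a rough estimate of the near end cross-talk amplitude. The sketch below uses VNE = KNE × VI for tr < 2TL, as stated in the list. For slower edges it scales the amplitude by 2TL / tr, which is a common first-order approximation assumed here for illustration and not a formula taken from this thesis; the function name and the numeric values are likewise hypothetical.

```python
def near_end_crosstalk_v(v_i: float, k_ne: float, t_r: float, t_l: float) -> float:
    """First-order near end cross-talk amplitude in volts.

    v_i  : aggressor signal swing (V)
    k_ne : near end coupling constant, set by physical layout
    t_r  : aggressor transition time (s)
    t_l  : propagation delay down the coupled line (s)
    """
    if t_r < 2.0 * t_l:
        return k_ne * v_i                    # saturated case from the list above
    return k_ne * v_i * (2.0 * t_l / t_r)    # assumed roll-off for slow edges


# Hypothetical numbers: 200 mV swing, K_NE = 0.05, 1 ns line delay.
fast_edge = near_end_crosstalk_v(0.2, 0.05, 0.5e-9, 1e-9)   # tr < 2TL
slow_edge = near_end_crosstalk_v(0.2, 0.05, 4e-9, 1e-9)     # tr > 2TL
print(fast_edge, slow_edge)
```

In both cases the result scales with VI, which is why reducing the signal swing, as RSDS does, also reduces the absolute cross-talk.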

From the points above, the goal of a serial link, high-speed transmission, makes the effect of cross-talk worse and more significant. Methods to reduce the amplitude of cross-talk include: reducing the amplitude of the transmitted data, arranging the layout carefully to reduce the coefficient KNE (that is, the mutual capacitance and mutual inductance), reducing the number of signal transitions by coding the data, and techniques such as slew-rate control of the driver output signals.

2.3.2 Reflection

Reflection-induced inter-symbol interference is the most common type of proportional noise on a serial link. As in Fig 2-2, signal lines must be terminated. This can be accomplished by placing termination circuits at either the transmitter or the receiver end of the line. The termination circuit absorbs the transmitted signal energy and prevents it from being reflected back into the transmission medium.

The reflection of a signal is given by [8]

$$V_{reflected} = \rho \, V_{incident} \tag{2-1}$$

$$\rho_L = \frac{Z_L - Z_0}{Z_L + Z_0}, \qquad \rho_S = \frac{Z_S - Z_0}{Z_S + Z_0} \tag{2-2}$$

where $\rho_L$ is the load reflection coefficient, $\rho_S$ is the source reflection coefficient, $Z_L$ is the load resistance, $Z_S$ is the source driving-point resistance, and $Z_0$ is the transmission line impedance.

Terminating both at source and destination ends of the transmission medium can be used to alleviate this problem at the expense of increased power dissipation. Automatic impedance control can also be used to reduce reflection noise by dynamically adjusting the termination resistor to match the interconnection characteristic impedance [9].
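As a numerical illustration of equations (2-1) and (2-2), the sketch below evaluates the reflection coefficient for a matched and a mismatched termination. The function names and the example impedances are hypothetical; only the formulas come from the text above.

```python
def reflection_coefficient(z: float, z0: float) -> float:
    """Reflection coefficient (Z - Z0) / (Z + Z0) at a load or source, eq. (2-2)."""
    return (z - z0) / (z + z0)


def reflected_voltage(v_incident: float, rho: float) -> float:
    """Reflected wave amplitude, eq. (2-1)."""
    return rho * v_incident


z0 = 100.0                                        # assumed line impedance in ohms
rho_matched = reflection_coefficient(100.0, z0)   # matched termination
rho_low = reflection_coefficient(50.0, z0)        # termination too small
v_back = reflected_voltage(0.2, rho_low)          # 200 mV incident wave
print(rho_matched, rho_low, v_back)
```

A matched termination gives a coefficient of zero, so no energy returns to the line, while the 50Ω case reflects one third of the incident wave with inverted polarity.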

2.3.3 Power Supply Noise

Self-induced power supply noise is a result of the finite supply pin impedances in the semiconductor package. Power supply noise is perhaps the most important contributor to system noise. When any element switches logic state, the current drawn from the external supply of the chip changes at a rate equal to $dI/dt$. The inductance $L$ of the supply-voltage bonding wire then causes the on-chip power supply voltage to drop by

$$\Delta V = L \frac{dI}{dt}$$

If the drop becomes too large, it can cause internal logic errors. Even a supply spike on one circuit's output could feed an extraneous noise voltage into the next device's input. This is a problem in almost every digital system. However, power supply noise is generally not a dominant voltage noise in differential links. Sending complementary signals allows the total current drawn from (and discharged to) each power supply to be constant, eliminating large current spikes across the power pin inductors or power distribution inductance. Moreover, since the differential pairs are well balanced, to first order any power supply noise coupled to the signal pair at both the transmitter and the receiver is common-mode.

Although power supply noise affects different systems to different degrees, its presence in digital systems has stimulated enormous research efforts in techniques to reduce the noise. Such techniques include minimizing the inductance of power distribution networks, employing constant-current drivers or more generally keeping the total current drawn from each supply constant, increasing the bypass capacitance both on the chip and on the board, using separate power supplies for noise-sensitive circuits, generating on-chip supplies using voltage regulators, slowing down signal transitions using slew rate control [10], and using coding schemes that reduce the switching frequency of signals [11].
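The size of the supply drop can be estimated from the relation between bonding-wire inductance and the rate of change of supply current, ΔV = L × dI/dt. The sketch below approximates dI/dt by a current step divided by its transition time. The 5nH inductance is a hypothetical bond-wire value chosen for illustration; the 2mA step and 500ps edge are borrowed from the RSDS driver current and transition time quoted earlier.

```python
def supply_drop_v(inductance_h: float, delta_i_a: float, delta_t_s: float) -> float:
    """Supply voltage drop L * dI/dt, with dI/dt approximated as delta_i / delta_t."""
    return inductance_h * delta_i_a / delta_t_s


# Hypothetical single-ended case: 5 nH bond wire, 2 mA switched in 500 ps.
drop_v = supply_drop_v(5e-9, 2e-3, 500e-12)
print(f"{drop_v * 1e3:.1f} mV")
```

For a differential driver the supply current ideally stays constant, so the current step seen by the supply, and with it this drop, is much smaller.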

2.4 Signaling Circuits

The noise sources discussed in Section 2.3 are all proportional to the transmitted signal amplitude and hence cannot be overcome by simply increasing the signal swing. Therefore, these noise sources are the primary types of noise that the transmitter and receiver must deal with.

The transmitter drives a HIGH or LOW analog voltage onto the channel and is designed for a particular output-voltage swing based on the system specification. The design issues are to maintain small voltage noise and timing noise on the signal. There are two types of output drivers to drive the output: voltage-mode drivers and current-mode drivers. Voltage-mode drivers, as shown in Fig 2-4 (a), are switches that switch the line voltage. Because the switches are implemented with transistors, the driver appears as a switched resistance. To switch the voltage fully, a small resistance is needed which typically requires a large switching device. In contrast, current-mode drivers, as illustrated in Fig 2-4 (b), are switching current sources. The output impedance of the driver is much higher than the line impedance. It is also called high impedance signaling. Therefore, the transmitter bandwidth is typically not an issue even with significant output capacitance. The voltage to be transmitted on the line is determined by the switched current and the line impedance or an explicit load resistor. The driver can be simply implemented by biasing the MOS transistor in its saturation region. Current-mode drivers are slightly better in terms of insensitivity to supply-power noise because they have high output impedance and hence the signal is tightly coupled only to VOH, the signal return path. The output current does not vary


ground signal. The disadvantage of current-mode drivers is that, to keep the current sources in saturation, the transmitted voltage range must be well above ground, which increases power dissipation.

Figure 2-4 Transmitter timing diagram with different transmitter architectures: (a) voltage-mode, (b) current mode, and (c) differential.

For better supply-noise rejection, the differential mode can be adopted, as shown in Fig 2-4 (c), because the supply noise is now common-mode. Since the current remains roughly constant, the transmitter induces less switching noise on the supply voltage that could benefit other transmitted or received signals on the same die. To reduce reflections at the end of the transmission line, the transmitter needs to be terminated. An off-chip termination resistor could introduce significant impedance mismatches because of the package parasitic components. To incorporate the resistor, with current-mode drivers, an explicit on-chip resistor at the driver can act as the termination resistor. If a resistive layer is not available, a transistor in its linear region can be used as the resistor. With voltage-mode drivers, the design is slightly more complex because the switch resistance should match the line impedance Z0. This may


compensating with an external series resistor, as shown in the Fig 2-4 (a).

2.5 Timing Recovery Architecture

2.5.1 PLL-based Architecture

The task of the timing recovery circuit is to recover the phase and frequency information from the transitions in the received data stream. The optimal sample point is midway between the possible data-transition times. Noise and mismatches inherent to the timing recovery circuit produce jitter in the sampling clocks, which degrades the timing margin. Moreover, transmitter jitter causes uncertainty in the transition points, which makes clock extraction more difficult. As shown in Fig 2-5, two types of timing recovery architectures have been used in links. One is PLL-based (data-recovery PLL) [12] and the other is oversampling phase-picking [13].

Figure 2-5 Timing recovery architecture (a) PLL-based (b) oversampling phase-picking


controls the internal phase by adjusting the frequency of the voltage controlled oscillator (VCO) with the Vctrl signal until the frequency matches that of an external reference. A phase detector detects the phase difference between the sampling clock and the external input data signal, and adjusts the VCO control voltage. The phase detector generally drives a charge pump that converts the phase difference into a charge. A filtered version of this charge becomes the VCO control voltage. Based on the phase information of the data, the best sample is chosen as the data bit by some decision logic. To maintain a good phase relationship between the sampling clock and the data transitions, the PLL should detect the input phase accurately and track any input jitter with a high loop bandwidth. Unfortunately, stability requirements limit the loop bandwidth of the system. Because the timing information is embedded in the data stream, coding of the data is used to ensure a minimum and maximum transition density. A high transition density in the data stream is preferred since it helps maintain the stability of the system.

PLL-based timing recovery architectures can be categorized into full-rate and half-rate architectures. In a full-rate circuit, the position of the data transition is compared to either the falling edge or the rising edge of the clock, and the clock frequency is equal to the data rate, as shown in Fig 2-6 (a). A single-edge-triggered flip-flop can be used to retime the data. In a half-rate circuit, on the other hand, the location of the data transition is compared to both the rising and falling edges of the clock, and the clock frequency is equal to one half of the data rate, as shown in Fig 2-6 (b). Because the clock runs at half the data rate, a double-edge-triggered flip-flop is needed to retime the data.

The most important advantage of half-rate architectures is the reduction of the circuit speed by a factor of two. This often means a reduction of the total power dissipation. In fact, as the operating speed of a circuit approaches the maximum operating frequency of a particular technology, the required power consumption grows exponentially. In addition, the de-multiplexing performed simultaneously by a half-rate architecture is another attractive feature that makes it suitable for serial links. It can reduce the complexity, hardware, and power dissipation of the deserializer.

Figure 2-6 (a) Full-rate data and clock (b) Half-rate data and clock

Duty cycle mismatch is a major concern when employing a half-rate timing recovery architecture. If the spacing between the rising and falling edges of the clock signal differs from half of the clock period, the width of the data eye sampled by the rising edge differs from that sampled by the falling edge, resulting in bimodal jitter. The duty cycle of the clock signal must therefore be considered carefully in the design of a half-rate timing recovery architecture.
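The effect of a duty cycle error on a half-rate sampler can be estimated with a few lines of arithmetic. The sketch below is only an illustration under a simplified model that is not taken from the thesis: rising edges are assumed to sit exactly at the centre of the even bits, so any duty cycle error shifts only the falling-edge samples. The function names are hypothetical.

```python
def falling_edge_offset(data_rate_bps: float, duty_cycle: float) -> float:
    """Offset (s) of the falling-edge sample from the centre of its bit."""
    clock_period = 2.0 / data_rate_bps  # half-rate clock: one period spans two bits
    return (duty_cycle - 0.5) * clock_period


def timing_margin(data_rate_bps: float, duty_cycle: float) -> float:
    """Remaining distance (s) from the falling-edge sample to the nearest bit boundary."""
    bit_time = 1.0 / data_rate_bps
    return bit_time / 2.0 - abs(falling_edge_offset(data_rate_bps, duty_cycle))


if __name__ == "__main__":
    for duty in (0.50, 0.45, 0.40):
        margin_ps = timing_margin(1.2e9, duty) * 1e12
        print(f"duty cycle {duty:.2f}: margin {margin_ps:.1f} ps")
```

Under this model, every percent of duty cycle error costs one percent of the clock period in timing margin for one of the two interleaved bit streams, while the other stream is unaffected. That asymmetry is the origin of the bimodal jitter described above.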

| | Full-Rate | Half-Rate |
|---|---|---|
| Circuit Operation Speed | Symbol Rate | Half of the Symbol Rate |
| Number of Clock Phases | Single Clock Phase | Dual Clock Phase |
| DeMux | None | Can do 1:2 DeMux |
| Clock Duty Cycle | Not Important | Important |
| Jitter Tolerance Margin | Larger | Smaller |

Comparison of full-rate and half-rate architectures[14][15]

2.5.2 Oversampling Phase-picking Architecture

The second timing recovery scheme is oversampling phase-picking, as shown in Fig 2-5 (b). Instead of using a feedback loop to control the sampling phases, the data stream is sampled at multiple phase positions per bit, creating an oversampled representation of the data stream. It does not require data coding or frequency acquisition since the system clock is readily available through the clock channel. The remaining task is to compensate the skew between the clock and the received data stream. Transitions in the data can be extracted from the sampled data. Based on the data transitions, the sample position nearest the center of the bit can be chosen as the data bit. The way the data is chosen is determined by a digital algorithm, such as majority voting [16]. The phase-picking architecture has several advantages. First, it replaces the feedback loop with a feed-forward loop, allowing the selected sample to track phase movements of the data with respect to the clock without an intrinsic bandwidth limitation. The maximum tracking rate is limited only by the transition information present. This fast tracking can potentially follow the jitter accumulation of the transmit PLL. A second advantage of the phase-picking architecture is that a long PLL phase-locking time is not needed. Phase decisions are made whenever input transitions are present. The primary disadvantage of the architecture is an inherent static phase error due to the phase quantization. Higher oversampling ratios could reduce the static phase error but add significant complexity to the design. Furthermore, inherent sampler uncertainty limits the minimum quantization error. More significantly, the increased number of samplers increases the input capacitance, hence limiting the input bandwidth. Therefore, the architecture has a trade-off between the input bandwidth and the static phase offset. For high input bandwidths, the trade-off favors a low oversampling ratio with the penalty of a higher static phase offset due to the coarse quantization. In addition, because of the open-loop mechanism, an error may occur when a sampling point falls exactly on a data edge, which is a poor position for sampling. This condition is usually introduced by the static phase error between the clock and the signal, i.e. the timing skew. The feed-forward loop offers no mechanism to eliminate the effect of timing skew, which may increase the design complexity of the decision algorithm.
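The phase-picking idea can be sketched in software. The example below is a minimal illustration, not the decision logic of [13] or [16]: it assumes 3x oversampling, takes a vote over the positions where transitions occur, and then keeps the sample phase that lies farthest from the winning edge position. All names are hypothetical.

```python
from collections import Counter


def pick_sampling_phase(samples, ratio=3):
    """Choose the sample phase farthest from the observed data edges."""
    # Record, modulo the oversampling ratio, where each transition occurs.
    edge_votes = Counter(
        (i + 1) % ratio
        for i in range(len(samples) - 1)
        if samples[i] != samples[i + 1]
    )
    if not edge_votes:
        raise ValueError("no transitions present: phase cannot be chosen")
    edge_phase = edge_votes.most_common(1)[0][0]
    # A bit starts at edge_phase, so its centre lies ratio // 2 samples later.
    return (edge_phase + ratio // 2) % ratio


def recover_bits(samples, ratio=3):
    """Keep one sample per bit at the chosen phase."""
    phase = pick_sampling_phase(samples, ratio)
    return samples[phase::ratio]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
# Oversample each bit three times, then drop one sample to mimic an unknown skew.
oversampled = [b for b in bits for _ in range(3)][1:]
recovered = recover_bits(oversampled)
```

The sketch also makes the stated weaknesses visible: the phase can only be chosen when transitions are present, and the resolution of the choice is one third of a bit, which is the static phase error caused by quantization.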


Chapter 3 Phase-Locked-Loop

3.1 Introduction

A phase-locked loop (PLL) is basically an oscillator whose phase and frequency are locked to a multiple of the input reference frequency. The PLL is a widely used analog circuit. It can be used to recover a clock from input data, perform synchronization, synthesize frequencies, and generate multiple clock phases with equal phase spacing. Recently, PLL design has come to play a key role in link performance because of the demand for higher bandwidth in high-speed links. In this chapter, a charge-pump PLL is introduced. With a 75MHz reference input, the circuit generates a clock signal at 150MHz. By adopting four differential stages in the voltage controlled oscillator, it generates eight clock phases for use by the eight-to-one multiplexer.

3.2 Phase-Locked Loop Architecture

The block diagram of a typical PLL circuit is shown in Fig 3-1. The structure consists of the following circuits: a Phase-Frequency Detector (PFD), a Charge Pump, a Loop Filter, a Voltage-Controlled Oscillator (VCO) and a Divider. The PLL output frequency is twice the input frequency, so a divide-by-2 circuit is needed. The internal signal generated by the PLL is called Fback and the external signal supplied from outside is called Fref. These two signals are compared by the PFD, which generates the adjusting signals Up and Down for the charge pump. The adjusting signals control the current that charges or discharges the Loop Filter. The VCO generates a clock signal with an adjustable frequency. The frequency depends on the voltage Vctrl, and the relationship is inverse: the frequency decreases as Vctrl increases. The Loop Filter is commonly a low-pass filter and provides extra poles and zeros to suppress the high-frequency signal from the PFD. After a series of comparisons, the phase difference between Fback and Fref becomes constant and their frequencies become nearly the same; the PLL is then said to be “locked”.

Figure 3-1 Block diagram of a phase locked loop

3.3 Circuit Implementation

3.3.1 Phase Frequency Detector (PFD)

The PFD is a digital sequential circuit that detects the phase difference between Fref and Fback. It generates two logic signals, “Up” and “Down”. According to these logic signals, the PLL operates in one of three states, as shown in Fig 3-2. The tri-state operation allows a wide detection range of Δφ = ±2π. It detects both phase error and frequency difference.


Figure 3-2 tri-state diagram of the phase detector

In Fig 3-2, the state Up=1 and Down=1 never persists: it appears only momentarily before the reset clears both outputs. Up and Down have distinct purposes. Up is used to increase the frequency of the signal Fback. In contrast, Down is used to decrease the frequency of Fback. In the case that the reference signal lags the feedback signal, as shown in Fig 3-3, Down is set high first, and Up is set high on the rising edge of the reference signal. The reset then goes high almost immediately and pulls both Up and Down low. In the opposite case, where the reference signal leads the feedback signal, Up is set high first, and Down and the reset are set high when the rising edge of the feedback signal arrives. After these operations repeat for a long time, the PLL synchronizes the reference signal and the feedback signal. The PLL is then “locked” and both Up and Down stay low.
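The tri-state behaviour can be written as a small state machine. This is a behavioural sketch only, with the momentary Up=Down=1 overlap and the reset delay idealised away; the encoding of the states as -1, 0 and +1 and the function names are assumptions made for illustration.

```python
DOWN, IDLE, UP = -1, 0, 1


def pfd_step(state: int, edge: str) -> int:
    """Advance the tri-state PFD on one rising edge.

    edge is "ref" for a rising edge of Fref, "fb" for a rising edge of Fback.
    """
    if edge == "ref":
        # A reference edge asserts Up, or cancels a pending Down via the reset.
        return min(state + 1, UP)
    if edge == "fb":
        # A feedback edge asserts Down, or cancels a pending Up via the reset.
        return max(state - 1, DOWN)
    raise ValueError(f"unknown edge: {edge!r}")


def run_pfd(edges, state=IDLE):
    """Return the state reached after each edge in the sequence."""
    history = []
    for edge in edges:
        state = pfd_step(state, edge)
        history.append(state)
    return history


# Reference lags feedback: Down is asserted first, then the reference edge clears it.
lagging = run_pfd(["fb", "ref"])
# Reference much faster than feedback: the detector stays in the Up state.
faster = run_pfd(["ref", "ref", "ref", "fb"])
```

The second sequence shows why the circuit also detects frequency differences: when one input produces more edges than the other, the detector saturates in the corresponding state instead of alternating.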


Generally, a PFD consists of two D flip-flops, one NOR gate and one delay circuit, as shown in Fig 3-4. In this design, a True Single-Phase Clock (TSPC) D flip-flop is used; Fig 3-5 shows its structure.

Figure 3-4 structure of PFD

Figure 3-5 Dynamic D Flip-Flop TSPC[17]

According to the PFD transfer characteristic curve shown in Fig 3-6, when the phase difference is small, the reset is generated after a very short time. In this condition the Up and Down signals may not reach full swing, and the charge pump cannot reliably recognize them. The loop filter is then neither charged nor discharged, because the Up and Down pulses are too narrow. This phenomenon is called the dead zone. The dead zone is a source of output jitter, because it allows the VCO to accumulate as much random phase error as the extent of the dead zone while receiving no corrective feedback to change the control voltage[18]. The dead zone problem is shown in Fig 3-7. To remove this discontinuity in the transfer curve, a delay circuit is added. If the delay time is precisely matched, the dead zone can be reduced. However, the maximum operating frequency of the PFD is limited by the total reset path delay and is inversely proportional to it [19]. Therefore, the delay time should be kept minimal.

Figure 3-6 PFD transfer characteristic curve


3.3.2 Charge Pump

The charge pump is a circuit that supplies current to the loop filter to adjust the control voltage of the VCO. Charge injection, however, is an undesirable feature of a charge pump. The injection effect is caused by the overlap capacitance of the switch devices and by the capacitance at the intermediate node between the current source and the switch devices.

Fig 3-8 shows a simple charge pump circuit, in which the output is directly affected by the switching noise from the overlap capacitance of the switch devices. In addition, the intermediate nodes between the current sources and the switch devices charge toward the supplies while the switch devices are off.

The charge injection effect results in a phase offset at the input of the phase detector when the PLL is locked, so the jitter increases. When the charge pump current is reduced, the effect becomes relatively larger and the phase offset increases. To solve the problem, the control voltage must be isolated from the switching noise caused by the overlap capacitance of the switch devices. Moreover, to fix the charge-sharing problem, an operational amplifier can be used to buffer the output voltage so that the intermediate nodes are switched to the output of the amplifier while the switches are off[20].

To combat the injection problem, a charge pump circuit is designed as shown in Fig 3-9. In this circuit, the switch devices M13 and M14 are isolated from the sensitive output Vctrl by inserting devices M17 and M18. When the switch devices are off, the intermediate nodes between M13, M14, M17 and M18 are charged toward Vctrl by the gate overdrive of the current source devices. To ensure matching between Ip and In, a cascode current mirror is used. In addition, the gates of devices M16 and M11 are always connected directly to VDD and VSS, so constant currents always flow through M16 and M11. Because the Up and Down signals are full-swing, this architecture ensures that the output current matches the current in M11 and M16 precisely and quickly.

Figure 3-8 Charge pump with charge injection effect


3.3.3 Voltage-Controlled Oscillator (VCO)

The building blocks of the VCO include a four-stage ring oscillator and a self-biased replica-feedback bias generator. Fig 3-10 and Fig 3-11 show the schematics of the four-stage VCO and of the delay cell.

Figure 3-10 Schematic of the four stages VCO

Figure 3-11 Schematic of VCO delay cell with symmetric load elements

The voltage-controlled oscillator is a critical and sensitive block in the PLL system. To obtain a low-jitter output clock in a mixed-mode circuit, the delay buffer used in the VCO should have low sensitivity to supply and substrate noise. Therefore, the basic building block of the VCO used in this thesis is the differential delay stage with symmetric loads[21]. The I-V curve of the delay stage with a symmetric load is shown in Fig 3-12[22]. Although the I-V curve is nonlinear, it is symmetric about the center of the output voltage swing, so the delay stage has high noise immunity.

Figure 3-12 The symmetric load I-V curve

Based on the scheme shown in Fig 3-11, the effective resistance of the symmetric load is directly proportional to the small-signal resistance at the ends of the swing range, which is one over the transconductance (gm) of one of the two equally sized devices when biased at the control voltage. Thus, the delay per stage can be expressed as

$$t_d = R_{eff} \times C_{eff} = \frac{1}{g_m} \times C_{eff} \tag{3-1}$$

where Ceff is the effective delay cell output capacitance and Reff is the effective resistance of the delay cell. The drain current of one of the two equally sized devices at Vctrl is given by

$$I_d = \frac{k}{2}\left[(V_{dd} - V_{ctrl}) - V_{tp}\right]^2 \tag{3-2}$$

where k is the device transconductance parameter of the PMOS device. Taking the derivative with respect to (Vdd-Vctrl), the transconductance is given by

$$g_m = k\left[(V_{dd} - V_{ctrl}) - V_{tp}\right] \tag{3-3}$$

Combining (3-1) with (3-3), the delay of each stage can be written as

$$t_d = \frac{C_{eff}}{k\left[(V_{dd} - V_{ctrl}) - V_{tp}\right]} \tag{3-4}$$

The period of a ring oscillator with N delay stages is approximately 2N times the delay per stage. This translates to a center frequency of

$$f_{vco} = \frac{1}{2N t_d} = \frac{k\left[(V_{dd} - V_{ctrl}) - V_{tp}\right]}{2N C_{eff}} \tag{3-5}$$

The gain of the VCO is defined as the absolute value of the slope of the frequency-Vctrl curve. Thus, Kvco can be expressed as

$$K_{vco} = \left|\frac{\partial f_{vco}}{\partial V_{ctrl}}\right| \tag{3-6}$$

As a result, the center frequency of the VCO is directly proportional to (Vdd − Vctrl − Vtp); it depends on the control voltage measured relative to the supply rather than on the absolute supply voltage. Kvco is independent of the buffer bias current, and the VCO is linear to first order.
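Equations (3-5) and (3-6) are easy to explore numerically. The sketch below evaluates the first-order model; the values of k, Ceff and Vtp are placeholders chosen only for illustration and are not the device parameters of this design, and Vtp is treated as a positive magnitude.

```python
def vco_frequency(vctrl, vdd=3.3, vtp=0.7, k=1e-3, c_eff=1e-12, n_stages=4):
    """Center frequency (Hz) of an N-stage ring VCO, following (3-5).

    k (A/V^2), c_eff (F) and vtp (V, magnitude) are illustrative placeholders.
    """
    overdrive = (vdd - vctrl) - vtp
    if overdrive <= 0.0:
        return 0.0  # load devices are off: the first-order model gives no oscillation
    return k * overdrive / (2 * n_stages * c_eff)


def vco_gain(k=1e-3, c_eff=1e-12, n_stages=4):
    """Magnitude of the slope of (3-5) with respect to Vctrl, in Hz/V."""
    return k / (2 * n_stages * c_eff)


if __name__ == "__main__":
    for v in (1.0, 1.5, 2.0):
        print(f"Vctrl = {v:.1f} V -> f = {vco_frequency(v) / 1e6:.1f} MHz")
    print(f"Kvco = {vco_gain() / 1e6:.1f} MHz/V")
```

In this model the slope does not depend on Vctrl, which is the first-order linearity mentioned in the text, and the frequency falls as Vctrl rises.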

Figure 3-12 Replica-feedback current source bias circuit

The VCO bias generator, which provides the bias voltages Vbn and Vbp, is shown in Fig 3-12. It is composed of an amplifier bias, a differential amplifier, a half-buffer replica and a control voltage buffer. Its task is to adjust the buffer bias current and to provide the correct lower swing limit, Vctrl, for the buffer stages. To accomplish this, the differential amplifier and the half-buffer replica form a negative feedback loop that forces the voltage Vx to equal Vctrl, so that the output swing varies with the control voltage rather than being fixed. In order to track all variations at frequencies that affect the PLL, the bandwidth of the bias generator is typically set at least equal to the center frequency of the delay stages.

The bias generator also provides a buffered version of Vctrl at the Vbp output using an additional half-buffer replica. This output isolates Vctrl from potential capacitive coupling in the buffer stages. One important issue remains: the supply-independent bias circuit has a degenerate bias point. If all the transistors carry no current at the beginning, they may remain off indefinitely when the supply is turned on, because the loop is also balanced when no device carries current. Therefore, an additional start-up circuit is necessary to drive the loop out of the degenerate bias point.

Figure 3-13 Schematic of differential-to-single-ended converter

The differential-to-single-ended converter is shown in Fig 3-13. It consists of two opposite-phase NMOS differential amplifiers driving two PMOS common-source amplifiers connected by an NMOS current mirror. The first-stage NMOS differential amplifiers amplify the small differential input signal to drive the next-stage PMOS amplifiers, which generate a single-ended full-swing signal. The two differential amplifiers use the same current source bias voltage, Vbn, generated by the self-biased generator for the VCO. With Vbn, the circuit tracks the input common-mode voltage level and provides signal amplification. Inverters are added at the output to improve the driving ability.

The duty-cycle corrector, shown in Fig 3-14[23], is connected after the differential-to-single-ended converter to ensure that the duty cycle of the VCO output is 50%. The duty-cycle correction circuit consists of only two transmission gates and two inverters, so its area is minimal and its power consumption is negligible. The signal Vin+, selected from the multiphase signals, turns on M3 and M4 and charges the output node Vout of the duty-cycle corrector almost instantaneously, because the discharge path of the output node has already been turned off by the signal Vin-. The signal Vin-, which is also selected from the multiphase signals, is the one whose rising edge is shifted by 180° in phase from that of Vin+. Similarly, the signal Vin- rapidly discharges the node Vout, which delivers the desired 50% duty-cycle signal. The duty-cycle corrector is used in several places in this thesis, as described in later sections.

Figure 3-14 Schematic of duty-cycle corrector and its timing diagram

3.3.4 Loop Filter

The loop filter used in this thesis is a low-pass filter that suppresses the high-frequency signal generated by the PFD; the circuit is shown in Fig 3-15. The capacitor C0 in series with the resistor R1 provides a zero in the open-loop response. This zero improves the phase margin and the overall stability of the loop. The shunt capacitor C1 suppresses the discrete voltage pulses that disturb VCO operation. However, a large C1 can adversely affect the overall stability of the loop.

Figure 3-15 2nd order passive loop filter

3.3.5 Divider

In our PLL, a divide-by-2 circuit is needed so that the output runs at twice the input reference frequency. We use a TSPC D flip-flop with its inverted output connected to its D input; the circuit connection is shown in Fig 3-16(a)[24]. The driving capability of the input clock must be checked to ensure correct operation of this circuit. The schematic of the divider is shown in Fig 3-16(b).
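A flip-flop whose inverted output feeds its own D input toggles on every rising clock edge, which halves the frequency. The behavioural model below illustrates that principle only; it ignores the TSPC circuit details and the clock drive strength issue mentioned above, and the function name is hypothetical.

```python
def divide_by_two(clock):
    """Model a rising-edge D flip-flop with D tied to the inverted output."""
    q = 0
    previous = 0
    output = []
    for level in clock:
        if previous == 0 and level == 1:  # rising edge of the input clock
            q ^= 1                        # D = not Q, so Q toggles
        output.append(q)
        previous = level
    return output


input_clock = [0, 1] * 4          # four input clock periods
divided = divide_by_two(input_clock)
```

The output completes one period for every two input periods and has a 50% duty cycle regardless of the duty cycle of the input, because it changes only on rising edges.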

Figure 3-16 (a) Divide-by-2 connection of the TSPC D flip-flop (b) schematic of the divider

3.4 Fundamentals of PLL

3.4.1 PLL Linear Model

Figure 3-17 PLL linear model (blocks: Kcp, Hlp(s), Kvco/s, ÷N)

The phase-locked loop is a highly nonlinear system. However, when the system is in lock, its dynamic response to input-signal phase and frequency changes can be approximated by a linear model. Fig 3-17 shows the linear mathematical model of the PLL in the locked state.

When the PLL is locked, the PFD together with the charge pump produces an output that is proportional to the phase error $\theta_e$, with a gain of $\frac{I_p}{2\pi}$. The average error current within a cycle is $i_d = \frac{I_p}{2\pi}\theta_e$, so the ratio of the output current to the input phase difference, Kcp, is $\frac{I_p}{2\pi}$ (A/rad). The loop filter has a transfer function Hlp(s) (V/A). To keep the mathematics simple, the parasitic shunting capacitance C1 may be omitted. Hlp(s) can then be simplified to $R_1 + \frac{1}{sC_0}$. Kv (Hz/V) is the ratio of the VCO frequency change to the control voltage variation. Since phase is the integral of frequency over time, Kv (Hz/V) should be converted to $\frac{K_{vco}}{s} = \frac{2\pi K_v}{s}$ (rad/sec·V). N is the divider ratio, the ratio of the output frequency to the reference input frequency.

The open-loop transfer function of the PLL can be represented as

$$G(s) = \frac{\theta_{out}(s)}{\theta_{in}(s)} = \frac{I_p K_v H_{lp}(s)}{sN} \tag{3-7}$$

From feedback theory, the closed-loop transfer function of the PLL can be found as

$$H(s) = \frac{\theta_{out}(s)}{\theta_{in}(s)} = \frac{N\,G(s)}{1 + G(s)} \tag{3-8}$$

Equations (3-7) and (3-8) can then be combined. Substituting $H_{lp}(s) = R_1 + \frac{1}{sC_0}$ into (3-7) gives

$$H(s) = N\,\frac{\frac{I_p K_v}{N C_0}\left(1 + sR_1C_0\right)}{s^2 + \frac{I_p K_v}{N}R_1\,s + \frac{I_p K_v}{N C_0}} \tag{3-9}$$

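This can be compared with the classical two-pole system transfer function

$$H(s) = N\,\frac{\omega_n^2\left(1 + \frac{s}{\omega_z}\right)}{s^2 + 2\zeta\omega_n s + \omega_n^2} \tag{3-10}$$

Then the natural frequency $\omega_n$, the zero of the loop filter $\omega_z$ and the damping factor $\zeta$ can be derived as

$$\omega_n = \sqrt{\frac{I_p K_v}{N C_0}} = \sqrt{\frac{K_{cp} K_{vco}}{N C_0}} \tag{3-11}$$

$$\omega_z = \frac{1}{R_1 C_0} \tag{3-12}$$

$$\zeta = \frac{\omega_n}{2\omega_z} = \frac{R_1}{2}\sqrt{\frac{K_{cp} K_{vco} C_0}{N}} = \frac{R_1}{2}\sqrt{\frac{I_p K_v C_0}{N}} \tag{3-13}$$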
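Equations (3-11) to (3-13) can be evaluated directly. The sketch below plugs in the component values listed in Table 3-2, under the assumption that the tabulated KVCO of 130MHz/V plays the role of Kv in Hz/V and that the charge pump current is Ip; the function name is hypothetical.

```python
import math


def loop_parameters(ip, kv, n, c0, r1):
    """Return (omega_n, omega_z, zeta) from (3-11), (3-12) and (3-13).

    ip in A, kv in Hz/V, c0 in F, r1 in ohm; frequencies are returned in rad/s.
    """
    omega_n = math.sqrt(ip * kv / (n * c0))
    omega_z = 1.0 / (r1 * c0)
    zeta = omega_n / (2.0 * omega_z)
    return omega_n, omega_z, zeta


# Values as listed in Table 3-2 (their mapping onto Ip and Kv is an assumption).
omega_n, omega_z, zeta = loop_parameters(
    ip=120e-6, kv=130e6, n=2, c0=84.81e-12, r1=2.66e3
)

if __name__ == "__main__":
    print(f"omega_n = {omega_n:.3e} rad/s")
    print(f"omega_z = {omega_z:.3e} rad/s")
    print(f"zeta    = {zeta:.3f}")
```

Because the shunt capacitor C1 is neglected in this second-order model, the printed values are only a first estimate of the behaviour of the actual third-order loop.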

In a 2nd –order system, the loop bandwidth of the PLL is determined by ω_n. But the -3dB bandwidth should be IpKvcoC0 _(Hz)

K

N

As for the damping factor, a large value brings a sluggish response and a longer locking time. At the other extreme, if the value is too small, the step response rings and the system approaches instability. As a compromise between the two, ζ = 1.414 is adopted for this work.

3.4.2 PLL Noise Analysis and Stability

Figure 3-18 PLL linear model with various equivalent noise sources

The transfer function can be derived for disturbances injected at various points in the PLL, as shown in Fig 3-18. There are three interference sources: in(s), vn(s) and θn(s). The first is current noise injected at the output of the phase detector and charge pump. The second is voltage noise injected at the output of the loop filter. The third is phase error injected by the VCO. Table 3-1 shows the response equations for the three interference sources.

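| Source | Noise transfer function |
|---|---|
| in(s) | $H_i(s) = \frac{\theta_{out}(s)}{i_n(s)} = \frac{\frac{K_{vco}}{C}\left(1 + sRC\right)}{s^2 + 2\zeta\omega_n s + \omega_n^2}$ (3-14) |
| vn(s) | $H_v(s) = \frac{\theta_{out}(s)}{v_n(s)} = \frac{sK_{vco}}{s^2 + 2\zeta\omega_n s + \omega_n^2}$ (3-15) |
| θn(s) | $H_\theta(s) = \frac{\theta_{out}(s)}{\theta_n(s)} = \frac{s^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}$ (3-16) |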
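The shape of the three noise transfer functions (3-14) to (3-16) can be checked by evaluating their magnitudes at a few frequencies. The parameter values below are arbitrary placeholders picked for illustration; they are not the loop parameters of this design, and the function name is hypothetical.

```python
def noise_gains(w, wn, zeta, kvco, r, c):
    """Magnitudes of (3-14), (3-15) and (3-16) at angular frequency w (rad/s)."""
    s = 1j * w
    denominator = s * s + 2.0 * zeta * wn * s + wn * wn
    h_i = (kvco / c) * (1.0 + s * r * c) / denominator  # charge pump current noise
    h_v = s * kvco / denominator                         # loop filter voltage noise
    h_theta = s * s / denominator                        # VCO phase noise
    return abs(h_i), abs(h_v), abs(h_theta)


# Placeholder values, chosen only to make the three shapes visible.
PARAMS = dict(wn=1e7, zeta=1.0, kvco=1e9, r=2e3, c=1e-10)

low = noise_gains(1e4, **PARAMS)     # far below the natural frequency
mid = noise_gains(1e7, **PARAMS)     # at the natural frequency
high = noise_gains(1e10, **PARAMS)   # far above the natural frequency

if __name__ == "__main__":
    for label, gains in (("low", low), ("mid", mid), ("high", high)):
        print(label, ["%.3g" % g for g in gains])
```

The current-noise response is largest at low frequency, the voltage-noise response peaks near the natural frequency, and the VCO phase-noise response approaches unity at high frequency.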

From the above equations, the transfer functions $H_i(s)$, $H_v(s)$ and $H_\theta(s)$ are respectively low-pass, band-pass and high-pass[25][26]. One way to reduce the noise impact is to increase the loop bandwidth $\omega_n$ by increasing the factor Kcp. However, the maximum $\omega_n$ is restricted by the update frequency $\omega_{ref}$ of the phase detector. From the analysis in [19], the stability limit can be derived as:

$$\omega_n^2 < \frac{\omega_{ref}^2}{\pi\left(RC\,\omega_{ref} + \pi\right)} \tag{3-17}$$

In general, $\omega_n$ is kept below approximately 1/10 of the phase detector update frequency $\omega_{ref}$ to avoid instability. So the restriction on the maximum natural frequency of the loop is $\omega_n < \frac{1}{10}\omega_{ref}$.
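Both conditions, the limit (3-17) and the one-tenth rule of thumb, can be checked for a candidate natural frequency. In the sketch below the reference frequency of 75MHz and the filter values R1 and C0 come from the text and Table 3-2, while the candidate natural frequencies are illustrative inputs. The function name is hypothetical.

```python
import math


def stability_check(omega_n, f_ref_hz, r, c):
    """Compare omega_n (rad/s) with the limit (3-17) and with omega_ref / 10."""
    omega_ref = 2.0 * math.pi * f_ref_hz
    limit_squared = omega_ref ** 2 / (math.pi * (r * c * omega_ref + math.pi))
    return {
        "omega_n_max": math.sqrt(limit_squared),          # bound from (3-17)
        "within_limit": omega_n ** 2 < limit_squared,
        "within_rule_of_thumb": omega_n < omega_ref / 10.0,
    }


# R1 and C0 as listed in Table 3-2; omega_n values are illustrative inputs.
safe = stability_check(1.0e7, f_ref_hz=75e6, r=2.66e3, c=84.81e-12)
too_fast = stability_check(1.0e8, f_ref_hz=75e6, r=2.66e3, c=84.81e-12)

if __name__ == "__main__":
    print(safe)
    print(too_fast)
```

With these filter values the bound from (3-17) is tighter than one tenth of the reference frequency, so the rule of thumb alone would not guarantee that (3-17) holds.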

3.5 Loop Parameters Consideration

Having described each building block in detail, we note that the choice of loop parameters strongly affects the system performance and must be considered carefully. Following the derivation of the transfer function and the noise analysis above, two conditions must be satisfied for the PLL to be stable and for the simplification from a third-order to a second-order system to be accurate. First, the capacitor C1 that shunts the control voltage for ripple suppression must be much smaller than the filtering capacitor C0. This can be explained by (3-18):

$$\omega_z = \frac{1}{R_1 C_0}\,;\qquad \omega_p = \frac{1}{R_1}\cdot\frac{C_0 + C_1}{C_0 C_1} = \omega_z\left(1 + \frac{C_0}{C_1}\right) \tag{3-18}$$

If $C_0 > 20\,C_1$, the higher-frequency pole induced by $C_1$ can be ignored. Second, as proposed in [27], (3-17) must be satisfied for system stability. As a rule, by keeping $\omega_{ref} > 10\,\omega_n$, stability can be assumed in the discrete-time model as well as in the continuous-time model. Under this premise, the remaining loop parameters are taken into consideration: specifically, the natural frequency $\omega_n$, the damping factor $\zeta$ and, most importantly, the phase margin of the open-loop system.
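As a worked check of the first condition, the capacitor values listed in Table 3-2 can be substituted into (3-18). This only restates the arithmetic and assumes that the tabulated C0 and C1 are the final design values:

```latex
\[
\frac{C_0}{C_1} = \frac{84.81\ \mathrm{pF}}{2.72\ \mathrm{pF}} \approx 31.2 > 20,
\qquad
\frac{\omega_p}{\omega_z} = 1 + \frac{C_0}{C_1} \approx 32.2
\]
```

The extra pole therefore lies about 32 times above the zero, which supports treating the loop as second order near the natural frequency.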

Fig 3-19 shows the open-loop frequency response of the PLL. The curve gives a phase margin of approximately 70°. The parameters of the PLL are listed in Table 3-2. The simulated Vctrl timing diagram and the VCO transfer characteristic are shown in Fig 3-20 and Fig 3-21. The supply voltage is 3.3V and Vctrl lies in the range of 1.0V to 2.0V. The gain of the VCO, Kvco, is 130 MHz/V.

Figure 3-20 Vctrl timing diagram

Figure 3-21 VCO transfer characteristic: frequency (MHz) versus Vctrl (V) at the tt, ff and ss corners

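| Parameter | Value |
|---|---|
| Charge Pump Current (Icp) | 120 µA |
| VCO Center Frequency (Fvco) | 130 MHz |
| KVCO | 130 MHz/V |
| Divided by N | 2 |
| Loop Bandwidth | 4000 kHz |
| Phase Margin | 70 degrees |
| Parameters of Loop Filter | C0 = 84.81 pF, C1 = 2.72 pF, R1 = 2.66 kΩ |

Table 3-2 Parameters of the PLL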

Chapter 4 Transmitter

4.1 Architecture of Transmitter

Figure 4-1 Block diagram of the transmitter

Fig 4-1 shows the components of the transmitter. The transmitter is built from a PRBS circuit, a PLL, a 2-to-1 multiplexer, an 8-to-1 multiplexer with a pre-skew circuit, and a data driver. The purpose of the Pseudo Random Bit Sequence (PRBS) circuit is to generate a series of test data. The 2-to-1 multiplexer selects the input data from either the test data or the actual channel data. The 8-to-1 multiplexer reduces the frequency requirement of the timing circuit: it serializes the parallel, low-speed data into a 1.2Gbps serial data stream using the eight-phase 150MHz clock signals generated by the PLL. A pre-skew circuit is needed to keep the multiplexer from sampling the data during a transition. Finally, the data driver transmits the data stream with a nominal swing of 200mV. The following sections describe the circuit and the function of each block of the transmitter in detail.

4.2 Pseudo Random Bit Sequence (PRBS)

Figure 4-2 PRBS delay cell circuit

The Pseudo Random Bit Sequence circuit is designed to generate random sequential data for testing. The delay cell of the PRBS is shown in Fig 4-2. In a chain of delay cells, each cell supplies its signal to the next cell. The signal from the XOR gate feeds the chain again, so the delay cells keep generating new data. In this way the PRBS generates a random pattern. In fact, the pattern does repeat, every $2^7 - 1 = 127$ clock cycles. Note also that if the initial state of every delay cell is zero, the PRBS remains in this degenerate state. Therefore, a SET signal is needed to start up the PRBS. The outputs of the seven delay cells and of the XOR gate form the eight parallel input data of the transmitter. The architecture is shown in Fig 4-3.
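The period of 127 and the all-zero lock-up state can both be reproduced with a software model of a 7-bit shift register. The feedback taps are not stated in the text, so the sketch assumes the widely used PRBS7 polynomial x^7 + x^6 + 1; the tap positions of the actual circuit may differ, and the function names are hypothetical.

```python
def lfsr_step(state: int) -> int:
    """Advance a 7-bit shift register by one clock.

    The feedback polynomial x^7 + x^6 + 1 is an assumption made for illustration.
    """
    new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of the two tapped cells
    return ((state << 1) | new_bit) & 0x7F


def lfsr_period(seed: int) -> int:
    """Count the clocks needed for the register to return to its seed."""
    state = lfsr_step(seed)
    count = 1
    while state != seed:
        state = lfsr_step(state)
        count += 1
    return count


if __name__ == "__main__":
    print("period from all-ones seed:", lfsr_period(0x7F))
    print("period from all-zero seed:", lfsr_period(0x00))
```

An all-zero seed maps onto itself, which is the degenerate state that the SET signal exists to avoid.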

Figure 4-3 Scheme of Pseudo Random Bit Sequence (PRBS)

4.3 Multiplexer (8 to 1)

4.3.1 The Algorithm for Parallel to Serial

Figure 4-4 Timing diagram of 8:1 multiplexer (clk0 to clk7 and the output stream D0 to D7)

When the PLL produces the eight-phase 150MHz clock signals, a serial data stream at 1.2Gbps can be formed; the relationship between clk0~clk7 and the output data stream is shown in Fig 4-4. In this thesis, a 3-level MUX is used to convert the 8 parallel data inputs into one serial stream, as shown in Fig 4-6. The timing schedule and the function of each MUX cell therefore have to be worked out. As the shaded area in Fig 4-4 shows, D0 is passed to the output stream when clk(0,1,6,7) are on. The same holds for D1, D2, …, D7. The complete relationship is listed in Table 4-1, which helps in understanding the logic function of the 3-level MUX.

| Input | Clocks on | Critical clock (level 1) | Critical clock (level 2) | Critical clock (level 3) |
|---|---|---|---|---|
| D0 | (0,1,6,7) | (6,2) | (7,3) | (1,5) |
| D1 | (0,1,2,7) | (6,2) | (7,3) | (1,5) |
| D2 | (0,1,2,3) | (0,4) | (7,3) | (1,5) |
| D3 | (1,2,3,4) | (0,4) | (7,3) | (1,5) |
| D4 | (2,3,4,5) | (2,6) | (3,7) | (1,5) |
| D5 | (3,4,5,6) | (2,6) | (3,7) | (1,5) |
| D6 | (4,5,6,7) | (4,0) | (3,7) | (1,5) |
| D7 | (5,6,7,0) | (4,0) | (3,7) | (1,5) |

Table 4-1 Deductive logic of the 3-level multiplexer

As shown in the table above, the first MUX level has to separate adjacent input data. For example, comparing D0 with D1, all of their clocks are the same except clk(6,2). We therefore define clk(6,2) as the critical clock pair that separates D0 from D1. In the second level, D0 and D1 are treated as one group and D2 and D3 as another. Observing D0~D3, clk(7,3) is the critical clock pair in the second-level MUX. Similarly, D0~D3 and D4~D7 can be separated into two groups in the same way. Table 4-1 summarizes how the data pass through the 3-level MUX, with the critical clocks of each level marked. It is useful for deriving the algorithm and the construction of the 3-level MUX.
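The first-level entries of Table 4-1 can be regenerated programmatically. The sketch below assumes, from the pattern visible in the table, that input Dk is selected while clocks k-2, k-1, k and k+1 (modulo 8) are on; the critical pair of two adjacent inputs is then the one clock that each of them has and the other lacks. The function names are hypothetical, and only level 1 is modelled.

```python
def clock_window(k: int, phases: int = 8) -> set:
    """Clock phases that are on while input D_k is selected (pattern of Table 4-1)."""
    return {(k + offset) % phases for offset in (-2, -1, 0, 1)}


def level1_critical(pair_index: int, phases: int = 8) -> tuple:
    """Clock pair that distinguishes D_(2j) from D_(2j+1) in the first MUX level."""
    even = clock_window(2 * pair_index, phases)
    odd = clock_window(2 * pair_index + 1, phases)
    only_even = even - odd  # the one clock that is on for the even input only
    only_odd = odd - even   # the one clock that is on for the odd input only
    return (only_even.pop(), only_odd.pop())


windows = [sorted(clock_window(k)) for k in range(8)]
critical_pairs = [level1_critical(j) for j in range(4)]
```

Each pair consists of two clocks that are four phases apart, that is, 180 degrees out of phase, which matches the first-level entries (6,2), (0,4), (2,6) and (4,0) in the table.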

4.3.2 MUX Architecture

Figure 4-5 Multi-phase Type MUX


