A 1.6 Gbps RSDS Serial-Link Transceiver
A 1.6 Gbps RSDS Serial-Link Transceiver

Student: Chun-Fan Chung (鍾竣帆)
Advisor: Prof. Jiin-Chuan Wu (吳錦川)

Master's Thesis, Department of Electronics Engineering & Institute of Electronics, National Chiao Tung University

A Thesis Submitted to Department of Electronics Engineering & Institute of Electronics, College of Electrical Engineering and Computer Science, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Master of Science in Electronic Engineering

September 2005
Hsin-Chu, Taiwan, Republic of China
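A 1.6 Gbps Reduced Swing Differential Signaling Transceiver

Student: Chun-Fan Chung. Advisor: Dr. Jiin-Chuan Wu.
Department of Electronics Engineering & Institute of Electronics, National Chiao Tung University

Abstract (Chinese)

As integrated-circuit process technology advances, the demand for high-bandwidth, low-latency data transfer between chips has grown with it. The maximum throughput per unit time that the interface circuits can reach is often the key limit on overall system speed.

This thesis describes the design of a transmitter that uses reduced swing differential signaling for a high-speed serial digital video interface. Two transmitters operating at a data rate of 1.6 Gbps are designed. They differ in the transmitted clock: the first sends a 100 MHz clock, and the second sends an 800 MHz clock.

The transmitter consists of an eight-phase phase-locked loop (PLL), a pseudo random bit sequence (PRBS) generator, a four-to-one multiplexer, a clock process circuit, and output data and clock drivers. The eight-phase PLL has an input frequency of 100 MHz and outputs eight uniformly distributed clock phases at 400 MHz. It comprises a phase/frequency detector, a charge pump, a loop filter, a four-stage differential voltage-controlled oscillator and a divide-by-four divider. The uniformly distributed clocks generated by the PLL are supplied to the PRBS generator and the four-to-one multiplexer, which convert a set of parallel data into a serial output. After the clock process circuit has processed the clock, the clock and the serial data are driven onto the transmission line, which completes the transmitter design.

The receiver uses a comparator with hysteresis to amplify the incoming data and clock into digital signals. The first type of receiver then uses the 100 MHz clock to generate uniformly distributed 400 MHz clocks for sampling, while the second type samples with an 800 MHz clock at half the input data rate. Finally, the de-multiplexer converts the output of the clock and data recovery circuit into four parallel data channels.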
A 1.6 Gbps RSDS Serial-Link Transceiver

Student: Chun-Fan Chung. Advisor: Prof. Jiin-Chuan Wu.
Department of Electronics & Institute of Electronics, National Chiao-Tung University

Abstract

As IC fabrication technology advances, the need for high-bandwidth, low-latency inter-chip data transfer has also increased. In most cases, the key limitation of the whole system is the maximum amount of data that the transmission interface circuit can transfer per unit time.

This thesis describes the design of a high-speed serial-link I/O interface. Two types of transceiver operating at 1.6 Gbps are designed. They differ in the frequency of the transmitted clock: the Type 1 transceiver sends a 100 MHz clock, and the Type 2 transceiver sends an 800 MHz clock.

The transmitter is composed of an eight-phase PLL, PRBS circuits, 4-1 multiplexers, a clock process circuit, and output data and clock drivers. The input frequency of the eight-phase PLL is 100 MHz, and it outputs eight uniformly distributed clocks at 400 MHz. The PLL consists of a phase/frequency detector, a charge pump, a loop filter, a four-stage differential VCO and a divide-by-four divider. It supplies the PRBS circuit and the 4-1 multiplexer with four uniformly distributed clocks to convert the parallel pseudo-random data into a serial stream. The serial data is then transmitted by an output data driver. Finally, the transmitter drives the serial data and clock onto the transmission bus.

The receiver uses a comparator with hysteresis to amplify the incoming data and clock to full swing. The Type 1 receiver then uses the 100 MHz clock to generate four uniformly distributed 400 MHz clocks to sample the data. The Type 2 receiver samples with the 800 MHz clock, which runs at half the input data rate. Finally, the de-multiplexer converts the serial outputs into four parallel data channels.
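Acknowledgements

First, I would like to thank my advisor, Prof. Jiin-Chuan Wu (吳錦川), for his careful guidance during my two years of master's research. I benefited greatly from him, whether in building professional knowledge, in the attitude toward research, or in ways of approaching problems. I also thank Prof. 陳巍仁 and my seniors 藍正豐 and 邱煥凱 for taking the time to serve on my oral examination committee and for their many valuable suggestions.

The completion of this research owes much to the many senior students of Lab 307; thank you for your guidance over these two years. I also thank my seniors 阿瑞, 周政賢 and 權哲 for their teaching, from which I benefited greatly. My heartfelt thanks go to all of you. Thanks also to the companions who worked alongside me in Room 527: 鍵樺, 志朋, 傑忠, 啟佑, 靖驊, 弼嘉, 建樺, 阿信, 瑋銘, 岱原 and 吳諭. Special thanks go to my fellow members of Prof. Wu's group, who discussed research with me day to day and, outside research, encouraged one another and shared good times, adding much fun and energy to a graduate life heavy with coursework.

I also thank my parents and my family. Thank you, my parents, for the care and effort you have spent raising me since childhood, for giving me the greatest support and encouragement when I was busy or discouraged, and for your many suggestions about my direction in life.

I dedicate this thesis to everyone who cares about me and to the friends I care about.

Chun-Fan Chung
National Chiao Tung University
September 2005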
CONTENTS

ABSTRACT (CHINESE)
ABSTRACT (ENGLISH)
ACKNOWLEDGEMENT
CONTENTS
TABLE CAPTIONS
FIGURE CAPTIONS

Chapter 1 Introduction
1.1 Motivation
1.2 Thesis Organization

Chapter 2 Background
2.1 Basic Serial-link Transceiver
2.1.1 Architecture
2.1.2 High-speed And Low-Power Transceiver Circuits Design
2.2 Signaling Circuits
2.3 Introduction to RSDS
2.3.1 Scope
2.3.2 Electrical Specifications and Bus Configurations
2.3.3 Applications
2.4 Basic Link Design

Chapter 3 Phase-Locked Loop
3.1 Introduction
3.2 Architecture of PLL
3.3 Circuit Implementation
3.3.1 Phase Frequency Detector
3.3.2 Charge Pump
3.3.3 Loop Filter
3.3.4 Voltage-Controlled Oscillator
3.3.5 Voltage-Controlled Oscillator
3.4 PLL Parameter Design
3.5 PLL Noise Analysis and Stability

Chapter 4 Transmitter
4.1 Architecture
4.2 Pseudo Random Bit Sequence
4.3 Four-to-One Multiplexer
4.3.1 Data Pre-skew to 4:1 Multiplexer
4.3.2 Mux Delay
4.4 Clock Process Circuit
4.5 Data Driver
4.6 Transmitter Simulation Result
4.6.1 Simulation Environment
4.6.2 Simulation Result of PLL
4.6.3 Simulation Result of Transmitter

Chapter 5 Receiver
5.1 Architecture of Receiver
5.2 Slicer
5.3 Circuit Implementation
5.3.1 Type 1 receiver
5.3.2 Type 2 receiver
5.4 Receiver Simulation Result
5.4.1 Type 1 receiver
5.4.2 Type 2 receiver

Chapter 6 Conclusion and Future Work
6.1 Conclusions
6.2 Future Works

References
FIGURE CAPTIONS

Fig. 2-1 Block diagram of the basic serial link
Fig. 2-2 Reduce noise and suppress EMI effect by using the differential transmission technology
Fig. 2-3 Transmitter with different transmitter architectures: voltage-mode (a), current-mode (b), and differential (c)
Fig. 2.4 The RSDS interface configuration. The interface contains three parts: a transmitter, receivers and a balanced interconnecting medium with a termination
Fig. 2.5 Type 1 bus configuration of RSDS
Fig. 2.6 Type 2 bus configuration of RSDS
Fig. 2.7 Type 3 bus configuration of RSDS
Fig. 2.8 The RSDS interface utilized in the flat panel display systems
Fig. 2.9 Block diagram of the basic serial link
Fig. 3-1 Basic PLL Architecture
Fig. 3-2 State Machine of PFD
Fig. 3-3 PFD Implementation
Fig. 3-4 TSPC D-FF in PFD
Fig. 3-5 Dead Zone of PFD
Fig. 3-6 Simulation result of PFD without dead zone
Fig. 3-7 Schematic of the Charge Pump
Fig. 3-8 Loop Filter
Fig. 3-9 Schematic of the four-stage VCO and the delay cell
Fig. 3-10 I-V curve of the symmetric load
Fig. 3-11 Schematic of self-biased replica-feedback bias generator
Fig. 3-12 Transfer Curve of the VCO
Fig. 3-13 Schematic of differential-to-single-ended converter
Fig. 3-14 Schematic of feed forward type duty-cycle corrector and its timing diagram
Fig. 3-15 Schematic of TSPC Asynchronous Divided-by-two circuit
Fig. 3-16 Divider composed of asynchronous and synchronous counters and its timing diagram
Fig. 3-17 Linear Model of PLL
Fig. 3-18 Open Loop gain simulation of the PLL
Fig. 3-19 Simulation of the four output clock signals of the PLL
Fig. 3-20 Simulation of the eight output clock signals of the PLL
Fig. 3-21 Linear model of PLL with different noise sources
Fig. 4-1 Block diagram of Type 1 transmitter
Fig. 4-2 Block diagram of Type 2 transmitter
Fig. 4-3 Pre-skew of parallel data
Fig. 4-4 Type 1 clock (100 MHz)
Fig. 4-5 Type 2 clock (800 MHz)
Fig. 4-6 Block diagram of Pseudo Random Bit Sequence (PRBS)
Fig. 4-7 PRBS delay cell circuit
Fig. 4-8 Timing diagram of 4:1 multiplexer
Fig. 4-9 Block Diagram of 4-1 Multiplexer
Fig. 4-10 Block of 2-1 MUX
Fig. 4-11 Schematic of 2-1 MUX
Fig. 4-12 Timing diagram of Pre-skew
Fig. 4-13 Schematic of MUX delay
Fig. 4-14 TSPC delay cell
Fig. 4-15 Timing diagram of clk and data
Fig. 4-16 (a) Schematic diagram of the RSDS transmitter data driver. (b) Common mode feedback circuit
Fig. 4-17 Testing environment on board
Fig. 4-18 Eight-phase VCO clock of PLL
Fig. 4-19 Eye diagram of the transmitter output waveform
Fig. 4-20 Eye diagram of output (without data pre-skew)
Fig. 4-21 Simulation result of the transmitter output waveform
Fig. 5-1 Block diagram of Type 1 receiver
Fig. 5-2 Block diagram of Type 2 receiver
Fig. 5-3 Schematic of slicer
Fig. 5-4 Frequency response of slicer
Fig. 5-5 Hysteresis window of the slicer
Fig. 5-6 Basic PLL Architecture
Fig. 5-7 Timing diagram of data stream and clock
Fig. 5-8 Block Diagram of 1-4 De-multiplexer
Fig. 5-9 Asynchronous tree-type 1:4 de-multiplexer
Fig. 5-10 (a) 1:2 DEMUX module and (b) timing diagram
Fig. 5-11 Schematic of positive latch
Fig. 5-12 Time domain of the received signal and output of the slicer
Fig. 5-13 Timing diagram of the received data and clock, and the clock which the PLL generates for the de-multiplexer
Fig. 5-14 Control voltage of the PLL in the Type 1 receiver
Fig. 5-15 Four parallel data outputs of the Type 1 receiver
Fig. 5-16 Timing diagram of the received data and clock
Fig. 5-17 Four parallel data outputs of the Type 2 receiver
Chapter 1 Introduction

1.1 MOTIVATION

As IC fabrication technology advances, the internal clock frequency of microprocessors has reached the gigahertz range. However, unlike internal clocks, chip-to-chip and chip-to-board signaling gains little in operating frequency from increased silicon integration. As a result, some chips are now limited by their chip-to-chip communication bandwidth.

Because the performance of many digital systems is limited by the interconnection bandwidth between modules and chips, high-speed data links play a key role in the whole system.

In the last decade, high-speed I/O interfaces were achieved through massive parallelism, at the price of increased complexity and cost for the IC package and the printed circuit board (PCB). Such an approach also consumes considerable power and induces unavoidable electromagnetic interference (EMI) during signal transmission.
In order to save power, area and cost, the number of I/O pads in a system should be reduced. Parallel-based technologies therefore need to be replaced by serial-based technologies. Serial-link technology lowers the number of transmission lines, which reduces power, volume, cost and EMI. Popular applications include optical communication, USB, IEEE-1394, TMDS, PECL, LVDS and RSDS.

1.2 Thesis Organization

The thesis is organized into six chapters. Chapter 1 introduces the motivation and the organization of this thesis. Chapter 2 describes the background of this research and discusses the reduced swing differential signaling (RSDS) standard, including its detailed DC specifications and applications. Chapter 3 describes the concept and architecture of the phase-locked loop (PLL). Chapter 4 shows the whole architecture of the transmitter, including the simulation results. Chapter 5 presents the building blocks of the receiver. Chapter 6 concludes the thesis and discusses future work.
Chapter 2 Background

2.1 BASIC SERIAL-LINK TRANSCEIVER

2.1.1 ARCHITECTURE

A basic serial-link transceiver consists of a parallel-to-serial conversion circuit, a transmitter, a channel, a receiver, and a serial-to-parallel conversion circuit, as shown in Fig. 2-1. Before transmission, the data are usually arranged as a parallel stream in order to increase the bandwidth of the link, so a parallel-to-serial conversion circuit is needed ahead of the transmitter. The transmitter converts the digital information into an analog signal on the transmission medium. The medium on which the signals travel, such as a coaxial cable or a twisted-pair cable, is commonly called the communication channel. The receiver at the other end of the channel recovers the original digital information by amplifying and sampling the signal. Termination resistors that match the impedance of the channel minimize signal reflection and so improve signal quality. The clock circuit at the receiver adjusts the receiver clock, based on the received clock, so that the sampling point falls in the middle of each received data bit. A serial-to-parallel conversion circuit then converts the serial data back to N parallel bits for processing by the following digital circuits.
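The parallel-to-serial and serial-to-parallel conversions described above can be illustrated with a minimal Python sketch. The function names `serialize` and `deserialize`, the 4-bit word width and the LSB-first bit order are assumptions made for illustration; the text does not specify them.

```python
from typing import List


def serialize(words: List[int], width: int = 4) -> List[int]:
    """Flatten parallel words into one serial bit stream (LSB first, assumed)."""
    bits = []
    for word in words:
        for i in range(width):
            bits.append((word >> i) & 1)
    return bits


def deserialize(bits: List[int], width: int = 4) -> List[int]:
    """Regroup a serial bit stream into parallel words of `width` bits."""
    if len(bits) % width != 0:
        raise ValueError("bit stream length must be a multiple of the word width")
    words = []
    for start in range(0, len(bits), width):
        word = 0
        for i, bit in enumerate(bits[start:start + width]):
            word |= bit << i
        words.append(word)
    return words


if __name__ == "__main__":
    parallel_in = [0b1010, 0b0011, 0b1111]
    stream = serialize(parallel_in)
    print(stream)
    print(deserialize(stream) == parallel_in)
```

The round trip only works if both ends agree on the word width and bit order, which is why a real link fixes them in its specification.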
Fig. 2-1 Block diagram of the basic serial link.

2.1.2 HIGH-SPEED AND LOW-POWER TRANSCEIVER CIRCUITS DESIGN

A high-performance transceiver circuit must balance speed, power consumption, cost and noise, four factors that trade off against one another. On balance, a low output signal swing combined with differential data transmission is a good choice. Signals transmitted with a low voltage swing minimize power dissipation and enable operation at very high speed, and differential transmission provides adequate noise margin in practical systems even though the voltage swing is low. The debatable point of differential transmission is that it needs twice as many connectors and transmission lines. However, reliable single-ended signaling requires many ground pins (many high-speed chips provide one ground pin for every two signal pins) and runs significantly slower. As for noise, any noise coupled into both transmission lines of the signal path is rejected at node 1 and node 2 in the receiver, as shown in Fig. 2-2, owing to the common-mode rejection of the differential amplifier. In addition, with regard to EMI, differential signals tend to radiate less
EMI than single-ended signals because their magnetic fields cancel. Many different transmission technologies are currently applied to different I/O interfaces. A comparison of these technologies is shown in Table 1.2.

Fig. 2-2 Reduce noise and suppress EMI effect by using the differential transmission technology.

2.2 Signaling Circuits

The transmitter drives a HIGH or LOW analog voltage onto the channel and is designed for a particular output voltage swing based on the system specification. The design goal is to keep voltage noise and timing noise on the signal small. Two types of output driver are used: voltage-mode drivers and current-mode drivers. Voltage-mode drivers, as shown in Fig. 2-3 (a), are switches that switch the line voltage. Because the switches are implemented with transistors, the driver appears as a switched resistance. Switching the voltage fully requires a small resistance, which typically calls for a large switching device. In contrast,
current-mode drivers, as illustrated in Fig. 2-3 (b), are switched current sources. The output impedance of the driver is much higher than the line impedance, so this scheme is also called high-impedance signaling. Therefore, the transmitter bandwidth is typically not an issue even with significant output capacitance. The voltage transmitted on the line is determined by the switched current and the line impedance or an explicit load resistor. The driver can be implemented simply by biasing a MOS transistor in its saturation region. Current-mode drivers are slightly better in terms of insensitivity to power-supply noise because they have high output impedance, so the signal is tightly coupled only to VOH, the signal return path. The output current does not vary with ground noise as long as the bias signal of the current source is tightly coupled to the ground signal. The disadvantage of current-mode drivers is that, to keep the current sources in saturation, the transmitted voltage range must stay well above ground, which increases power dissipation.

Fig. 2-3 Transmitter with different transmitter architectures: voltage-mode (a), current-mode (b), and differential (c)
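As a rough numerical illustration of the two driver styles, the sketch below models a voltage-mode driver as a supply behind a switch resistance and a current-mode driver as an ideal current source, each driving a resistive load equal to the line impedance. The function names and the example values (1.8 V supply, 50 Ω line, 4 mA current) are hypothetical and are not taken from the thesis.

```python
def voltage_mode_swing(v_supply: float, r_on: float, z_line: float) -> float:
    """Voltage launched onto the line by a switch of resistance r_on.

    The switch resistance and the line impedance form a resistive divider.
    """
    return v_supply * z_line / (r_on + z_line)


def current_mode_swing(i_drive: float, z_line: float) -> float:
    """Voltage developed by a switched current source across the line impedance."""
    return i_drive * z_line


if __name__ == "__main__":
    z0 = 50.0  # ohms, assumed line impedance
    for r_on in (5.0, 50.0, 200.0):
        v = voltage_mode_swing(1.8, r_on, z0)
        print(f"voltage mode, r_on = {r_on:5.1f} ohm: {v:.3f} V")
    print(f"current mode, 4 mA into 50 ohm: {current_mode_swing(4e-3, z0):.3f} V")
```

In this simplified model a smaller switch resistance passes more of the supply voltage to the line, which is consistent with the remark that full voltage switching needs a large device, while the current-mode swing depends only on the switched current and the load.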
For better supply-noise rejection, the differential mode can be adopted, as shown in Fig. 2-3 (c), because the supply noise then appears as common-mode noise. Since the current remains roughly constant, the transmitter induces less switching noise on the supply voltage, which benefits other transmitted or received signals on the same die. To reduce reflections at the end of the transmission line, the transmitter needs to be terminated. An off-chip termination resistor can introduce significant impedance mismatch because of the package parasitic components. With current-mode drivers, an explicit on-chip resistor at the driver can act as the termination resistor. If a resistive layer is not available, a transistor in its linear region can be used as the resistor. With voltage-mode drivers, the design is slightly more complex because the switch resistance should match the line impedance Z0. This may be done either through proper sizing of the driver or by over-sizing the driver and compensating with an external series resistor, as shown in Fig. 2-3 (a).

2.3 INTRODUCTION TO RSDS

2.3.1 Scope

Reduced Swing Differential Signaling (RSDS) is a signaling standard that defines the output characteristics of a transmitter and the input characteristics of a receiver, along with the protocol for a chip-to-chip interface between flat panel timing controllers and flat panel column drivers. RSDS technology originated from LVDS technology. RSDS interfaces tend to be used in flat panel display applications with resolutions between VGA (640 × 480 pixels) and UXGA (1600 × 1200 pixels). RSDS technology provides many benefits to flat panel display applications, including the following:
Reduced bus width – enables smaller and thinner flat panel column driver boards.
Low power dissipation – extends system run time.
Low EMI generation – eliminates EMI suppression components and shielding.
High noise rejection – maintains signal image.
High throughput – enables high resolution flat panel displays.

RSDS is a differential interface with a nominal signal swing of 200 mV. It retains the many benefits of the LVDS interface, which is commonly used between the host and the panel as a high-bandwidth, robust digital interface. Because RSDS applications are within a sub-system, the signal swing is reduced further from that of LVDS to lower the power even further.

2.3.2 Electrical Specifications and Bus Configurations

Fig. 2.4 The RSDS interface configuration. The interface contains three parts: a transmitter, receivers and a balanced interconnecting medium with a termination.

An RSDS interface circuit is shown in Fig. 2.4. The interface contains three parts: a transmitter, receivers and a balanced interconnecting medium with a termination. The transmitter and receiver are defined in terms of direct electrical measurements in Table 2.1. RSDS is a versatile interface that may be configured differently depending on the requirements of the end application.

Table 2.1 Electrical specifications of RSDS transmitter and receiver.

Considerations include the location of the timing controller (TCON), the resolution and the color depth of the flat panel display. The common implementations include the following bus types:
Type 1 – Multi-drop bus with double terminations.
Type 2 – Multi-drop bus with single end termination.
Type 3 – Double multi-drop bus with single termination.

In a Type 1 configuration, the source (TCON) is located in the middle of the bus and connects to it via a short stub, as shown in Fig. 2.5. The bus is terminated at both ends with a nominal termination of 100 Ω. The interconnecting medium is a balanced coupled pair with a nominal differential impedance of 100 Ω. The number of RSDS data pairs is 9 or 12, depending on the color depth supported. In this configuration, the RSDS driver at the output of the TCON sees a DC load of 50 Ω instead of 100 Ω, because the two 100 Ω terminations appear in parallel. The output drive of the RSDS driver must therefore be adjusted to comply with the VOD specification into the 50 Ω load presented by the Type 1 configuration.

Fig. 2.5 Type 1 bus configuration of RSDS.

In a Type 2 configuration, as shown in Fig. 2.6, the source (TCON) is located at one end of the bus. The bus is terminated at the far end with a nominal termination of 100 Ω. The interconnecting medium is a balanced coupled pair with a nominal differential impedance of 100 Ω. The bus may be a single or dual bus depending on the resolution of the flat panel display. The number of RSDS data pairs is 9 or 12
depending on the color depth supported for a single bus, or 18 or 24 for a dual bus.

Fig. 2.6 Type 2 bus configuration of RSDS.

In a Type 3 configuration, the source (TCON) is located in the center of the application. Two buses leave the TCON and run to the right and left respectively. Each bus is terminated at the far end with a nominal termination of 100 Ω. The interconnecting medium is a balanced coupled pair with a nominal differential impedance of 100 Ω. The number of RSDS data pairs on each bus is 9 or 12, depending on the color depth supported. In this configuration the connection of the TCON to the main line is not a stub but part of the main line, which helps to improve signal quality, as shown in Fig. 2.7.
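The DC load seen by the driver follows from the parallel combination of the termination resistors. The sketch below reproduces the 50 Ω figure quoted for the Type 1 configuration and estimates the drive current needed for the nominal 200 mV swing. The helper names are hypothetical, and the drive-current estimate is a simple Ohm's-law assumption rather than a value taken from the RSDS specification.

```python
def parallel(*resistors: float) -> float:
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors)


def drive_current(v_od: float, r_load: float) -> float:
    """Current needed to develop a differential swing v_od across r_load."""
    return v_od / r_load


if __name__ == "__main__":
    V_OD = 0.2      # nominal RSDS swing from the text, in volts
    R_TERM = 100.0  # nominal termination from the text, in ohms

    loads = {
        "single termination (Type 2)": parallel(R_TERM),
        "double termination (Type 1)": parallel(R_TERM, R_TERM),
    }
    for name, r_load in loads.items():
        i_ma = drive_current(V_OD, r_load) * 1e3
        print(f"{name}: load = {r_load:.0f} ohm, drive = {i_ma:.1f} mA")
```

Under these assumptions the doubly terminated bus needs twice the drive current of the singly terminated bus to hold the same swing, which is why the output drive has to be adjusted for Type 1.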
Fig. 2.7 Type 3 bus configuration of RSDS.

Fig. 2.5 to Fig. 2.7 do not illustrate the complete bus; only a single RSDS pair is shown. The number of column drivers on the bus is also application specific and depends on the resolution of the flat panel display.

2.3.3 Applications

RSDS, like its predecessor LVDS, originated from the need of LCD manufacturers for an on-glass interface with higher speed, fewer interconnects, lower power, and lower EMI. As shown in Fig. 2.8, the RSDS drivers are embedded at the output of the flat panel timing controller and the RSDS input buffers are at the input of the flat panel column drivers. Since this technology also uses a low-voltage differential swing (+/- 200 mV), lower EMI and lower power consumption can be realized. Its low voltage swing (versus TTL) also allows faster clock rates, thereby enabling higher-resolution FPDs in the future. At present, clock rates of 65 MHz have been EMI qualified in pre-production TFT LCD modules
with relative ease when compared to their TTL counterparts. In the near future, clock rates in excess of 85 MHz or even 100 MHz can be expected. Since this interface is a serial interface, the overall bus width is also reduced to about half that of the conventional TTL bus architecture. In a TTL 6-bit/color dual-bus architecture, a total of 36 data lines plus 2 clock signals are required, for a total of 38 conductors. In an equivalent RSDS architecture, only one bus consisting of 9 differential data pairs plus one differential clock pair is required, for a total of 20 conductors. Implementing the same system with RSDS therefore reduces the number of bus conductors by 47 %, enabling small-outline PCBs.

Fig. 2.8 The RSDS interface utilized in the flat panel display systems.
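The conductor counts quoted above can be checked with a few lines of arithmetic. The function and variable names are illustrative; the line counts are the ones given in the text.

```python
def conductor_reduction(ttl_conductors: int, rsds_conductors: int) -> float:
    """Fractional reduction in conductor count when moving from TTL to RSDS."""
    return (ttl_conductors - rsds_conductors) / ttl_conductors


# TTL 6 bit/color dual bus: 36 single-ended data lines plus 2 clock lines.
ttl_total = 36 + 2
# RSDS: 9 differential data pairs plus 1 differential clock pair, 2 wires each.
rsds_total = (9 + 1) * 2

reduction = conductor_reduction(ttl_total, rsds_total)
print(f"TTL: {ttl_total}, RSDS: {rsds_total}, reduction: {reduction:.0%}")
```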
2.4 Basic Link Design

A general serial link is composed of three primary components: a transmitter, channels, and a receiver, as shown in Fig. 2.9. Before transmission, the data are usually arranged in parallel form in order to increase the bandwidth of the link. The transmitter has to convert the parallel data into a serial stream before the output driver drives the signals onto the channels. RSDS™ uses differential data transmission to deliver the serial data stream, and the transmitter is configured as a switched-polarity current generator. A differential load resistor at the receiver end provides current-to-voltage conversion. For operation in the Gbps range, an additional termination resistor is usually placed at the source (transmitter) end to suppress reflected waves caused by crosstalk or by imperfect termination due to package parasitics and component tolerance. Moreover, RSDS™ uses a lower voltage swing to achieve further advantages in terms of reduced crosstalk and radiated EMI. Therefore, the double termination scheme is used, and the termination resistors are integrated in the transmitter (RT-T) and in the receiver cell (RT-R) [5].

Fig. 2.9 Block diagram of the basic serial link.

After the data are transmitted onto the channels successfully, the receiver amplifies and samples the received bit stream. The clock recovery circuit restores the transmitter clock by detecting the transition edges of the received data. Finally, the receiver recovers the correct data by sampling the center of each received bit at each transition edge of the recovered clock.
(29) Chapter 3 Phase-Locked Loop

3.1 Introduction
The phase-locked loop (PLL) is an analog building block used extensively in analog, digital, and communication systems. A PLL is a feedback system that makes one signal track another. More precisely, it synchronizes an output signal with a reference (input) signal in frequency as well as in phase. The PLL has undoubtedly become an important building block in many electronic systems. This chapter introduces the architecture of the PLL. It takes a reference input clock at 100 MHz and produces an output clock at 400 MHz. By adopting four differential stages in the voltage controlled oscillator, it generates eight clock phases, which are used by the multiplexer in the transmitter and as sampling clock phases by the samplers in the receiver. In the following sections, we consider the linear model, the noise, and the stability. To support a quick PLL design, the design flow and the method of choosing the loop parameters are also described. Finally, the simulation results conclude the chapter.
(30) 3.2 Architecture of PLL
A phase-locked loop (PLL) is basically an oscillator whose phase and frequency are locked to those of the input signal. This is done with a negative feedback control loop, as shown in Fig. 3-1, which includes a phase/frequency detector (PFD), a charge pump circuit (CP), a loop filter (LF), a voltage controlled oscillator (VCO), and a frequency divider (divide by N). The PFD compares the feedback signal (Fback) from the divider output with the input reference signal (Ref) and generates the Up and Downb signals for the following charge pump circuit. Based on the Up and Downb inputs, the charge pump charges or discharges the loop filter to change the control voltage (Vctrl) of the VCO, which varies the frequency of the output signal (Clk). The loop filter is basically a low pass filter that removes the high frequency components coming from the PFD and the charge pump. In this way, the feedback control loop adjusts the frequency of the feedback signal until it equals that of the reference signal. In steady state, the frequency of the output signal is N times that of the input signal. Moreover, the input reference signal (Ref) and the feedback signal (Fback) are phase-aligned.

Fig 3-1 Basic PLL Architecture (Fref and Fback into the phase/frequency detector; up and down into the charge pump; loop filter producing the control voltage Vctrl; voltage controlled oscillator; frequency divider in the feedback path).
(31) 3.3 Circuit Implementation
3.3.1 Phase-Frequency Detector
The phase frequency detector (PFD) is a digital sequential circuit that employs tri-state operation. It is triggered by the positive clock edges of the reference (Ref) and feedback signals. Fig 3-2 shows its behavior. If the reference clock leads the feedback clock, the UP signal is set from low to high. This in turn increases the frequency of the voltage controlled oscillator output signal. When the rising edge of the feedback signal arrives, the reset signal goes high and resets the UP signal to low. In contrast, if the reference clock lags the feedback clock, the Down signal is set high until the reference signal triggers the reset signal. The Down signal is used to decrease the frequency of the voltage controlled oscillator output signal.

Fig 3-2 State Machine of PFD.
(32) This three-state operation has a linear range of ±2π radians and can act as both a phase detector and a frequency detector. This property greatly shortens the locking time. As shown in Fig. 3-3, the PFD can be implemented simply with two dynamic D flip-flops and one AND gate. The D flip-flop schematic is shown in Fig. 3-4.

Fig 3-3 PFD Implementation.

Fig 3-4 TSPC D-FF in PFD.
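The tri-state behavior described above can be illustrated with a small behavioral model. This is only a sketch of the state machine, not of the TSPC circuit: the class and method names are hypothetical, the reset is treated as instantaneous (so the dead-zone delay discussed next is not modeled), and the active-low Downb output of the real circuit is represented here as an active-high `down` flag.

```python
class TriStatePFD:
    """Behavioral tri-state PFD: states +1 (UP), 0 (idle), -1 (DOWN)."""

    def __init__(self):
        self.up = 0
        self.down = 0

    def _reset_if_both(self):
        # The AND gate resets both flip-flops once UP and DOWN are both high.
        if self.up and self.down:
            self.up = 0
            self.down = 0

    def ref_edge(self):
        """Rising edge of the reference clock sets UP."""
        self.up = 1
        self._reset_if_both()
        return self.state()

    def fb_edge(self):
        """Rising edge of the feedback clock sets DOWN."""
        self.down = 1
        self._reset_if_both()
        return self.state()

    def state(self):
        return self.up - self.down


pfd = TriStatePFD()
trace = [pfd.ref_edge(), pfd.fb_edge(), pfd.fb_edge(), pfd.ref_edge()]
print(trace)
```

When the reference leads, the detector sits in the UP state until the feedback edge arrives; two feedback edges in a row (feedback faster than reference) leave it in the DOWN state, which is what lets the same circuit act as a frequency detector.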
(33) Ideally, the PFD should be able to distinguish any phase error between the reference and feedback signals. In practice, when the phase error is too small, the reset occurs so quickly that the following charge pump circuit is not activated. This results in a dead zone (a range of undetectable phase difference). A low precision PFD has a wide dead zone, as shown in Fig. 3-5, which results in increased jitter. The dead zone is highly undesirable because it allows the VCO to accumulate as much random phase error as the phase difference with respect to the input while receiving no corrective feedback.

Fig 3-5 Dead Zone of PFD (detected phase error versus phase difference).

Equal and short duration pulses at the UP and DOWN outputs of the PFD are needed for in-phase inputs in order to eliminate the dead zone as seen by the charge pump. The dead zone can be eliminated by adding extra delay buffers in the reset path, which ensures that when the reference and feedback signals are in phase there are equal, active pulses at both outputs. The elimination of the dead zone region results in overall linear operating characteristics
(34) for the PFD, especially for input signals with a small but finite phase difference. However, inserting the delay buffers limits the maximum operating frequency, which is inversely proportional to the total reset path delay. Fig 3-6 shows the SPICE simulation result of the proposed PFD circuit.

Fig 3-6 Simulation result of PFD without dead zone.

3.3.2 Charge Pump
The schematic of the charge pump circuit is shown in Fig 3-7. The two switch devices are separated from the output node. Therefore, the output voltage is isolated from the switching noise that results from the overlap capacitance of the two switch devices. In addition, the intermediate node between the current source and the switch devices charges toward the output voltage only by the gate overdrive of the current source devices, Vgs – Vt, an amount independent of the output voltage. Moreover, since both the NMOS and PMOS current sources always turn on in each
(35) cycle, any charge injection will cancel out to first order with equal current source device sizes.

Fig 3-7 Schematic of the Charge Pump.

3.3.3 Loop Filter
The loop filter used in the charge pump PLL is shown in Fig 3-8. It is a lead network consisting of a resistor R1 in series with a capacitor C1, with a capacitor C2 in parallel. The lead network provides a pole at the origin, which gives infinite DC gain and hence zero static phase error, and a zero in the open loop response, which improves the phase margin and ensures overall stability of the loop. The transfer function of the filter (without C2) is given by
$$F(s) = R_1 + \frac{1}{sC_1} = \frac{1 + sR_1C_1}{sC_1}$$
where $\omega_z = 1/(R_1C_1)$ is the zero frequency.
(36) Capacitance C2 is used to provide higher-order roll-off, which reduces the ripple on the control voltage and mitigates frequency jumps. The total transfer function of the loop filter is
$$F(s) = \frac{1 + sR_1C_1}{s\,(C_1 + C_2)\left(1 + sR_1\dfrac{C_1C_2}{C_1 + C_2}\right)}$$
where $\omega_z = 1/(R_1C_1)$ and $\omega_p = (C_1 + C_2)/(R_1C_1C_2)$.

However, adding the capacitance C2 makes the overall PLL a third-order system and affects the stability of the loop. In general, by setting C1 > 20×C2, the third-order loop can be approximated by a second-order loop.

Fig 3-8 Loop Filter.

3.3.4 Voltage Controlled Oscillator
To obtain a low-jitter output clock, the delay buffer used in the voltage controlled oscillator (VCO) should have low sensitivity to, and high rejection of, supply and substrate noise. The basic building block of the VCO used in this thesis is the differential delay stage with symmetric loads and replica-feedback biasing. The VCO consists of a four-stage ring oscillator and a self-biased replica-feedback bias generator. Fig 3-9 shows the
(37) VCO delay cell.

Fig 3-9 Schematic of the four stages VCO and the delay cell.

As shown in Fig. 3-9, the buffer stage contains a source-coupled pair with resistive loads, each formed by a diode-connected PMOS device in shunt with an equally sized PMOS device. The control voltage, Vbp, is the bias voltage for the PMOS device. It is also used to generate the bias voltage for the NMOS current source and controls the delay of the buffer stage. In order to provide a bias current that is independent of supply and substrate noise, the bias voltage of the NMOS current source, Vbn, is continuously adjusted. Fig. 3-10 shows the I-V characteristics of the symmetric load.

Fig 3-10 I-V curve of the symmetric load.
(38) Basically, to obtain high rejection of supply and substrate noise, the load of the differential pair should have a linear I-V characteristic. In practice, this is difficult to achieve with MOS devices. However, the symmetric load cancels the first-order effect of common-mode voltage noise. Therefore, the symmetric load, though nonlinear, can be used to obtain high dynamic supply noise immunity. The control voltage, Vbp, is the bias voltage for the PMOS device. In order to provide a bias current that is independent of static supply noise, the bias voltage of the NMOS current source, Vbn, is continuously adjusted. As the supply voltage changes, the drain voltage of the NMOS current source also changes. However, the gate bias is adjusted by the replica-feedback bias generator to keep the output current constant. In effect, this raises the output resistance of the NMOS current source, so the static supply noise rejection is greatly improved. Based on the analysis of the I-V curve, it can be shown that the effective resistance of a symmetric load (Reff) is directly proportional to the small signal resistance at the ends of the swing range, which is just one over the transconductance (gm) of one of the two equally sized PMOS devices biased at Vctrl. Therefore, the buffer delay is
$$t_d = R_{eff}\,C_{eff} = \frac{1}{g_m}\,C_{eff}$$ (3-1)
where Ceff is the effective buffer output capacitance. The drain current of one of the two equally sized devices biased at Vctrl is
$$I_d = \frac{k_p}{2}\left[(V_{DD} - V_{ctrl}) - V_{tp}\right]^2$$ (3-2)
Taking the derivative with respect to Vctrl, the transconductance gm is given by
(39)
$$g_m = k_p\left[(V_{DD} - V_{ctrl}) - V_{tp}\right]$$ (3-3)
The oscillation frequency of the N-stage ring oscillator is then given by
$$f_{osc} = \frac{1}{2Nt_d} = \frac{k_p\left[(V_{DD} - V_{ctrl}) - V_{tp}\right]}{2NC_{eff}}$$ (3-4)
The gain of the VCO is given by
$$K_{vco} = \frac{df_{osc}}{dV_{ctrl}} = \frac{-k_p}{2NC_{eff}}$$ (3-5)
As a result, Kvco is independent of the buffer bias current and the VCO has first order tuning linearity. The corresponding buffer delay is
$$t_d = \frac{C_{eff}}{k_p\left[(V_{DD} - V_{ctrl}) - V_{tp}\right]}$$ (3-6)
The bias generator of the VCO delay cell is shown in Fig 3-11. It produces the output bias voltages Vbp and Vbn from the input signal Vctrl. Its primary function is to continuously adjust the VCO delay buffer bias current so as to provide the correct lower swing limit, Vctrl, for the VCO delay buffer stages. As a result, it establishes a current that is held constant and independent of the supply voltage. The bias generator consists of a PMOS source coupled differential pair, a half-buffer replica, and a control voltage buffer. The differential amplifier is actually a unity-gain buffer that forces the voltage of node Va in Fig 3-11 to equal Vctrl, a condition required for correct symmetric load swing limits, and provides the bias voltage Vbn for the NMOS current source. In addition, the bias voltage Vbn is dynamically adjusted by the differential amplifier to increase the supply noise immunity. With the half-buffer replica, the net result is that the output current of the NMOS current source is established by the load
(40) element and is independent of the supply voltage. If the supply voltage changes, the amplifier adjusts to keep the swing and the bias current constant. Because the differential amplifier uses a self-biased architecture, it has two stable states, one of which is unbiased. As a result, a start-up circuit is needed to bias the amplifier at power-up.

Fig 3-11 Schematic of self-biased replica-feedback bias generator.

Because the differential amplifier and the half-buffer replica form a two-stage negative feedback loop, the frequency response must be taken into consideration. Basically, there are two poles in the loop. One is at the amplifier output, and the other is at the half-buffer replica output. Since the pole at the amplifier output is the dominant one, it can be moved toward the origin, which increases the phase margin of the loop, by the capacitive load of the NMOS current source gates in the VCO buffer chain. Moreover, in order to track any supply and substrate noise that affects the VCO jitter performance, the bandwidth of the self-biased circuit is usually set equal to the operation frequency
(41) of the VCO. The bias circuit also provides a buffered version of the control voltage Vctrl using an extra control voltage buffer. This isolates Vctrl from capacitive coupling in the VCO buffer chain. The PLL used in this thesis needs to generate four phases for the transmitter multiplexer and for the receiver samplers. Therefore, the VCO uses four delay buffer stages with the output frequency at 400 MHz. The simulated transfer curve of the VCO is shown in Fig. 3-12. The supply voltage is 3.3 V. For Vctrl between 0.8 V and 2.4 V, the gain of the VCO is 457 MHz/V.

Fig 3-12 Transfer Curve of the VCO (TT, FF, and SS corners).

The differential oscillator output is converted to a single-ended signal with 50% duty cycle, which is used as the input to the phase-frequency detector, by the differential-to-single-ended converter shown in Fig. 3-13 and the feed forward type duty-cycle corrector shown in Fig. 3-14. The two differential amplifiers of the differential-to-single-ended converter use the same current source bias voltage, Vbn, generated by the self-biased replica-feedback bias generator of the VCO. Biased by Vbn, the circuit corrects the input common-mode voltage level and provides signal amplification.
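Equations (3-4) and (3-5) can be evaluated numerically to see the first-order tuning linearity. The sketch below is illustrative only: the values of `KP`, `C_EFF`, and `VTP` are hypothetical and are not taken from the thesis; only the four stages and the 3.3 V supply come from the text.

```python
# Hypothetical device values, chosen only for illustration.
KP = 1e-3        # PMOS transconductance parameter k_p in A/V^2 (assumed)
C_EFF = 290e-15  # effective buffer output capacitance in F (assumed)
VTP = 0.7        # PMOS threshold magnitude in V (assumed)
VDD = 3.3        # supply voltage from the text
N_STAGES = 4     # four delay stages, as in the text


def f_osc(v_ctrl):
    """Oscillation frequency from equation (3-4)."""
    return KP * ((VDD - v_ctrl) - VTP) / (2 * N_STAGES * C_EFF)


def k_vco():
    """VCO gain from equation (3-5); it does not depend on v_ctrl."""
    return -KP / (2 * N_STAGES * C_EFF)


# A finite-difference slope of (3-4) should reproduce (3-5) at any bias point.
slope = (f_osc(1.6) - f_osc(1.5)) / 0.1
print(f"slope = {slope / 1e6:.1f} MHz/V, K_vco = {k_vco() / 1e6:.1f} MHz/V")
```

Because (3-4) is linear in Vctrl, the slope is the same wherever it is evaluated, which is the sense in which the model VCO has first order tuning linearity.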
(42) Fig 3-13 Schematic of differential-to-single-ended converter.

Fig. 3-14 Schematic of feed forward type duty-cycle corrector and its timing diagram.

The duty-cycle corrector is connected after the differential-to-single-ended converter to ensure that the duty cycle of the VCO output is 50%. The signal P+, selected from the multiphase signals, turns on M1 and M2 and charges the output node clk+ of the duty-cycle corrector almost instantaneously, because the discharge path of
(43) the node clk+ is already off due to the signal P-. The signal P-, which is also selected from the multiphase signals, is the one whose rising edge is shifted by 180° in phase from that of P+. Similarly, the signal P- rapidly discharges the node clk+ and delivers the desired 50% duty-cycle signal. Since this duty-cycle correction circuit consists of only two transmission gates and two inverters, the area is minimal and the power consumption is negligible. In order to drive the next stages, digital buffers are added at the output to improve the driving ability.

3.3.5 Divider
Because the output frequency of the VCO is 400 MHz and the input reference frequency is 100 MHz, a divide-by-four circuit is used. A TSPC D flip-flop with its inverted output connected to its D input is used as a divide-by-two circuit, as shown in Fig. 3-15. In this circuit, the driving capability of the input clock must be checked to ensure correct operation. Two divide-by-two circuits are then cascaded to obtain a divide-by-four circuit. Unfortunately, an asynchronous counter accumulates jitter stage by stage. A synchronous counter is therefore used at the last stage to re-sample the clock, which eliminates the jitter accumulated in the asynchronous counter, as shown in Fig. 3-16.
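The cascade of two divide-by-two stages can be sketched behaviorally as follows. This is a logic-level illustration only: the function names are hypothetical, clocks are modeled as lists of 0/1 samples, and neither the TSPC circuit nor the jitter re-sampling of the synchronous last stage is modeled.

```python
def divide_by_two(clock):
    """Toggle flip-flop: the output toggles on every rising edge of the input."""
    q = 0
    prev = 0
    out = []
    for c in clock:
        if c == 1 and prev == 0:   # rising edge of the input clock
            q ^= 1                 # inverted output fed back to D, so Q toggles
        prev = c
        out.append(q)
    return out


def rising_edges(wave):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a == 0 and b == 1)


vco_clk = [0, 1] * 32                              # 32 cycles of the 400 MHz clock
fback = divide_by_two(divide_by_two(vco_clk))      # two cascaded stages
print(rising_edges(vco_clk), rising_edges(fback))  # 32 input cycles, 8 output cycles
```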
(44) Fig. 3-15 Schematic of TSPC Asynchronous Divided-by-two circuit.

Fig. 3-16 Divider composed of asynchronous and synchronous counters and its timing diagram.
(45) 3.4 PLL Parameter Design
Because the charge pump has switching characteristics, the PLL is, strictly speaking, a discrete-time system, and continuous-time analysis is difficult to apply exactly. However, under some conditions (typically when the loop bandwidth is much lower than the reference frequency), the s-domain model can still be used to gain a thorough understanding of the negative feedback loop. Fig. 3-17 shows the linear model of the PLL.

Fig. 3-17 Linear Model of PLL (forward path G(s): Ip/2π, F(s), 2πKvco/s; feedback path β(s): 1/N).

Assume the PLL is in the locked state. The PFD and CP have a gain of Ip/2π (A/rad), the LF has a transfer function F(s) (V/A), the VCO has a gain of Kvco (Hz/V), and the feedback factor is 1/N. The conversion gain of the VCO becomes 2πKvco/s (rad/sec-V), because phase is the integral of frequency. Based on the above definitions and the PLL linear model, the open loop gain of the PLL can be represented as
$$G(s)\,\beta(s) = \frac{\theta_{back}(s)}{\theta_{in}(s)} = \frac{I_P K_{vco} F(s)}{sN}$$ (3-7)
(46) The closed loop transfer function of the PLL is given by
$$H(s) = \frac{\theta_{out}(s)}{\theta_{in}(s)} = \frac{G(s)}{1 + G(s)\,\beta(s)} = \frac{N\,G(s)}{N + G(s)} = \frac{N K}{s + K}$$ (3-8)
Therefore, the 3-dB bandwidth is
$$\omega_{3dB} = K = \frac{I_P K_{vco} F(s)}{N}$$ (3-9)
From the analysis of the LF, we know that the shunt capacitance C2 is typically much smaller than C1. Therefore, we can neglect C2 and use the classical two-pole system and the second-order linear model of the PLL to analyze the transient response. With F(s) = R1 + (1/sC1), the closed loop transfer function can be derived as
$$H(s) = \frac{\dfrac{I_P K_{vco}}{C_1}\,(1 + sR_1C_1)}{s^2 + \dfrac{I_P K_{vco} R_1}{N}\,s + \dfrac{I_P K_{vco}}{N C_1}}$$ (3-10)
The equation above can be compared to the classical two-pole system transfer function
$$H(s) = \frac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}$$ (3-11)
Therefore, the natural frequency ωn and the damping factor ζ can be derived as
$$\omega_n = \sqrt{\frac{I_P K_{vco}}{N C_1}}$$ (3-12)
$$\zeta = \frac{\omega_n}{2\omega_z}$$ (3-13)
In the PLL design, the frequency noise of the VCO can be the dominant noise source affecting the phase noise performance. As will be seen in
(47) a later section, the noise of the VCO has a high pass characteristic. Therefore, a large loop bandwidth for the PLL feedback system is better because it enhances the tracking ability. The choice of the damping factor ζ is a trade-off between acquisition time and step response stability. If a larger ζ is chosen, the system can have a longer acquisition time. On the other hand, if a smaller ζ is chosen, the step response may ring or the system may become unstable. We then use the loop bandwidth and the phase margin to determine the component values of the loop filter. We get
$$\text{Loop BW} = \frac{I_P K_{vco}}{N}\cdot\frac{R_1C_1}{C_1 + C_2}$$ (3-14)
The phase is determined by the pole and zero of the loop filter, so the phase margin is calculated as
$$PM = \tan^{-1}\frac{BW}{\omega_z} - \tan^{-1}\frac{BW}{\omega_p}$$ (3-15)
By setting the derivative of the phase margin equal to zero, the phase margin is found to be maximum when the loop bandwidth is set to the geometric mean of the pole and zero frequencies:
$$BW = \sqrt{\omega_z\,\omega_p}$$ (3-16)
We can define a new parameter, γ, as
$$\gamma = \frac{BW}{\omega_z} = \frac{\omega_p}{BW}$$ (3-17)
The capacitance ratio of C1 and C2 can be represented by
$$\frac{C_1}{C_2} = \gamma^2 - 1$$ (3-18)
The loop bandwidth (BW) now can be written as
(48)
$$BW = \frac{I_P K_{VCO}}{N}\cdot R_1\left(1 - \frac{1}{\gamma^2}\right)$$ (3-19)
The design flow of a third-order PLL can be derived from the equations above and is summarized as follows:
1. Determine Kvco by measuring VCO test keys, by simulating the VCO used in the design, or by referring to the data sheet of the commercial VCO employed.
2. Depending on the desired noise and transient performance, determine the loop bandwidth BW. Usually, the BW is less than 1/10 of the reference clock frequency.
3. If the filter is off-chip, set Ip to around 100 μA to 1 mA. If an on-chip filter is employed, decrease Ip so that a reasonable trade-off between chip area and charge pump current is reached.
4. Determine the nominal value of N according to the target system.
5. Select the required PM specification.
6. With BW, Ip, PM, N, and Kvco determined, calculate R1.
7. Calculate the value of C1 with C1 = 1/(R1ωz).
8. Calculate the value of C2.

The parameters used in the PLL are listed in Table 3-1. Fig. 3-18 shows the open loop frequency response of the PLL. This curve gives a phase margin of approximately 70°. Fig. 3-19 shows the eight evenly spaced phases at 400 MHz. Fig. 3-20 shows the simulation of the eight output clock signals of the PLL.
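Steps 5 to 8 of the design flow can be sketched as a short calculation. The sketch assumes that BW in (3-19) is the angular crossover frequency (2π times the loop bandwidth in Hz) with Kvco expressed in Hz/V, and it uses γ = tan(45° + PM/2), which follows from (3-15) to (3-17) because tan⁻¹(1/γ) = 90° − tan⁻¹(γ). The function name is hypothetical. Under these assumptions, the values of Table 3-1 (5 MHz, 70°, 100 µA, 430 MHz/V, N = 4) give component values close to those listed there.

```python
import math


def design_loop_filter(bw_hz, pm_deg, ip, kvco_hz_per_v, n):
    """Return (R1, C1, C2) from equations (3-15) to (3-19)."""
    w_c = 2 * math.pi * bw_hz                           # crossover frequency in rad/s
    gamma = math.tan(math.radians(45 + pm_deg / 2))     # from (3-15) and (3-17)
    r1 = w_c * n / (ip * kvco_hz_per_v * (1 - 1 / gamma**2))  # (3-19) solved for R1
    w_z = w_c / gamma                                   # zero frequency, from (3-17)
    c1 = 1 / (r1 * w_z)                                 # step 7: C1 = 1 / (R1 * wz)
    c2 = c1 / (gamma**2 - 1)                            # step 8, from (3-18)
    return r1, c1, c2


r1, c1, c2 = design_loop_filter(bw_hz=5e6, pm_deg=70, ip=100e-6,
                                kvco_hz_per_v=430e6, n=4)
print(f"R1 = {r1:.0f} ohm, C1 = {c1 * 1e12:.2f} pF, C2 = {c2 * 1e12:.2f} pF")
```

The resulting ratio C1/C2 is about 31, which also satisfies the C1 > 20×C2 guideline given for the loop filter.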
(49) Table 3-1 Parameters of the PLL

Technology                    TSMC 0.35 µm 2P4M CMOS
Function                      PLL
Supply voltage                3.3 V
Input frequency               100 MHz
Output frequency              400 MHz
Charge pump current (Icp)     100 µA
Loop filter C1                59.86 pF
Loop filter R1                3 kΩ
Loop filter C2                1.92 pF
VCO gain (Kvco)               430 MHz/V
Divider (N)                   4
Loop bandwidth                5 MHz
Phase margin                  70 degrees
Power                         23 mW @ 400 MHz
(50) Fig. 3-18 Open Loop gain simulation of the PLL.

Fig. 3-19 Simulation of the four output clock signals of the PLL.
(51) Fig. 3-20 Simulation of the eight output clock signals of the PLL.

3.5 PLL Noise Analysis and Stability
Timing jitter reduces the timing margin of the transmitter and degrades the performance of the high speed serial link. The output jitter of the PLL is contributed by many different noise sources, as shown in Fig. 3-21, where θin(s) is the reference noise, in(s) is the PFD and CP noise, Vn(s) is the LF noise, and θn(s) is the VCO noise.

Fig. 3-21 Linear model of PLL with different noise sources.
(52) These noise sources introduce phase fluctuations, or timing jitter in the time domain. Using closed loop analysis, the transfer functions from the different noise sources to the output can be derived as
$$H_{in}(s) = \frac{\theta_{out}(s)}{\theta_{in}(s)} = \frac{N K}{s + K}$$
$$H_{pfd}(s) = \frac{\theta_{out}(s)}{\theta_{pfd}(s)} = 2\pi\cdot\frac{N}{I_P}\cdot\frac{K}{s + K}$$
$$H_{lf}(s) = \frac{\theta_{out}(s)}{\theta_{lf}(s)} = 2\pi\cdot\frac{K_{vco}}{s + K}$$
$$H_{vco}(s) = \frac{\theta_{out}(s)}{\theta_{vco}(s)} = \frac{s}{s + K} = 1 - \frac{H_{in}(s)}{N}$$
where
$$K = \frac{I_P K_{vco} F(s)}{N} = \frac{I_P K_{vco}}{N}\times\frac{1 + sR_1C_1}{sC_1} \quad \text{(when } C_2 \text{ is neglected)}$$
The noise transfer functions have different characteristics. Hin(s) and Hpfd(s) are low pass functions, Hlf(s) is a band pass function, and Hvco(s) is a high pass function. Based on this analysis, the loop bandwidth of the PLL should be maximized so that the loop filters out as much of the timing jitter caused by the VCO as possible. The maximum natural frequency ωn of the PLL is restricted by the input reference clock frequency ωin. The stability limit can be derived as
$$\omega_n^2 < \frac{\omega_{in}^2}{\pi\,(R_1C_1\,\omega_{in} + \pi)}$$
As a rule of thumb, stability can be assumed by keeping ωn < ωin/10. A larger loop bandwidth also means that more phase noise from the input clock is transferred to the output. However, this does not cause a problem when the input is a clean clock source.
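The low pass and high pass behavior, and the stability limit, can be checked numerically with the values of Table 3-1. The sketch below neglects C2, as in the expression for K above, takes Kvco in Hz/V, and uses hypothetical function names; it is an illustration of the formulas rather than a replacement for the circuit simulation.

```python
import math

# Values from Table 3-1 (C2 neglected).
IP, KVCO, N = 100e-6, 430e6, 4
R1, C1 = 3e3, 59.86e-12
F_REF = 100e6


def loop_k(s):
    """K = Ip*Kvco*F(s)/N with F(s) = (1 + s*R1*C1) / (s*C1)."""
    return (IP * KVCO / N) * (1 + s * R1 * C1) / (s * C1)


def h_in(f_hz):
    """Reference-noise transfer function N*K / (s + K)."""
    s = 2j * math.pi * f_hz
    k = loop_k(s)
    return N * k / (s + k)


def h_vco(f_hz):
    """VCO-noise transfer function s / (s + K)."""
    s = 2j * math.pi * f_hz
    return s / (s + loop_k(s))


for f in (1e4, 1e8):
    print(f"f = {f:.0e} Hz: |H_in| = {abs(h_in(f)):.3g}, |H_vco| = {abs(h_vco(f)):.3g}")

# Stability limit and rule of thumb from the text.
w_n = math.sqrt(IP * KVCO / (N * C1))   # natural frequency, equation (3-12)
w_in = 2 * math.pi * F_REF
stability_limit = w_in**2 / (math.pi * (R1 * C1 * w_in + math.pi))
print(w_n**2 < stability_limit, w_n < w_in / 10)
```

Well inside the loop bandwidth the reference noise is passed with gain N and the VCO noise is suppressed; well outside it the roles are reversed.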
(53) Chapter 4 Transmitter

4.1 Architecture of Transmitter

Fig. 4-1 Block diagram of type 1 Transmitter (blocks: PRBS, data process, PLL, 4:1 MUX, clock process delay, data driver with outputs TxD+ and TxD-, MUX delay, 100 MHz clock process, clock driver with outputs TxC+ and TxC-).

Fig. 4-2 Block diagram of type 2 Transmitter (same blocks, with an 800 MHz clock process).
(54) The data input comes from a PRBS (pseudo random bit sequence) generator. The data process circuit pre-skews the data before feeding them into the multiplexer. The pre-skew of the parallel data is shown in Fig. 4-3. Fig. 4-1 and Fig. 4-2 show the block diagrams of the type 1 and type 2 transmitter architectures. The differences between type 1 and type 2 are the clock process circuit and the clock process delay circuit. The type 1 transmitter sends a 100 MHz clock to the receiver, as in Fig. 4-4. The type 2 transmitter sends an 800 MHz clock to the receiver, as in Fig. 4-5.

Fig. 4-3 Pre-skew of parallel data (D0 to D3 before and after pre-skew; annotated intervals: 2.5 ns and 0.625 ns).

Fig 4-4 Type 1 clock (100 MHz), shown against the serial data D0, D1, D2, D3.
(55) Fig 4-5 Type 2 clock (800 MHz), shown against the serial data D0, D1, D2, D3.

The transmitter is built from a PRBS circuit, a PLL, a 4-to-1 multiplexer, a clock process circuit, and data and clock drivers. It uses the PLL proposed in Chapter 3 to produce clock signals at 400 MHz with eight evenly spaced phases. With a 4:1 input multiplexer, four low-speed parallel data channels are serialized on four evenly spaced phases of the 400 MHz clock, which gives a bit rate of 1.6 Gbps and reduces the frequency requirement of the timing circuits and the digital logic. Only four of the evenly spaced phases are used by the 4:1 MUX. The others are used to generate the 800 MHz clock and for data pre-skew. For testing, a pseudo random bit sequence (PRBS) generator is used to generate the data pattern. Through the data and clock drivers, the data stream is transmitted with a nominal swing of 200 mV. The following sections describe the detailed circuits of the function blocks in the transmitter architecture.
(56) 4.2 Pseudo Random Bit Sequence (PRBS)

Fig. 4-6 Block diagram of Pseudo Random Bit Sequence (PRBS).

Fig. 4-7 PRBS delay cell circuit.

The pseudo random bit sequence (PRBS), shown in Fig. 4-6, is widely used for testing communication systems. Fig. 4-7 shows the circuit implementation of the D flip-flop delay cell used in the PRBS circuit. The delay cells are connected in series, so each delay cell provides the input signal of the next one. The output of the XOR gate generates the new data bit. The pattern repeats every 2^7 - 1 = 127 clock cycles. We also note that if the
(57) initial condition is zero, the delay cells remain in a degenerate state. Therefore, a SET signal must be used to avoid this problem. The XOR logic is the speed-critical part of the circuit. The outputs are then used as the four parallel data inputs of the transmitter.

4.3 Four-to-One Multiplexer

Fig. 4-8 Timing diagram of 4:1 multiplexer.
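The PRBS generator of Section 4.2 can be sketched in software as a 7-bit linear feedback shift register. The text does not state which feedback taps are used, so the sketch assumes the common PRBS7 polynomial x^7 + x^6 + 1; the function names are hypothetical. It reproduces the 127-cycle period and shows why the all-zero state must be avoided with the SET signal.

```python
def prbs7(seed=0x7F):
    """7-bit Fibonacci LFSR; assumed taps correspond to x^7 + x^6 + 1."""
    state = seed & 0x7F
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of the two tapped cells
        state = ((state << 1) | new_bit) & 0x7F       # shift the new bit in
        yield new_bit, state


def period(seed):
    """Number of clock cycles until the register returns to its seed state."""
    gen = prbs7(seed)
    for count in range(1, 200):
        _, state = next(gen)
        if state == seed:
            return count
    return None


print(period(0x7F))   # non-zero seed: full-length sequence
print(period(0x00))   # all-zero seed: the register never leaves the zero state
```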
(58) The multiplexer is used to serialize the parallel data channels D0~D3. When the transmitter sends the data stream at 1.6 Gbps, the PLL must produce four phases at 400 MHz. It generates the required phases clk0, clk2, clk4, and clk6. The other clock phases are used to generate the 800 MHz clock for the TYPE 2 transmitter. The relationship between the input data, D0~D3, and the clocks (clk0, clk2, clk4, and clk6) is shown in Fig. 4-8. For example, in the interval between the rising edge of clk0 and the falling edge of clk6, the input signal D0 drives the multiplexer output. To realize this timing, the multiplexer shown in Fig. 4-9 is used to serialize the four parallel data channel inputs D0~D3. A high multiplexer fan-in can become the bottleneck, because the achievable speed gradually decreases as the fan-in grows. This speed limitation is not an inherent property of the process technology but of the circuit topology. Therefore, 2-1 MUXes are used, as shown in Fig. 4-10 and Fig. 4-11. The MUX delay buffer is introduced in Section 4.3.2.

Fig. 4-9 Block Diagram of 4-1 Multiplexer (inputs D0 to D3; three 2-1 MUXes clocked by clk0, clk2, clk4, and clk6; MUX delay cells).
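At the logic level, the 4:1 multiplexer interleaves one bit from each channel into every 400 MHz period. The following sketch is a purely behavioral illustration with hypothetical names; it ignores the two-level 2-1 MUX tree and all analog timing, and it assumes that D0 to D3 are sent in that order within each period.

```python
PLL_FREQ_HZ = 400e6   # PLL output frequency from the text
PHASES_USED = 4       # clk0, clk2, clk4, clk6

bit_rate = PHASES_USED * PLL_FREQ_HZ   # one bit per phase per clock period
bit_period = 1 / bit_rate


def mux_4to1(d0, d1, d2, d3):
    """Interleave four parallel bit streams into one serial stream."""
    serial = []
    for word in zip(d0, d1, d2, d3):   # one 4-bit word per 400 MHz period
        serial.extend(word)            # slot order assumed: D0, D1, D2, D3
    return serial


stream = mux_4to1([1, 0], [0, 0], [1, 1], [1, 0])
print(stream)
print(f"bit rate = {bit_rate / 1e9:.1f} Gbps, bit period = {bit_period * 1e9:.3f} ns")
```

The 0.625 ns bit period is one quarter of the 2.5 ns clock period, which matches the two intervals annotated in Fig. 4-3.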
(59) Fig. 4-10 Block of 2-1 MUX (inputs In1, In1b, In2, In2b; clocks clk, clkb).

Fig. 4-11 Schematic of 2-1 MUX (outputs out, outb).
(60) 4.3.1 Data Pre-skew to 4:1 Multiplexer

Fig. 4-12 Timing diagram of Pre-skew.

To ensure that each first-level multiplexer selects its input data while the data are stable and correct, the parallel data channels D0~D3 are pre-skewed before the multiplexer. If the transition edges of the clock and the input data occur at approximately the same time, the selected data are momentarily undefined and take some time to settle. This increases the output data jitter of the transmitter. Fig. 4-12 shows the timing diagram of the pre-skew. To achieve this, some of the input data must be shifted before they are applied to the two-level multiplexer.
(61) 4.3.2 Mux Delay
As shown in Fig. 4-1 and Fig. 4-2, the clock is transmitted along with the data pattern. The data path and the clock path must have the same delay for the forwarded clock to be a reliable timing reference. Circuits with the same architecture have the same delay, so the clock path is designed to be as long as the data path. The multiplexer is used to serialize the parallel data channels D0~D3. The data pass through a two-level MUX, and hence two stages of MUX delay buffer are added in the clock path, as shown in Fig. 4-13.

Fig. 4-13 Schematic of MUX delay.

4.4 Clock Process Circuit
Because the data pattern is usually coded, such a high clock speed is usually not needed to match the channel spectrum. In the TYPE 1 transmitter, a 100 MHz clock is transmitted, as in Fig. 4-4. In the critical case, a 1.6 Gbps data pattern consists of alternating ones and zeros (0 1 0 1 0 1 ...), which is effectively an 800 MHz clock. Therefore, 800 MHz is the fastest clock needed to accompany a 1.6 Gbps data pattern. In the TYPE 2 transmitter, an 800 MHz clock is transmitted, as in Fig. 4-5.
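The 800 MHz figure can be illustrated with a small phase-domain sketch. The thesis describes the TYPE 2 clock as the XOR of the PLL phases clk1 and clk3; the sketch below assumes ideal 50% duty-cycle square waves with phase k delayed by k × 45°, and uses hypothetical function names. It shows that the XOR output has two periods per 400 MHz cycle and that its edges fall midway between the bit boundaries set by clk0, clk2, clk4, and clk6.

```python
def phase_clock(k, angle_deg):
    """Ideal 400 MHz square wave for PLL phase k, sampled at an angle in degrees."""
    return 1 if ((angle_deg - 45 * k) % 360) < 180 else 0


def type2_clock(angle_deg):
    """XOR of clk1 and clk3."""
    return phase_clock(1, angle_deg) ^ phase_clock(3, angle_deg)


samples = [type2_clock(d) for d in range(360)]    # one 400 MHz period, 1 degree steps
edges = [d for d in range(360) if samples[d] != samples[d - 1]]
print(edges)

bit_boundaries = [0, 90, 180, 270]                # rising edges of clk0, clk2, clk4, clk6
bit_centers = [b + 45 for b in bit_boundaries]
print(edges == bit_centers)
```

Four edges per 2.5 ns period correspond to two full clock cycles, i.e. 800 MHz, which is also the fundamental of the alternating 0 1 0 1 pattern at 1.6 Gbps.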
(62) TYPE 1 Transmitter:
In the TYPE 1 transmitter, a 100 MHz clock is used to convey the phase relationship between clock and data, as in Fig. 4-4. Clk0 in Fig. 4-8 is used to generate the 100 MHz clock for the receiver. Because clk0 is at 400 MHz and the clock for the receiver is at 100 MHz, a divide-by-four circuit is used. A TSPC D flip-flop with its inverted output connected to its D input is used as a divide-by-two circuit, as shown in Fig. 3-15. Two divide-by-two circuits are cascaded to obtain a divide-by-four circuit. A synchronous counter is used at the last stage to re-sample the clock, which eliminates the jitter accumulated in the asynchronous counter, as shown in Fig. 3-16. Because the clock is delayed as it passes through the divide-by-four circuit, a matching delay is added to the data path to ensure that the phase between data and clock is correct. Fig. 4-1 shows the data path, which includes the clock process delay. Fig. 4-14 shows the TSPC D flip-flop and its delay cell.

Fig. 4-14 TSPC delay cell.
(63) TYPE 2 Transmitter:
In the TYPE 2 transmitter, the clock process circuit places the edges of the 800 MHz clock at the midpoint of each bit. It uses an XOR of clk1 and clk3 in Fig. 4-8 to generate the 800 MHz clock for the receiver, as in Fig. 4-15. The receiver overcomes device speed limitations by using both the rising and falling clock edges, as shown in Fig. 4-5, so the clock can operate at half the data rate.

Fig. 4-15 Timing diagram of clk and data (Clk1, Clk3, the resulting 800 MHz clock, and the output data stream D0 to D3).

Because the clock is delayed as it passes through the XOR gate, a matching delay is added to the data path to ensure that the phase between data and clock is correct. Fig. 4-2 shows the data path, which includes the clock process delay. An XOR gate is added to the data path to serve as this clock process delay.

4.5 Data driver
The basic receiver has a high DC input impedance, so the majority of the driver current flows across the termination resistor, generating about 200 mV across the receiver inputs. The simplified RSDS outputs consist of a current source which drives the
(64) differential pair line. When the driver switches, it changes the direction of current flow across the resistor, hence creating a valid "one" or "zero" logic state. A differential load resistor at the receiver end provides current-to-voltage conversion and optimum line matching at the same time. An additional termination resistor is usually placed at the source end to suppress reflected waves caused by crosstalk or by imperfect termination. The implemented transmitter data driver, shown in Fig. 4-16, uses the typical configuration with four MOS switches in a bridge. In order to obtain the output offset voltage required by the RSDS™ specification, a feedback loop across a replica of the transmitter circuit is often used, but in that case the effect of component mismatches between the transmitter and the replica must be carefully taken into account.

Fig. 4-16 (a) Schematic diagram of the RSDS transmitter data driver. (b) Common mode feedback circuit.

Fig. 4-16(b) shows the simple low-power common-mode feedback control that was implemented in the transmitter to achieve higher precision and lower circuit complexity. The common-mode output voltage is sensed by means of a high resistive
(65) divider RA and RB (=50kΩ) and compared with a 1.25 V reference by the differential amplifier. The fraction of the tail current Iout flowing through M1 and M2 is mirrored to MU and ML, respectively, thus forcing VCM≈1.25V. Usually, a large mirror gain (device-size ratio) is used for MU and ML so that the power consumption of the common-mode feedback circuit is negligible. To develop the correct voltage swing across the 50Ω load resistance (RT_T//RT_R), the output current must be sized accordingly.

4.6 Transmitter Simulation Result
4.6.1 Simulation Environment

Fig. 4-17 Testing environment on board (labels in the figure: RSDS TX, 5 nH, PCB model, 10 pF, 3 pF, 100 ohms).

In a real IC the die is packaged, and the package must be taken into account. After leaving the transmitter data driver, the output data passes through the internal bonding pad, the external bonding wire, and the PCB trace. The thin bonding wire is inductive, and the pad is both inductive and capacitive. Finally, the output signal arrives at the receiver termination resistor RT_R. In simulation, the package effects are added at the vdd, gnd, and I/O nodes. The output loading of the data driver is also included. The simulation environment is shown in Fig. 4-17. The following sections present the simulation results in turn.
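As a rough numerical check of the driver sizing discussed in Section 4.5, the sketch below computes the output current needed for the stated swing and the voltage seen by the resistive common-mode sensor. It assumes RT_T = RT_R = 100 Ω (consistent with the 50 Ω parallel load and the 100 ohm terminations in Fig. 4-17) and an ideal current-steering driver. The function names and the example single-ended output levels are illustrative and are not taken from the design.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def sensed_common_mode(v_p, v_n, r_a, r_b):
    """Mid-point voltage of a divider with R_A tied to v_p and R_B tied to v_n."""
    return (v_p * r_b + v_n * r_a) / (r_a + r_b)


RT_T = 100.0      # source-end termination in ohms (assumed value)
RT_R = 100.0      # receiver-end termination in ohms (assumed value)
V_SWING = 0.200   # target differential swing in volts ("about 200 mV")
R_A = R_B = 50e3  # common-mode sense resistors in ohms (from the text)

r_load = parallel(RT_T, RT_R)   # effective load seen by the driver
i_out = V_SWING / r_load        # current the driver must steer

# Hypothetical output levels: 200 mV apart, centred on 1.25 V.
v_cm = sensed_common_mode(1.35, 1.15, R_A, R_B)

print(f"load = {r_load:.1f} ohm, Iout = {i_out * 1e3:.2f} mA, VCM = {v_cm:.3f} V")
```

Under these assumptions the driver has to steer about 4 mA, and with equal sense resistors the divider returns the average of the two outputs, which is the quantity the feedback loop compares with the 1.25 V reference.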
(66) 4.6.2 Simulation Result of PLL
Fig. 4-18 shows the waveform of the eight-phase 400 MHz clock signal of the PLL. Fig. 4-19 shows the eye diagram of the clock generated by the PLL. The jitter is about 33 ps.

Fig. 4-18 Eight-phase VCO clock of PLL.
Fig. 4-19 Eye diagram of the transmitter output waveform.
(67) 4.6.3 Simulation Result of Transmitter
Without Data Pre-skew
Fig. 4-20 shows the simulation result with the two-level multiplexer. The width of the eye diagram is about 538 ps with 87 ps jitter.

Fig. 4-20 Eye diagram of output (without data pre-skew); eye width 538 ps.

With Data Pre-skew
With the pre-skew circuit, we avoid the condition in which the clock edge falls on a data transition. This opens the eye diagram further. Fig. 4-21 shows the simulation result. The amplitude of the data eye is increased to about 200 mV, and the eye width is about 547 ps with 78 ps jitter. Fig. 4-21 shows the waveform of the proposed transmitter outputs.

Fig. 4-21 Eye diagram of output (with data pre-skew); eye width 547 ps, amplitude ±200 mV.
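To put the reported eye widths in context, the following sketch expresses them as a fraction of the unit interval (UI) at 1.6 Gbps. The eye-width and jitter figures are the ones quoted above. The helper name and the dictionary layout are only illustrative.

```python
BIT_RATE_BPS = 1.6e9
UI_PS = 1e12 / BIT_RATE_BPS  # one bit period in picoseconds


def eye_fraction(width_ps):
    """Eye width expressed as a fraction of one unit interval."""
    return width_ps / UI_PS


# (eye width in ps, jitter in ps) as reported in the simulation results
cases = {
    "without pre-skew": (538, 87),
    "with pre-skew": (547, 78),
}

for name, (width_ps, jitter_ps) in cases.items():
    print(f"{name}: eye = {eye_fraction(width_ps):.1%} of UI, "
          f"eye + jitter = {width_ps + jitter_ps} ps")
```

In both cases the eye width and the jitter add up to the 625 ps bit period, so the roughly 9 ps of extra opening with pre-skew corresponds directly to the reduction in jitter.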
(68) Fig. 4-21 Simulation result of the transmitter output waveform.
(69) Chapter 5 Receiver

5.1 Architecture of Receiver
This chapter presents the receiver design. Fig. 5-1 and Fig. 5-2 show the block diagrams of the Type 1 and Type 2 receiver architectures. The purpose of the receiver is to recover the original data from the received signal by amplifying and sampling it. The de-multiplexer then converts the recovered serial data into four parallel data streams. The Type 1 receiver receives the 100 MHz clock shown in Fig. 4-4, and the Type 2 receiver receives the 800 MHz clock shown in Fig. 4-5. The Type 1 receiver includes the PLL proposed in Chapter 3, which produces 400 MHz clock signals with eight evenly spaced phases. By using a 1:4 de-multiplexer to parallelize the 1.6 Gbps data into four lower-speed parallel channels, we reduce the frequency requirement of the timing circuits and the digital logic. Only four evenly spaced phases are used for the 4:1 MUX; the others are used for transferring the 800 MHz clock.
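The timing budget implied by this architecture can be worked out with a few lines of arithmetic. The sketch below uses the frequencies given in the text and assumes that the four sampling phases are every second phase of the eight-phase clock, which the text does not state explicitly. All variable names are illustrative.

```python
F_REF_HZ = 100e6     # clock received by the Type 1 receiver
DIVIDE_RATIO = 4     # divide-by-four in the PLL feedback path
N_PHASES = 8         # evenly spaced PLL output phases
BIT_RATE_BPS = 1.6e9
DEMUX_WAYS = 4       # 1:4 de-multiplexer

f_vco_hz = F_REF_HZ * DIVIDE_RATIO           # PLL output frequency
period_ps = 1e12 / f_vco_hz                  # PLL output clock period
phase_step_ps = period_ps / N_PHASES         # spacing of adjacent phases
sample_step_ps = 2 * phase_step_ps           # assumed: every second phase samples
ui_ps = 1e12 / BIT_RATE_BPS                  # one bit period
lane_rate_bps = BIT_RATE_BPS / DEMUX_WAYS    # rate of each parallel channel

print(f"f_vco = {f_vco_hz / 1e6:.0f} MHz, period = {period_ps:.0f} ps")
print(f"phase step = {phase_step_ps} ps, sampling step = {sample_step_ps} ps")
print(f"UI = {ui_ps} ps, lane rate = {lane_rate_bps / 1e6:.0f} Mbps")
```

With these numbers, four phases spaced 625 ps apart give exactly one sampling edge per bit, and each parallel channel runs at 400 Mbps, which matches the 400 MHz clock.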
(70) Fig. 5-1 Block diagram of Type 1 receiver (blocks: slicers on TxD+/TxD- and TxC+/TxC-, PLL, divider, delay, 1:4 DEMUX; signals: 1.6 Gbps data, 100 MHz clk, 400 MHz clk, retimed data).
Fig. 5-2 Block diagram of Type 2 receiver (blocks: slicers on TxD+/TxD- and TxC+/TxC-, 1:4 DEMUX; signals: 1.6 Gbps data, 800 MHz clk, retimed data).
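The 800 MHz clock received by the Type 2 receiver is described in Chapter 4 as the XOR of two 400 MHz phases, and the receiver samples on both of its edges. The behavioral sketch below illustrates why that works. It assumes ideal 50% duty-cycle square waves and a quarter-period (625 ps) offset between the two phases. The actual offset between clk1 and clk3 is not stated in this section, and the function names are hypothetical.

```python
PERIOD_PS = 2500  # one 400 MHz clock period
SKEW_PS = 625     # assumed quarter-period offset between the two phases
STEP_PS = 25      # simulation time step


def square(t_ps, delay_ps=0):
    """Ideal 50% duty-cycle square wave (0 or 1) delayed by delay_ps."""
    return 1 if (t_ps - delay_ps) % PERIOD_PS < PERIOD_PS // 2 else 0


def xor_clock(t_ps):
    """XOR of two equal-frequency phases that are SKEW_PS apart."""
    return square(t_ps) ^ square(t_ps, SKEW_PS)


times = list(range(0, 4 * PERIOD_PS, STEP_PS))
wave = [xor_clock(t) for t in times]

# Times at which the XOR output changes, and the subset that are rising edges.
edges = [t for t, prev, cur in zip(times[1:], wave, wave[1:]) if prev != cur]
rising = [t for t, prev, cur in zip(times[1:], wave, wave[1:]) if (prev, cur) == (0, 1)]

rising_spacing = {b - a for a, b in zip(rising, rising[1:])}
edge_spacing = {b - a for a, b in zip(edges, edges[1:])}
print("rising-edge spacing (ps):", rising_spacing)
print("any-edge spacing (ps):", edge_spacing)
```

Rising edges repeat every 1250 ps (800 MHz), while edges of either polarity repeat every 625 ps, which is one bit period at 1.6 Gbps. Sampling on both edges therefore yields one sample per bit with a clock at half the data rate.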
(71) 5.2 Slicer
Fig. 5-3 shows the schematic of the slicer. The differential data is distorted when it enters the receiver chip because of the resonance between the inductance and capacitance of the bonding wire and pad. The slicer plays a key role in sensing the received signals, whether system clock or input data stream; input sensitivity, symmetry, and bandwidth are therefore the major concerns. It is an open-loop comparator in the receiver circuit. To meet the common-mode voltage range, the circuit is implemented with PMOS input differential pairs biased by a constant current source, with NMOS cross-coupled pairs as the load.

Fig. 5-3 Schematic of slicer.

The gain and bandwidth of the slicer must be carefully designed, because the slicer has to detect received signals that are noisy and limited in swing and amplify them to nearly full-swing CMOS levels at its output. Moreover, the offset voltage of the slicer also affects the
(72) correct operation of the receiver. The offset voltage arises not only from mismatches in the input devices but also from mismatches (both device and capacitance mismatch) within the positive-feedback structure. These errors are referred back to the input as the input-offset voltage. Fig. 5-4 shows the frequency response of the slicer, and Fig. 5-5 shows its hysteresis window. The advantage of this hysteresis comparator is its noise immunity. The sliced data or clock stream is then sent to the following PLL or de-multiplexer to obtain the data value.

Fig. 5-4 Frequency response of slicer.
Fig. 5-5 Hysteresis window of the slicer.
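The noise immunity of a hysteresis comparator can be illustrated with a small behavioral model. The sketch below is not the slicer circuit of Fig. 5-3; the ±25 mV thresholds and the input samples are hypothetical values chosen only to show the effect, and the function names are illustrative.

```python
def hysteresis_slicer(samples_mv, v_high_mv, v_low_mv, initial=0):
    """Comparator with hysteresis: the output goes to 1 only above v_high_mv
    and returns to 0 only below v_low_mv; in between it holds its state."""
    state = initial
    out = []
    for v in samples_mv:
        if state == 0 and v > v_high_mv:
            state = 1
        elif state == 1 and v < v_low_mv:
            state = 0
        out.append(state)
    return out


def plain_slicer(samples_mv):
    """Comparator without hysteresis (single threshold at 0 mV)."""
    return [1 if v > 0 else 0 for v in samples_mv]


# Hypothetical differential input in mV with small noise around zero.
vin_mv = [-100, -20, 10, 40, 10, -10, -40, 5, 60]

print(hysteresis_slicer(vin_mv, v_high_mv=25, v_low_mv=-25))
print(plain_slicer(vin_mv))
```

In this example the plain comparator toggles on the small excursions of +10 mV, -10 mV, and +5 mV, while the version with hysteresis ignores them and only switches once the input clearly crosses a threshold.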
(73) 5.3 Circuit Implementation
5.3.1 Type 1 receiver
Fig. 5-1 shows the block diagram of the Type 1 receiver. When the Type 1 receiver receives the 100 MHz clock from the Type 1 transmitter described in Chapter 4, the PLL proposed in Chapter 3 must produce four 400 MHz clock phases: clk0, clk1, clk2, and clk3. The relationship between the output data D0~D3 and the clocks (clk0, clk1, clk2, and clk3) is shown in Fig. 5-7. For example, in the interval between the rising edge of clk0 and the falling edge of clk3, the input signal D0 starts driving the de-multiplexer output.

Fig. 5-6 Basic PLL architecture (blocks: phase/frequency detector with inputs Fref and Fback and outputs up/down, charge pump, loop filter producing the control voltage Vctrl, voltage-controlled oscillator, frequency divider).
(74) Fig. 5-7 Timing diagram of data stream and clock.

As shown in Fig. 5-6, the VCO clock passes through the divide-by-four divider and is compared with the reference clock Fref. Hence the phases of the generated clocks lead the phase of the data. Because Fref (the 100 MHz clock from the transmitter) is used to convey the phase relationship between clock and data, a delay is added to the clock path to ensure that the phase between data and clock is correct. Fig. 5-1 shows the clock path, which includes the delay circuit. Fig. 4-14 shows the TSPC D flip-flop circuit and its delay cell. To implement this scheme, the de-multiplexer shown in Fig. 5-8 parallelizes the 1.6 Gbps serial data stream into four parallel data channels D0~D3. A high de-multiplexer fan-out may become the bottleneck and gradually reduce the achievable speed.
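At the behavioral level, the 1:4 de-multiplexing described above amounts to distributing consecutive serial bits across four channels in round-robin order. The sketch below models only that data mapping, not the TSPC flip-flop circuit of Fig. 5-8. The function names and the example bit pattern are illustrative, and the assignment of the first bit to D0 is an assumption.

```python
def demux_1_to_4(serial_bits):
    """Split a serial bit list into four channels D0..D3 (round-robin)."""
    return [serial_bits[i::4] for i in range(4)]


def mux_4_to_1(lanes):
    """Inverse operation: interleave four equal-length channels."""
    return [bit for group in zip(*lanes) for bit in group]


serial = [1, 0, 1, 1, 0, 0, 1, 0]
d0, d1, d2, d3 = demux_1_to_4(serial)
print("D0 =", d0, "D1 =", d1, "D2 =", d2, "D3 =", d3)

# Re-interleaving the four channels recovers the original stream.
assert mux_4_to_1([d0, d1, d2, d3]) == serial
```

Each channel carries every fourth bit, so at a 1.6 Gbps input rate the logic after the de-multiplexer only has to run at 400 Mbps per channel.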