
Chapter 4 0.4V Fully Integrated Process Invariant Temperature Sensor

4.4 Simulation and Experimental Results

To verify the effectiveness and capabilities of the proposed temperature sensor with enhanced process variation immunity, it was designed with full-custom EDA tools and fabricated in a TSMC general-purpose 65-nm one-poly ten-metal (1P10M) CMOS process. The impact of process and voltage variations on the proposed temperature sensor is also evaluated in this section. The area of the proposed sensor core is only 55μm × 18μm without I/O pads, as shown in Figure 4.6. The proposed process invariant temperature sensor is composed of a near-threshold ring oscillator, a sub-threshold ring oscillator, a fixed pulse width generator, counters, and a control unit. A double guard ring surrounds the near-threshold and sub-threshold ring oscillators to shield them from interference from other circuits, at the cost of a slight increase in area. Figure 4.5(a) shows that the digital output TS[10:0] remains almost the same across corners in post-layout simulation. The error over 0˚C~100˚C is within -2.8˚C~+3.0˚C, as shown in Figure 4.5(b), which demonstrates good process immunity for the proposed sensor. The effective resolutions of all test chips are spread over 0.25˚C.


Figure 4.5 (a) Digital output of sensor in post-layout simulation. (b) Simulated output error for 0˚C~100˚C.

[Figure 4.6 labels: 940μm × 940μm chip; 55μm × 18μm sensor core containing the fixed pulse width generator, near-threshold RO, sub-threshold RO, control unit, and counters]

Figure 4.6 Microphotograph of proposed process invariant temperature sensor

The proposed sensor shares I/O pads with other designs within the 0.94mm × 0.94mm chip. For convenient measurement, a PCB was designed as shown in Figure 4.7. Several regulator circuits are included to filter bouncing noise. In addition, a jumper and a switch select between the DC-DC converter and the temperature sensor. An SMA terminal is used to receive the START signal, because the sampling frequency is higher than under normal conditions.

Figure 4.7 PCB board design.

The measurement environment was set up as shown in Figure 4.8. Before each test chip was measured, the programmable temperature and humidity chamber EZ040-72001 was set to 0˚C and left for one hour so that the chamber temperature could stabilize. For the 0˚C measurement, the CLK signal for the control unit of the test chip was generated by an 8116A pulse/function generator. Meanwhile, the START signal was issued to reset the test chip and activate the sensor conversion.

After the counters of the test chip complete one operation, the RDY signal is asserted by the control unit of the test chip. The 11-bit digital output TS signals were then recorded by a 16900A logic analyzer. It is worth noting that the test chips were not fully packaged, so the bare die is visible, as shown in Figure 4.9. This setup helps stabilize the core temperature of the test chips during measurement. The sensor was measured in 5˚C steps over the 0˚C~100˚C temperature range. A heating slope of 0.5˚C/min was set to raise the chamber temperature smoothly. Each temperature measurement was recorded after the desired temperature point had been held for 10 minutes.
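The sweep described above can be sketched as a small schedule calculation. This is only an illustration: the function name and its parameters are hypothetical, and it assumes that the 10-minute hold applies at every setpoint (including 0˚C) in addition to the initial one-hour soak.

```python
def sweep_schedule(t_start=0.0, t_stop=100.0, step=5.0,
                   ramp_c_per_min=0.5, hold_min=10.0, soak_min=60.0):
    """Return (setpoints in deg C, estimated total minutes) for one chip.

    Assumption: every setpoint is held for hold_min before recording,
    and the chamber ramps linearly between setpoints.
    """
    n_steps = int(round((t_stop - t_start) / step))
    setpoints = [t_start + i * step for i in range(n_steps + 1)]
    ramp_min = n_steps * (step / ramp_c_per_min)   # time spent ramping
    total_min = soak_min + ramp_min + len(setpoints) * hold_min
    return setpoints, total_min


setpoints, total_min = sweep_schedule()
print(len(setpoints), "setpoints,", round(total_min / 60.0, 2), "hours per chip")
```

Under these assumptions one chip takes 21 setpoints and roughly 470 minutes, which gives a feel for the cost of measuring 12 test chips.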

[Figure 4.8 labels: test chip, digital oscilloscope, function generator, pattern generator, programmable power supply, logic analyzer, temperature controller, programmable temperature and humidity chamber; signals CLK, START, VDD (4), RDY, TS (11)]

Figure 4.8 Measurement environment for the test chips.


Figure 4.9 Bare die of the test chip on PCB board.

The supply voltage for the test chips is 0.4V. The measurement errors are -1.81˚C~+1.52˚C for 12 test chips after one-point calibration, as shown in Figure 4.10.

To ease chip realization, one-point calibration was performed offline by linear curve fitting using the digital outputs at 80˚C. The corresponding 3σ inaccuracy is -2.79˚C~+2.78˚C. The average effective resolution of the test chips is measured to be 0.49˚C/LSB. The average power consumption is 520nW at a 0.4V supply voltage and a conversion rate of 45k samples/sec. The measurement results of the 12 test chips, shown in Figure 4.11, exhibit excellent linearity and demonstrate the ability of the proposed sensor to suppress the effect of process variation. To reveal the effect of voltage variation, the corresponding measurement errors are plotted in Figure 4.12 for 0.36V~0.44V (±10% supply voltage variation). The inaccuracy of temperature measurement under voltage variation is within -6˚C~+8˚C.
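One possible reading of the offline one-point calibration is sketched below. It is an assumption, not the exact procedure used for the test chips: a common slope (here the reported average resolution of 0.49˚C/LSB) is shared by all chips, and only the offset is fitted from each chip's digital output at 80˚C. The function name and the example code value 1150 are hypothetical.

```python
def one_point_calibration(code_at_cal, t_cal=80.0, c_per_lsb=0.49):
    """Build a code-to-temperature mapping from a single calibration point.

    Assumption: the slope c_per_lsb is the same for every chip; only the
    offset is chip-specific and is fixed by the reading at t_cal.
    """
    offset = t_cal - c_per_lsb * code_at_cal

    def to_celsius(code):
        return c_per_lsb * code + offset

    return to_celsius


# Hypothetical chip whose digital output at 80 deg C is 1150.
to_celsius = one_point_calibration(code_at_cal=1150)
print(round(to_celsius(1150), 2), round(to_celsius(1160), 2))
```

By construction the calibrated chip reads exactly 80˚C at the calibration code, and any residual error at other temperatures comes from slope mismatch and nonlinearity.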

[Figure 4.10 axes: Temperature (°C), 0 to 100, horizontal; Error (°C), -4 to 4, vertical]

Figure 4.10 Measured error curves for 12 test chips.

[Figure 4.11 axes: Temperature, 0 to 100, horizontal; 11-bit Digital Output, 700 to 1300, vertical; annotations: w/o Near-th TSRO @ FF corner, w/o Near-th TSRO @ SS corner]

Figure 4.11 Measured result curves for 12 test chips.


Figure 4.12 Measurement error curves for voltage variations.

In Table 4.1, the performance of the proposed ultra-low voltage process invariant frequency-domain temperature sensor is compared with that of recent temperature sensors. Thanks to its ultra-low voltage operation, the proposed sensor achieves an extremely low energy per conversion of only 11.6pJ/sample.
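The energy figure follows from the reported average power and conversion rate (520nW at 45k samples/sec). A minimal check, with a hypothetical helper name:

```python
def energy_per_sample_j(power_w, samples_per_s):
    """Energy per conversion in joules: average power divided by sample rate."""
    return power_w / samples_per_s


e_j = energy_per_sample_j(520e-9, 45e3)
print(f"{e_j * 1e12:.2f} pJ/sample")  # about 11.56, quoted as 11.6 in the text
```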

Table 4.1 Performance Comparison of Recent Temperature Sensors.

Sensor Technology Power Conv. Rate


4.5 Summary

A process invariant frequency-domain temperature sensor has been presented to enable on-chip temperature measurement. The sensor was designed to achieve ultra-low voltage operation. It is composed of two temperature-sensitive ring oscillators (TSROs). One operates in the near-threshold region (Near-TSRO) and serves as the clock source of the proposed fixed pulse width generator. The other operates in the sub-threshold region (SB-TSRO) and serves as the clock source of the digital output counter.

With a 2-input AND circuit, the digital output of the proposed temperature sensor is proportional to the ratio of the SB-TSRO frequency to the Near-TSRO frequency, fo2/fo1. Because the conduction currents in the near-threshold and sub-threshold regions differ, the effect of process variation on the proposed sensor can be greatly suppressed. Meanwhile, fo2/fo1 increases linearly with temperature.
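The ratio-based readout can be illustrated with a simple behavioral model. It is a sketch under stated assumptions, not the fabricated circuit: the pulse window is taken as a fixed number of Near-TSRO cycles (`window_cycles`, a hypothetical parameter), the counter counts SB-TSRO edges while the AND gate is open, and the frequencies are made-up values. The second call models the idealized case in which a process shift scales both oscillators by the same factor.

```python
def digital_output(f_sub_hz, f_near_hz, window_cycles):
    """Count SB-TSRO cycles inside a window of window_cycles Near-TSRO cycles.

    The result is floor(window_cycles * f_sub / f_near), so it depends only
    on the frequency ratio fo2/fo1, not on the absolute frequencies.
    """
    window_s = window_cycles / f_near_hz      # fixed pulse width
    return int(f_sub_hz * window_s)           # gated counter value


nominal = digital_output(2.0e6, 10.0e6, 4096)          # hypothetical frequencies
shifted = digital_output(2.0e6 * 1.3, 10.0e6 * 1.3, 4096)  # common 30% speed-up
print(nominal, shifted)
```

In this idealized model a common-mode frequency shift cancels in the ratio; how closely real near-threshold and sub-threshold oscillators track each other across corners is what the measurements in this chapter evaluate.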

The implementation in TSMC general-purpose 65nm CMOS technology meets the target of operating at a 0.4V supply voltage over the temperature range of 0˚C to 100˚C. The area of the sensor core (without I/O pads) is only 990μm². The energy per conversion is 11.6pJ/sample, which is a hundredfold improvement over previous work [4.4], [4.6]. These characteristics make the proposed sensor especially suitable for energy-limited miniature portable platforms.

Chapter 5

Temperature-Aware DRAM Refresh Controller in TSV 3D-IC

5.1 Introduction

Through-silicon via (TSV) technology has emerged as a promising solution for building 3D stacked devices. In this technology, vertical interconnects are formed through the wafer to enable communication among the stacked chips [5.1], [5.2]. Other wafer-level processing technologies can also form 3D structures, including the single-crystal Si layer stacking method [5.3], [5.4]. TSV technology is believed to have the potential to open up many new horizons in the semiconductor industry in the near future, because it provides many benefits including high density, high bandwidth, low power, and small form factor [5.5], [5.6]. Also, as technology scaling nears its limit, TSV is believed to be a promising way to overcome that limit.

Another possible application is the "logic+memory" combination, where a single memory or multiple memories are stacked directly on top of a logic chip [5.7], [5.8]. Here, the logic chip and the memory can communicate through thousands of I/Os, allowing high bandwidth at low power. Heterogeneous integrated circuits and 3D logic chip applications are also expected to emerge in the future. In the former application, TSVs are used to interconnect logic, memory, analog, RF, sensor, and MEMS chips, among others. In the latter, a logic chip itself, such as a CPU, can be built three-dimensionally [5.9]. Figure 5.1 is a conceptual schematic of a hyper-integrated 3D-IC combined with a contemporary flip chip package and heat sink technology.

Figure 5.1 3D circuit architecture connected to a conventional heat removal device [5.16].

However, for a multi-level 3D-IC, the high level of integration introduces thermal and self-heating problems as a result of the increased power density. Although the power consumption of a die within a 3D-IC is expected to decrease because of the shorter interconnects, removing heat from a 3D-IC is much more difficult than from a 2D-IC. The reason is that the die of a 2D-IC is surrounded by cooling material, whereas a die within a 3D-IC may be surrounded by other dies that also generate heat. Therefore, the thermal issue of a 3D-IC is much more severe than that of a 2D-IC. As a result, the circuits in a 3D-IC must operate adaptively according to the thermal condition of each layer.

This chapter proposes a temperature-aware refresh controller for the dynamic random access memory (DRAM) in an inner layer of a 3D-IC. Previous work on DRAM refresh mechanisms is also discussed. To analyze the data retention time accurately, a 1Kb DRAM block is built in a TSMC 65nm CMOS process. In addition, the process invariant frequency-domain temperature sensor proposed in Chapter 4 is used to measure the DRAM block temperature and to adapt the refresh frequency, providing DRAM thermal monitoring and power consumption control.

The rest of this chapter is organized as follows. The thermal issues and solutions in 3D DRAM are discussed in Section 5.2. In Section 5.3, the system architecture of heterogeneous 3D integration is built up. Next, the temperature-aware refresh controller of the DRAM layer in a 3D-IC is proposed in Section 5.4. Simulation results of the proposed architecture are given in Section 5.5. Finally, Section 5.6 concludes this chapter.

5.2 Thermal Issues and Solutions in 3D-IC and DRAM Refresh

5.2.1 Thermal Issues in 3D-IC

To study the thermal impact of hot spot size and power density on 3D stack design, thermal finite element simulations were performed in [5.10]. Two simulation setups were used. The fine-grain simulation of [5.11] takes into account the complete back-end-of-line (BEOL) and layout structure, whereas the simplified models in the FEM simulation of [5.12] use volume-averaged material properties. These finite element simulations were calibrated with a test structure that consists of heaters integrated with thermal sensors (diodes). Heaters with sizes of 50×50 μm² and 100×100 μm² are located in the metal 2 layer of the BEOL in the top tier of the 3D chip stack, as well as in a 2D reference die. In both the top and the bottom die of the stack, a set of five diodes at different distances from the hot spot centre is integrated below the heater.

This configuration of diodes allows the local temperature peak due to the hot spot power dissipation to be captured. The simulation results and experimental validation [5.13] (Figure 5.2) indicate that power dissipation in a 3D stacked structure causes a higher maximum temperature increase than in the 2D reference case (approximately three times higher for the 100×100 μm² hot spot in Figure 5.2), so thermal-aware floor-planning is required to avoid thermal problems in the stack.

Figure 5.2 Temperature increase on the top die in a 3D chip-stack caused by a 100×100 μm² hot spot is approximately three times higher than the temperature increase in a 2D SoC chip [5.10].

To implement thermal-aware floor-planning in 3D stacks, a thermal compact model has been developed [5.14]. With this model, the temperature distribution in each die is calculated using the power maps of the heat generation in each tier as input. The compact model allows the thermal interaction of heat sources in the 3D stack to be studied, both on the same die and on other levels of the stack. Furthermore, it allows the placement of the heat sources to be thermally optimized as a function of the geometrical and material properties of the interface and interconnect structures. Figure 5.3 shows the graphical interface of this thermal compact model.

Figure 5.3 Graphical interface of the thermal compact model for 3D stacked structures [5.10].

5.2.2 The 3D-IC with Interlayer Cooling

In CMOSAIC [5.15], a multi-disciplinary team will jointly conduct experimental research, develop the necessary modeling tools, simulate 3D-IC stacks and test various prototype stacks to develop practical methods for heat removal in high performance 3D-ICs.

Figure 5.4 depicts a simplified schematic diagram of a 3D-IC with the chips assembled on top of each other and with vertical TSVs between layers. Microchannel cooling elements are etched into the lower face of each chip to remove the heat dissipated locally by each chip. Two different types of coolant will be evaluated for heat removal: a single-phase water-based nano-fluid and an environmentally friendly, two-phase evaporating refrigerant. The temperatures within the 3D-IC system have to remain below 90°C during operation to avoid damage to the chip. The objective of the coolant is to keep the chip's temperature at or below this value while dissipating heat fluxes of up to 100-150 W/cm² per layer, with a target inlet coolant temperature of 30-40°C.

Figure 5.4 Scheme of 3D-IC stack with microchannel [5.15].

Figure 5.5 summarizes the overall objective: to build a 3D-IC having more than three high power-density logic layers, with channels etched on the backside of the chips between the TSVs that provide very large heat transfer coefficients for the removal of 100-150 W/cm² per layer between 15×15 mm² chips. The 3D-IC is embedded in a silicon case that provides the manifold structure for fluid input and output and that also allows external contact to a carrier using conventional C4 flip chip bonding.
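To put the target in perspective, the heat flux can be converted into the heat that each layer must shed. The sketch below simply multiplies the quoted flux range by the 15×15 mm² chip area, assuming the flux applies uniformly over the whole chip; the helper name is hypothetical.

```python
def layer_heat_w(flux_w_per_cm2, side_mm=15.0):
    """Heat removed from one layer, assuming uniform flux over a square chip."""
    area_cm2 = (side_mm / 10.0) ** 2   # 15 mm x 15 mm = 2.25 cm^2
    return flux_w_per_cm2 * area_cm2


for flux in (100.0, 150.0):
    print(flux, "W/cm^2 ->", layer_heat_w(flux), "W per layer")
```

Under that assumption each layer dissipates roughly 225 W to 337.5 W, which indicates why interlayer cooling is considered for such stacks.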

The challenges in building such a system are huge and diverse. They require development of the TSV etching and plating processes, the channel etching processes, the bonding processes between the layers, the sealing methods, single-phase and two-phase compatible channel network designs, the integration of the chip stacks into a sealed case, the connection to the carrier, and a fluid delivery system.

Figure 5.5 3D-IC with TSVs and inter-layer cooling channels that is enclosed in a sealed manifold [5.15].

On the other hand, the cooling performance of a 3D IC with microchannels fabricated between two silicon layers, using deep reactive ion etching and wafer bonding techniques, is analyzed by simulation in [5.16]. Figure 5.6 illustrates four different 3D stack schemes for a given flow direction. To simulate the nonuniform power distributions of practical 3D ICs, the device is divided into logic circuitry and memory, where 90% of the total power is dissipated by the logic and 10% by the memory. This work assumes that the heat generation represents the power dissipated by the junctions and by interconnect Joule heating. For case (a), the logic circuit occupies the whole of device layer 1, while the memory is on device layer 2. In the other cases, each layer is equally divided into memory and logic circuitry. For case (b), the high heat generation area is located near the inlet of the channels, while for case (c) it is near the exit of the channels. Case (d) has a combined thermal condition in which layer 1 has high heat flux and layer 2 has low heat dissipation near the inlet. The total circuit area is 4 cm², and the total power generation is 150 W.
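The power split above implies quite different heat fluxes for the logic and memory regions. The following sketch works out the numbers, assuming that the 4 cm² total circuit area is shared equally between logic and memory (as the caption of Figure 5.6 suggests) and that power is uniform within each region; the function and key names are hypothetical.

```python
def block_fluxes(total_power_w=150.0, total_area_cm2=4.0, logic_share=0.9):
    """Split total power between logic and memory and compute heat fluxes.

    Assumption: logic and memory each occupy half of the total circuit area.
    """
    logic_w = total_power_w * logic_share
    memory_w = total_power_w - logic_w
    half_area = total_area_cm2 / 2.0
    return {
        "logic_w": logic_w,
        "memory_w": memory_w,
        "logic_w_per_cm2": logic_w / half_area,
        "memory_w_per_cm2": memory_w / half_area,
        "average_w_per_cm2": total_power_w / total_area_cm2,
    }


for name, value in block_fluxes().items():
    print(f"{name}: {value:.1f}")
```

With these assumptions the logic region dissipates 135 W at about 67.5 W/cm², nine times the memory region's 7.5 W/cm², which is the nonuniformity that cases (b) to (d) rearrange relative to the coolant flow.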

Figure 5.6 Two-layer 3D circuit layouts for evaluating the performance of microchannel cooling. The areas occupied by memory and logic are the same and the logic dissipates 90% of the total power consumption [5.16].

Figure 5.7 compares the thermal performance of the microchannels and conventional heat sinks and plots the predicted junction temperature distributions along the flow direction. In the case of Figure 5.7(a), the heat generation from each layer is uniform and the junction temperature profile with the conventional heat sink is symmetric.

Microchannel cooling characteristically produces a nonuniform temperature distribution, even under a uniform heating condition. The temperature increases along the channel in the liquid-phase region due to sensible heating, and decreases in the two-phase region due to the decrease of the fluid saturation pressure along the channel.

The junction temperature peaks at the onset of boiling because of the dramatic change in the convective heat transfer coefficient from the liquid-phase region to the two-phase region. With microchannels, the temperature difference between layers is reduced by more than 10°C because of the small thermal resistance of direct heat removal from the layers.

In cases (b) and (c), the conventional fin heat sink yields identical junction temperature distributions. With microchannels, however, the temperature distributions differ considerably, because convective cooling depends on the flow direction. In both cases, the conventional heat sink gives highly nonuniform junction temperatures, with differences of about 25°C and 45°C for layer 1 and layer 2, respectively, due to the concentrated heat flux. With microchannels, if more heat is applied to the upstream region, boiling occurs earlier, which increases the pressure drop in the channel. Thus case (c) has a lower pressure drop, a lower average junction temperature, and a more uniform temperature field than case (b). In case (c), water is gradually heated in the upstream region, where the lower power dissipation is located, and the downstream water boils and absorbs heat from the higher-power region with low thermal resistance. Since the two-phase region in case (c) is shorter than that in case (b), the pressure drop is smaller and the overall junction temperature is lower.

An interesting result for case (c) is that the junction temperature distribution is quite uniform even with highly nonuniform power dissipation, which is one of the powerful merits of the two-phase microchannel cooling.

In case (d), the microchannel heat sink has almost the same pressure drop (26.3 kPa) as in case (a). In both cases, the flow receives an identical wall heat rate from the silicon wall to the fluid, and the channel fluid temperature profiles are almost identical. The junction temperature is determined by the heat flux and the convective thermal resistance from the wall to the fluid. Layer 1 has a high temperature hump near the inlet due to the high heat flux and the low convective heat transfer coefficient in the single-phase region. The highest temperature in layer 2 is lower than that in layer 1, because convective cooling depends on the flow direction and two-phase convective heat transfer is high. Except for the temperature hump of layer 1, the overall temperature profile with a microchannel heat sink is more uniform than that with the conventional fin heat sink. In all cases with conventional cooling, the temperature of layer 2 is always higher than that of layer 1 because of its larger thermal resistance to the environment.

Figure 5.7 Comparison of junction temperatures in a two-layer stacked circuit for the cases of an integrated microchannel heat sink and a conventional heat sink. The total flow rate of the liquid water is 15 ml/min and the mass flux is 1.36 × 10⁻⁵ kg/s [5.16].

5.2.3 Previous Works of DRAM Refresh Control

In low-power DRAMs, the thermometer is used only during the self-refresh mode, when the self-refresh current is very small, so a single thermometer at any location can safely be used. Another concern is that the thermometer consumes a large current, including dc current in the analog circuits. This problem is solved by the control scheme shown in Figure 5.8 [5.17]. Figure 5.8(a) shows a self-refresh enable signal generating the burst refresh signal shown in Figure 5.8(b). The burst refresh signal starts 8K refresh cycles with a 1-μs refresh period (T1), shown in Figure 5.8(d), and the last refresh cycle stops the burst operation. The burst refresh operation at the beginning of the self-refresh mode is required to initialize all the cell data to Vdd and Vss, so that the cell refresh characteristics no longer depend on the previous data, which are largely lost through noisy read and write operations. When the burst refresh operation is finished, the thermometer is turned on and measures the temperature. The refresh operation is then executed with the refresh period T2 determined by the measured temperature.

The thermometer is turned on again when 8K refresh cycles are finished, and the process continues until the self-refresh mode ends. Since the temperature does not change abruptly, the non-temperature-measurement period T3, which is less than one second in this design, is short enough to follow temperature variation with an error much smaller than 1°C. In summary, even though the thermometer draws as much as 2.4 mA during the temperature measurement period, the average current is less than 1 μA, since only one measurement cycle of 32 μs (T4) is executed during the entire 8K refresh cycles.
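The duty-cycling argument can be checked with a short calculation. This is a sketch under stated assumptions: "8K" is read as 8192 cycles, the measurement current is 2.4 mA, and the units of the measurement time T4 and of the average-current bound are read as 32 μs and 1 μA. The helper names are hypothetical.

```python
def average_current_a(i_peak_a, t_meas_s, period_s):
    """Average current of a block that is on for t_meas_s once per period_s."""
    return i_peak_a * t_meas_s / period_s


def min_period_for_bound_s(i_peak_a, t_meas_s, i_avg_max_a):
    """Shortest repetition period that keeps the average below i_avg_max_a."""
    return i_peak_a * t_meas_s / i_avg_max_a


burst_s = 8192 * 1e-6                        # 8K refresh cycles at T1 = 1 us
period_s = min_period_for_bound_s(2.4e-3, 32e-6, 1e-6)
print(f"burst refresh lasts {burst_s * 1e3:.3f} ms")
print(f"average < 1 uA needs a period above {period_s * 1e3:.1f} ms")
```

Under these readings the initial burst takes about 8.2 ms, and any measurement interval longer than roughly 77 ms keeps the thermometer's average current below the stated bound.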

Figure 5.8 Self-refresh and thermometer control scheme [5.17].
