Thesis Organization - Design and Implementation of a Near-/Sub-Threshold SRAM-Based FIFO Memory for Wireless Body Area Networks

Chapter 1 Introduction

1.4 Thesis Organization

The remainder of this thesis is organized as follows. Chapter 2 reviews low-power SRAM, from a basic introduction to detailed circuit design methodologies, including SRAM architecture, power reduction, SRAM stability, the limitations of conventional low-voltage SRAM, and previous well-known low-power SRAM designs. Chapter 3 presents a robust 10T near-/sub-threshold SRAM bit-cell, covering its SNM improvement, write-ability improvement, and minimum operating voltage (V_min). Chapter 4 presents an energy-efficient low-power SRAM-based FIFO memory for WBAN applications. In Chapter 5, a FIFO memory with built-in row-control dynamic voltage scaling is implemented for WBAN applications. Finally, Chapter 6 concludes this thesis.

Chapter 2 Previous Low-Power SRAM Designs

2.1 Introduction

Compared with generic logic, SRAM is simultaneously constrained by the need for very high density, low leakage, high performance, and long-term data retention. The power consumption of SRAM is predicted to make it a key concern in severely energy-constrained applications such as wireless implantable biomedical devices. As shown in Fig. 2.1, the embedded SRAM consumes 69% of the total processor power.

Fig. 2.1 Three low-power applications culminating in SRAM consuming 69% of the chip power [2.1]

This chapter begins with an overview of SRAM operation in section 2.2. Section 2.3 analyzes the power dissipation of SRAM circuits and presents techniques for leakage reduction. Section 2.4 defines the stability issues of the SRAM cell, including hold stability, read stability, and write ability, and presents the impact of variation on SRAM at low voltage. Sections 2.5 and 2.6 describe previous SRAM cell designs and peripheral circuit techniques.

2.2 Overview of SRAM Operation

Fig. 2.2 shows a typical SRAM organization. The storage element is built from Z blocks, each an N-row by M-bit array. The row decoder decodes the X address bits and selects the appropriate word-line. The Y address bits are decoded by the column decoder, which selects the appropriate column. Sense amplifiers amplify the bit-line swing for data sensing. The read/write circuitry controls the read/write timing.

Fig. 2.2 SRAM organization [2.2]
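The address decoding described above can be sketched in a few lines of Python. This is only an illustrative model: the bit layout (block, then X row bits, then Y column bits), the function names `split_address` and `one_hot`, and the array dimensions are assumptions made for the example and are not taken from Fig. 2.2.

```python
def split_address(addr, block_bits, row_bits, col_bits):
    """Split a flat address into (block, row, column) indices.

    Assumed bit layout, most significant first: block | row (X) | column (Y).
    """
    total_bits = block_bits + row_bits + col_bits
    if not 0 <= addr < (1 << total_bits):
        raise ValueError("address out of range")
    col = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    block = addr >> (col_bits + row_bits)
    return block, row, col


def one_hot(index, width):
    """Model a decoder output: exactly one of `width` select lines is high."""
    return [1 if i == index else 0 for i in range(width)]


# Hypothetical organization: 4 blocks, 128 rows, 8 column groups.
block, row, col = split_address(0b10_0000101_011,
                                block_bits=2, row_bits=7, col_bits=3)
word_lines = one_hot(row, 1 << 7)   # row decoder output
column_select = one_hot(col, 1 << 3)  # column decoder output
```

Only one word-line and one column-select line are active for any address, which is the property the row and column decoders provide.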

2.2.1 SRAM Column Circuitry

[Figure: single-port SRAM column schematic with precharge circuit, write signal DIN, word-line WL, precharge signal PRE, bit-lines BL/BLB, and read signal DOUT; inset: waveform of read/write operation]

Fig. 2.3 A single-port SRAM column configuration example

Fig. 2.3 shows a single-port SRAM column configuration. The precharge circuit is composed of two precharge pMOS transistors and one equalizer. It precharges the bit-line pair high and equalizes the pair before a read/write operation. Each column contains a write driver for writing input data and a sense amplifier for detecting the sensed data. The write driver applies complementary voltage levels to the bit-line pair during a write operation. The sense amplifier, a common latch-type design, amplifies the differential signal on the bit-line pair. As soon as the sense amplifier is activated, its cross-coupled inverter pair latches the read data through regenerative feedback.

2.2.2 Conventional Symmetric 6T SRAM Bit-Cell

Fig. 2.4 shows the schematic of the conventional symmetric 6-transistor (6T) SRAM bit-cell. The bit-cell consists of two cross-coupled inverters (PL, NL, PR, and NR) and two pass transistors, AXL and AXR, which provide read/write access to the bit-cell. A single word-line (WL) controls the connection between the bit-lines (BL and BLb) and the cross-coupled inverters by turning AXL and AXR on or off.

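[Figure: 6T bit-cell schematic with bit-lines BL and BLb, word-line WL, storage nodes VL and VR, and transistors AXL, PL, NL, PR, NR, AXR]

Device     Function
NL/NR      Driver
PL/PR      Load
AXL/AXR    Access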

Fig. 2.4 Conventional symmetric 6T SRAM bit-cell circuit

Fig. 2.5 shows the Thincell layout of the conventional symmetric 6-transistor (6T) SRAM bit-cell. The SRAM bit-cell layout can be optimized to minimize variability by converting the Thincell active pattern to a straight-line layout, which eliminates jogs and corners and thereby increases reliance on metal interconnect, as shown in Fig. 2.6. This facilitates lithography and reduces sensitivity to overlay errors, which improves mismatch and critical-dimension control and thus increases the stability of the SRAM bit-cell.

[Figure: 6T bit-cell layout showing NWELL, transistors AXL, NL, PL, PR, NR, AXR, and the BL, BLb, WL, VDD, and GND connections]

Fig. 2.5 Conventional symmetric 6T SRAM bit-cell layout

Fig. 2.6 (a) Thincell and (b) straight line layout (c) SNM comparison [2.3]

For a read operation, BL and BLb are precharged high. Then WL is activated, and one of the bit-lines is pulled down by the cell. For example, in Fig. 2.7, with VL=0 and VR=1, BL is pulled down through transistors AXL-NL while BLb stays high. A differential signal is thus generated on the bit-line pair, and the sense amplifier at the read output detects this small signal and converts it into a full-swing voltage.

[Figure: 6T bit-cell during read, with BL=floating, WL=1, BLb=floating, VL=0, VR=1; discharge path through AXL]

Fig. 2.7 Read example of 6T SRAM cell.

For a write operation, one bit-line is driven high and the other low. Then WL is turned on, and the data on the bit-lines overpowers the cell content with the new value. For example, in Fig. 2.8, with VL=0, VR=1, BL=1, and BLb=0, VL rises high and VR is forced low.

[Figure: 6T bit-cell during write, with BL=1, WL=1, BLb=0, VL=0→1, VR=1→0]

Fig. 2.8 Write example of 6T SRAM cell.
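The read and write examples of Fig. 2.7 and Fig. 2.8 can be replayed with a logic-level model. The sketch below is an idealization written for illustration only: the class name `SixTCell` and its methods are hypothetical, and analog effects such as read disturb, bit-line swing, and sensing are deliberately ignored.

```python
class SixTCell:
    """Logic-level model of a 6T bit-cell with storage nodes VL and VR."""

    def __init__(self, vl=0):
        self.vl = vl          # left storage node
        self.vr = 1 - vl      # right node always holds the complement

    def read(self):
        """Precharge both bit-lines high, assert WL, return (BL, BLb)."""
        bl, blb = 1, 1        # precharged state
        if self.vl == 0:
            bl = 0            # BL discharged through AXL-NL
        else:
            blb = 0           # BLb discharged through AXR-NR
        return bl, blb

    def write(self, bl, blb):
        """Drive complementary bit-lines, assert WL, overwrite the cell."""
        if bl == blb:
            raise ValueError("bit-lines must be complementary for a write")
        self.vl, self.vr = bl, blb


cell = SixTCell(vl=0)         # VL=0, VR=1 as in Fig. 2.7
read_before = cell.read()     # BL pulled low, BLb stays high
cell.write(bl=1, blb=0)       # Fig. 2.8: BL=1, BLb=0
read_after = cell.read()
```

In this ideal model every write succeeds and every read is non-destructive; the failure mechanisms discussed in section 2.4 are exactly the cases where a real cell departs from this behavior.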

2.3 Power Dissipation

This section begins with an analysis of the power dissipation of CMOS circuits and then covers circuit techniques for reducing it. Total power dissipation is the sum of dynamic power (P_dynamic), leakage power (P_leakage), and short-circuit power (P_short-circuit), and can be expressed as

$$ P_{total} = P_{dynamic} + P_{leakage} + P_{short\text{-}circuit} \tag{2.1} $$

where P_dynamic = αC_L V_DD² f, P_leakage = V_DD I_leakage, and P_short-circuit = I_mean V_DD. According to these expressions, dynamic power dissipation is proportional to the square of the supply voltage, while leakage power and short-circuit power are both proportional to the supply voltage.

2.3.1 Dynamic Power

Dynamic power is caused by the logic transitions of CMOS circuits, which charge or discharge the load and parasitic capacitance (C_L). As can be seen from the terms of (2.1), the dynamic power dissipation is directly proportional to the switching activity factor (α), the load capacitance (C_L), the square of the supply voltage (V_DD²), and the operating frequency (f).
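The quadratic dependence on supply voltage can be checked numerically. The following minimal sketch evaluates P_dynamic = αC_L V_DD² f; the function name and all numeric values (activity factor, load capacitance, frequency) are hypothetical and chosen only to make the scaling visible.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """P_dynamic = alpha * C_L * V_DD^2 * f, in watts."""
    return alpha * c_load * vdd ** 2 * freq


# Hypothetical operating point: alpha = 0.5, C_L = 1 pF, f = 100 MHz.
p_nominal = dynamic_power(alpha=0.5, c_load=1e-12, vdd=1.0, freq=100e6)
p_scaled = dynamic_power(alpha=0.5, c_load=1e-12, vdd=0.5, freq=100e6)
ratio = p_scaled / p_nominal   # halving V_DD quarters the dynamic power
```

With everything else held constant, halving V_DD reduces the dynamic power to one quarter, which is why supply voltage scaling is so attractive for low-power SRAM.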

2.3.2 Leakage Power

[Figure: cross-section of an nMOS transistor (gate, source, drain, n+ regions, P well) annotated with subthreshold, punchthrough, GIDL, reverse-bias diode, and gate oxide tunneling leakage paths]

Fig. 2.9 Leakage current of deep-submicron transistors

Leakage power is a significant portion of the total power consumption in modern ICs. Integrated circuits contain vast numbers of logic gates that are not actively switching yet still consume power because of leakage currents. Fig. 2.9 shows reverse-biased junction leakage, subthreshold leakage, gate direct-tunneling leakage, injection of hot carriers from the substrate to the gate oxide, gate-induced drain leakage, and punchthrough leakage in a deeply scaled transistor [2.4] [2.5].

Junction Leakage

Leakage in reverse biased transistors includes the effects of carrier generation, related to residual damage density and location relative to the junction boundary, as well as structure and bias dependent effects of gate oxide leakage, band-to-band tunneling at the drain junction. Junction leakage current depends on the area of the drain diffusion and the leakage current density. For low power CMOS technology, high channel and halo doping greatly increase junction leakage.

Gate-induced drain leakage (GIDL) occurs under a high electric field between the drain and gate terminals. Thinner oxide, higher supply voltage, and lightly doped drain structures increase the GIDL effect.

Subthreshold Leakage

When the gate voltage is below the threshold voltage, sub-threshold leakage, or weak-inversion current, flows between source and drain. For example, in an off-state inverter, although the V_gs of the nMOS is 0 V, a small leakage current still flows from drain to source because V_ds equals V_DD.

Sub-threshold behavior can be modeled physically as follows [2.6]:

$$ I_{ds} = \mu \frac{W}{L} \left(\frac{kT}{q}\right)^{2} C_{sth}\, e^{\frac{V_g - V_T + \eta V_{ds}}{m kT/q}} \left(1 - e^{-\frac{V_{ds}}{kT/q}}\right), \qquad m = 1 + \frac{C_{sth}}{C_{ox}} \tag{2.2} $$

where W and L denote the transistor width and length, μ denotes the carrier mobility, C_sth = C_dep + C_it denotes the sum of the depletion-region capacitance and the interface-trap capacitance, both per unit area of the MOS gate, η is the drain-induced barrier lowering (DIBL) coefficient, and C_ox denotes the gate input capacitance per unit area of the MOS gate.

Sub-threshold leakage increases exponentially as the threshold voltage is reduced, and DIBL lowers the threshold further, making the leakage even worse. Conversely, sub-threshold leakage can be reduced by increasing the threshold voltage. In low-power technology, high-V_th transistors can be used to reduce sub-threshold leakage in the off state.
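The exponential sensitivity to threshold voltage can be illustrated by evaluating the sub-threshold current model of (2.2) directly. The sketch below is a straightforward transcription of that expression; the function name and every device parameter (mobility, W/L, capacitances, DIBL coefficient, threshold voltages) are hypothetical values chosen for illustration and do not describe any particular process.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def subthreshold_current(vg, vds, vt, *, mu, w_over_l, c_sth, c_ox, eta,
                         temp=300.0):
    """Evaluate the sub-threshold drain current of (2.2), in amperes."""
    v_thermal = K_B * temp / Q_E          # kT/q
    m = 1.0 + c_sth / c_ox
    prefactor = mu * w_over_l * v_thermal ** 2 * c_sth
    gate_term = math.exp((vg - vt + eta * vds) / (m * v_thermal))
    drain_term = 1.0 - math.exp(-vds / v_thermal)
    return prefactor * gate_term * drain_term


# Hypothetical device: SI units (m^2/(V*s) for mobility, F/m^2 for capacitance).
params = dict(mu=0.03, w_over_l=2.0, c_sth=5e-3, c_ox=1.5e-2, eta=0.1)

# Off-state leakage (V_g = 0, V_ds = 1 V) for a low and a high threshold.
i_low_vt = subthreshold_current(0.0, 1.0, 0.3, **params)
i_high_vt = subthreshold_current(0.0, 1.0, 0.4, **params)
leakage_ratio = i_low_vt / i_high_vt
```

For these assumed parameters, raising the threshold voltage by 100 mV lowers the off-state leakage by a factor of exp(0.1 / (m kT/q)), roughly 18 at 300 K, which is the motivation for using high-V_th devices in standby-critical paths.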

Gate Direct Tunneling Leakage

Ultra-thin gate oxide is used for effective gate control in deeply scaled CMOS technology. However, the high electric field across the thin gate oxide results in direct tunneling of electrons from substrate to gate and from gate to substrate through the gate oxide [2.7]. As seen in Fig. 2.10, the components of the tunneling current can be classified into three categories: edge direct tunneling leakage, gate-to-channel leakage, and gate-to-substrate leakage.

Fig. 2.10 Gate direct tunneling leakage [2.7]

New device structures and materials, such as high-k metal gate [2.8], double-gate devices [2.9], and FinFET [2.10], alleviate gate tunneling leakage by as much as an order of magnitude.

Punchthrough Leakage

Finally, in short-channel devices, due to the proximity of the drain and the source, the depletion regions at the drain-substrate and source-substrate junctions extend into the channel. As the channel length is reduced, if the doping is kept constant, the separation between the depletion region boundaries decreases. An increase in the reverse bias across the junctions (with increase in V_DS) also pushes the junctions nearer to each other. As the combination of channel length and reverse bias leads to the merging of the depletion regions, punchthrough leakage occurs.

2.3.3 Short-Circuit Power

Short-circuit power (P_short-circuit = I_mean V_DD) is due to the nonzero rise and fall times of input waveforms, during which a direct-path current flows from the power supply to ground while a static CMOS gate switches. I_mean is the mean value of the short-circuit current. Assuming a symmetrical inverter and using the simple MOS formula, short-circuit power is modeled as [2.11]

 

$$ P_{short\text{-}circuit} = \frac{\beta}{12} \left(V_{DD} - 2V_t\right)^{3} \tau f \tag{2.3} $$

where β denotes the gain factor of a transistor, f denotes the operating frequency, and τ is the input rise/fall time.
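As a numerical illustration of (2.3), the sketch below evaluates the short-circuit power model for a hypothetical inverter. The function name and all parameter values are assumptions made for the example. The clamp to zero for V_DD ≤ 2V_t is also an assumption of this sketch: in the simple square-law picture, the nMOS and pMOS cannot conduct simultaneously in that range, so the cubic expression is not applied there.

```python
def short_circuit_power(beta, vdd, vt, tau, freq):
    """P_short-circuit = beta/12 * (V_DD - 2*V_t)^3 * tau * f, in watts.

    Returns 0.0 when V_DD <= 2*V_t (assumed: no simultaneous conduction
    in the simple MOS model used to derive the expression).
    """
    overdrive = vdd - 2.0 * vt
    if overdrive <= 0.0:
        return 0.0
    return beta / 12.0 * overdrive ** 3 * tau * freq


# Hypothetical inverter: beta = 100 uA/V^2, V_t = 0.3 V,
# 100 ps input edges, 100 MHz operation.
p_sc_nominal = short_circuit_power(beta=1e-4, vdd=1.2, vt=0.3,
                                   tau=1e-10, freq=1e8)
p_sc_low_vdd = short_circuit_power(beta=1e-4, vdd=0.5, vt=0.3,
                                   tau=1e-10, freq=1e8)
```

The cubic dependence on (V_DD − 2V_t) means this component shrinks very quickly as the supply approaches twice the threshold voltage, which is one reason it becomes negligible in near- and sub-threshold operation.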

2.3.4 Low power SRAM design technology

For low-power systems, the power-delay trade-off alone is not sufficient to achieve the desired power consumption. Generally, such systems do not require high performance. Hence, other methods are used to reduce power dissipation. These low-power techniques aim to reduce the three components above as far as possible.

Dynamic Power Reduction

Several SRAM design strategies reduce dynamic power consumption. Firstly, [2.12] [2.13] utilize hierarchical bit-lines, with short local bit-lines to reduce the bit-line load (C_L) and with global bit-lines that typically reset to logic low to reduce the switching activity factor (α). Secondly, in a thin-cell layout (Fig. 2.5), the vertical dimension is determined by the poly pitch while the lateral dimension is determined by the device sizing. In general, the SRAM bit-cell area is dominated by the contact and diffusion spacing. Various industrial minimum-sized 6T bit-cell layouts reveal that only 30%-35% of the lateral dimension is used for the contact and diffusion spacing [2.14] [2.15]. Because the vertical dimension along the bit-line is unchanged, the bit-line capacitance (C_L) stays at its minimum (2 poly-pitch) when the bit-cell is upsized. Thirdly, in some SRAM designs the cell supply (V_CS) of the SRAM bit-cells and critical peripheral circuits is higher than that of the other peripheral circuits, so that the dynamic power of the non-critical peripheral circuits can be reduced.

Fourthly, [2.13] presents a single-ended write-bit-line structure, which reduces the switching activity factor (α) to less than “0.5” and so diminishes the dynamic write power, since most bits in caches are logic low. Fifthly, [2.16] shows a low-power SRAM using bit-line charge recycling (CR-SRAM) for the write operation. The differential voltage swing of a bit-line is obtained from charge recycled from its adjacent bit-line capacitance instead of from the power line. If we assume that the number of CR bit-line pairs is N, that all bit-lines have the same capacitance, and that i = 1, 2, …, N, the voltage in the bit-line pair becomes *^{2 −2𝑖}₂ +V_DD. The N bit-line pairs consume the power of ( ) C_L V_DD² per clock cycle instead of N C_L V_DD², which significantly reduces write power.

Leakage Power Reduction

Although reducing the threshold voltage achieves higher drive current and hence better speed, the cost is significantly increased standby power. Hence, to suppress power consumption in low-voltage circuits, it is necessary to reduce the leakage power.

Transistor stacking is effective in reducing leakage [2.17], so [2.18] uses a power-gating structure (transistor stacking) to reduce the leakage current of sleeping or shut-off SRAM cells. In [2.19], the gates of the word-line driver transistors are left floating, and their voltage level is discharged by junction leakage in standby, thus reducing gate leakage. In [2.20], the read/write bit-line is left floating in sleep mode without being precharged, and a bit-line leakage compensation scheme provides a compensating pull-up current to the read bit-line during read operation, which minimizes the leakage current on the bit-line. The 7T SRAM cell in [2.21] uses a multiple-V_t structure in which high-V_t devices are used to reduce leakage current. In addition, [2.22] employs a dynamic-V_t technique to increase the pull-down current while reducing leakage during standby.

Among the low-power strategies mentioned above, supply voltage scaling is the most effective way to reduce the total power consumption. [2.23] [2.24] present low-power SRAMs that maintain high performance at supply voltages below 1 V (0.7 V/0.5 V supply). The energy-efficient SRAM in [2.25] lowers the supply voltage to the data retention voltage (DRV) in standby mode so that the leakage power savings are maximized.

Sub-threshold voltage operation has been proven to minimize energy per operation for logic. Therefore, ultra-low-voltage SRAM operating in the sub-threshold region is recommended. However, variation and current ratio become more critical at ultra-low voltage. Different topologies and certain peripheral assist circuits [2.26] - [2.37] are used to address these challenges; details are discussed in section 2.6. Although these sub-threshold SRAMs achieve very low power consumption, they sacrifice operating frequency to do so. An SRAM designed to operate in both the sub-threshold and super-threshold regions is presented in [2.38]. Reconfigurable circuit assists are used to address ultra-low-voltage challenges and to minimize their adverse effect at high-voltage operation.

2.4 SRAM Bit-Cell Stability

Reliability has always been a major concern for the SRAM bit-cell. As technology scales down, process, voltage, and temperature (PVT) variations become harder to ignore, particularly at ultra-low supply voltage. Therefore, accurate estimation of SRAM data storage stability in the pre-silicon design stage and verification of SRAM stability in the post-silicon testing stage are important steps in SRAM design and test flows. The remainder of this section presents the most widely adopted definitions of SRAM cell stability.

2.4.1 Static Noise Margin (SNM)

The most common method of measuring the stability of SRAM cells is the hold/read static noise margin (SNM). The hold static noise margin is defined as the maximum static DC voltage noise that the SRAM bit-cell can tolerate without flipping the storage node while the word-line is off. Fig. 2.11 shows the standard setup for modeling hold SNM. DC noise sources V_N are introduced at each of the internal nodes in the bit-cell.

Fig. 2.12, on the other hand, shows the standard setup for modeling read SNM. The word-line (WL) turns on for read access, and the bit-line (BL) and bit-line-bar (BLb) are set to V_DD to indicate that the bit-lines are initially precharged high. A higher read SNM implies higher noise tolerance of the SRAM bit-cell during read operation. Fig. 2.11 and Fig. 2.12 also show example butterfly curves during hold and read, revealing the degradation in SNM during read.

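[Figure: hold SNM test setup (WL=0, BL=VDD, BLb=VDD, storage nodes VL and VR) and butterfly curve with axes VL (V) and VR (V), each from 0 to VDD]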

Fig. 2.11 Standard setup for finding Hold SNM

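[Figure: read SNM test setup (WL=VDD, BL=VDD, BLb=VDD, storage nodes VL and VR) and butterfly curve with axes VL (V) and VR (V), each from 0 to VDD]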

Fig. 2.12 Standard setup for finding Read SNM
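The SNM read from a butterfly curve is the side of the largest square that fits inside the smaller of its two lobes. The sketch below computes this numerically by scanning 45-degree lines across each lobe, since the diagonal of the largest square lies on such a line. It is an illustration only: the logistic inverter transfer curve, its gain, and the raised output-low level `v_low` used to mimic read disturb are all hypothetical modeling assumptions, not circuit simulation results.

```python
import math


def bisect_root(func, lo, hi, iters=60):
    """Root of a strictly decreasing function with func(lo) > 0 > func(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def static_noise_margin(vtc_left, vtc_right, vdd, steps=400):
    """Side of the largest square in the smaller butterfly lobe.

    Curve A is VR = vtc_left(VL); curve B is VL = vtc_right(VR).
    Both transfer curves must be strictly decreasing and stay in [0, vdd].
    Returns 0.0 if a lobe has collapsed (no positive square fits).
    """
    lobe_upper = lobe_lower = 0.0
    for i in range(1, steps):
        for sign in (+1, -1):
            off = sign * vdd * i / steps          # line VR = VL + off
            x_a = bisect_root(lambda x: vtc_left(x) - x - off, -vdd, 2 * vdd)
            x_b = bisect_root(lambda x: vtc_right(x + off) - x, -vdd, 2 * vdd)
            if sign > 0:
                lobe_upper = max(lobe_upper, x_a - x_b)
            else:
                lobe_lower = max(lobe_lower, x_b - x_a)
    return min(lobe_upper, lobe_lower)


def make_vtc(vdd, gain, v_low=0.0):
    """Hypothetical logistic inverter curve with output-low level v_low."""
    vm = vdd / 2.0
    def vtc(v):
        return v_low + (vdd - v_low) / (1.0 + math.exp(gain * (v - vm) / vdd))
    return vtc


VDD = 1.0
hold_vtc = make_vtc(VDD, gain=20.0)               # word-line off
read_vtc = make_vtc(VDD, gain=20.0, v_low=0.15)   # assumed read disturb level
snm_hold = static_noise_margin(hold_vtc, hold_vtc, VDD)
snm_read = static_noise_margin(read_vtc, read_vtc, VDD)
```

With these assumed curves the read SNM comes out smaller than the hold SNM, reproducing qualitatively the degradation shown by the butterfly curves. Passing two different transfer curves models a mismatched cell, in which case the smaller lobe determines the margin.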

2.4.2 Write Margin (WM)

Write margin is defined as V_DD − MIN[V(WWL)], where MIN[V(WWL)] is the minimum write-word-line voltage required to flip the bit-cell. The higher the write margin, the more easily data is written into the bit-cell. Fig. 2.13 shows a corresponding example of finding the write margin, which is read as the V_DD − V_WL value at the point where VR and VL flip. The value and variation of the write margin are a function of the cell design, SRAM array size, and process variation. A cell is considered not writable if the worst-case write margin becomes lower than the ground potential.

[Figure: write margin plot with horizontal axis VWL (V)]

Fig. 2.13 Write margin of a SRAM bit-cell
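The write margin definition translates directly into a word-line sweep. In the sketch below, `cell_flips` is a hypothetical stand-in for a circuit simulation that reports whether the cell flips at a given word-line voltage; the flip thresholds used in the example are invented for illustration. The sweep is allowed to exceed V_DD (up to an assumed `v_max`) so that a negative margin, meaning an unwritable cell, can be reported.

```python
def write_margin(vdd, cell_flips, v_max=None, steps=1500):
    """Return V_DD - MIN[V(WWL)], or None if the cell never flips.

    cell_flips(v_wl) -> bool is assumed to be monotonic: once the cell
    flips at some word-line voltage, it flips at every higher voltage.
    """
    if v_max is None:
        v_max = 1.5 * vdd                 # assumed sweep limit above V_DD
    for i in range(steps + 1):
        v_wl = v_max * i / steps
        if cell_flips(v_wl):
            return vdd - v_wl
    return None


# Hypothetical cells: one flips at 0.62 V, one needs 1.10 V on the word-line.
wm_good = write_margin(1.0, lambda v_wl: v_wl >= 0.62)
wm_bad = write_margin(1.0, lambda v_wl: v_wl >= 1.10)
writable = wm_good is not None and wm_good > 0.0
```

The first cell has a margin of about 0.38 V, while the second yields a negative margin because it needs a word-line voltage above V_DD, which corresponds to the not-writable condition stated above.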

2.4.3 Impact of Variation on SRAM in Low Voltage

Differential 6T SRAM

The 6T bit-cell fails to operate at ultra-low voltages because of reduced signal levels and increased sensitivity to random dopant fluctuation. In this configuration, read and write accesses impose opposing requirements, which makes it very difficult to overcome the severe effects of variation and manufacturing defects. Fig. 2.14 shows the β ratios of the 6T SRAM bit-cell; the conflict among them is described below.

[Figure: 6T bit-cell schematic with BL, WL, and BLb]

During a read access the cell must remain bi-stable so that both logic values can be held and read without being upset by the read disturb that occurs at the internal nodes. To facilitate reads and minimize read disturb, the β2 ratio should be small enough, which calls for a strong PD nMOS and a weak AX nMOS. During a write access the cell should be made mono-stable so that the desired data can be written. To improve writability, the β3 ratio must be large, which calls for a strong AX nMOS and a weak PUP pMOS.

To improve writability and minimize read disturb simultaneously, the transistors can be sized as PD > AX > PUP. However, this degrades the β1 ratio and hence V_TRIP, resulting in poor read SNM. These three β ratios therefore conflict with one another, and sizing alone cannot solve the 6T SRAM failures.

Hold and Read Failure

A hold failure is the destruction of the cell content in standby mode at a low supply voltage. A higher trip point of the back-to-back inverters makes the cell easier to flip, thereby increasing the hold failure probability. As shown in Fig. 2.15, the hold SNM is preserved down to very low voltages and forms the basis for several of the ultra-low-voltage bit-cell designs described in sections 2.5 and 2.6.

Fig. 2.15 6T SRAM SNM loss at low voltages [2.39]

If the data stored in an SRAM cell flips during a read, a read failure occurs. If the voltage at the node storing “0” rises above the trip point of the back-to-back inverters, the data stored in the cell flips. Fig. 2.15 shows that the 6T SRAM bit-cell fails to operate at low voltages because of reduced signal levels and increased variation. At low voltages, the read SNM is negative, indicating loss of stability.

Write Failure

If the data stored in an SRAM cell cannot be flipped during a write, a write failure occurs. While writing “0” to a node storing “1”, the voltage at the node needs to be discharged below the trip point of the back-to-back inverters. As shown in Fig. 2.16, the write margin is also lost at low voltage, where a positive value, in this case, indicates write failures.

Fig. 2.16 6T SRAM write margin [2.39]

Access Failure

If the voltage difference between the two bit-lines (dual-end) or the voltage drop of the single bit-line (single-end) cannot be sensed by the sense amplifier during the
