
Crosstalk-Insensitive Via-Programming ROMs Using Content-Aware Design Framework

Meng-Fan Chang, Lih-Yih Chiou, and Kuei-Ann Wen, Senior Member, IEEE

Abstract—Various code patterns of a via-programming read-only memory (ROM) cause significant fluctuations in coupling noise between bitlines (BLs). This crosstalk between BLs leads to read failure in high-speed via-programming ROMs and limits the coverage of applicable code patterns. This work presents a content-aware design framework (CADF) for via-programming ROMs to overcome the crosstalk-induced read failure. The CADF ROMs employ a content-aware structure and a corresponding code-structure programming algorithm to reduce the amount of coupling noise source while maintaining a nonminimal BL load for crosstalk reduction. A 256-Kb conventional ROM and a 256-Kb CADF ROM were fabricated using a 0.25-µm logic CMOS process. The measured results confirm that the crosstalk-induced read failure is suppressed significantly by CADF. The CADF ROM also reduced power consumption by 86.2% and standby current by 94.5% compared to the conventional ROM.

Index Terms—Code patterns, crosstalk, read-only memory (ROM).

I. INTRODUCTION

READ-ONLY memories (ROMs) are commonly embedded into system-on-chip (SoC) designs for storing programs and predefined data. Compared to ROMs coded by diffusion and poly layers, via-programming ROMs (via-ROMs) shorten the manufacturing turnaround time after code modification and are popular in today’s designs. Unfortunately, various code patterns in via-ROMs or contact-programming ROMs produce large fluctuations in bitline (BL) loading and coupling capacitances. To achieve a high speed, the wordline pulse width in a ROM is short, so the sensing margin becomes small and vulnerable to noise. This code-pattern-dependent crosstalk-induced read failure (CIRF) limits the speed and code-pattern coverage of via-ROMs.

To reduce power consumption and to improve the speed performance of ROMs, previous works employed schemes such as block/row inversion [1], precharge-discharge dynamic CMOS logic [2], charge recycling [3], and restricted bitline (BL) swing [4]–[6]. These studies did not address the CIRF across different code patterns. In this brief, we investigate this problem and present a content-aware design framework (CADF) to overcome CIRF for high-speed via-ROMs. The CADF ROMs reduce the coupling noise source and maintain a nonminimum BL load to reduce the amount of crosstalk. Furthermore, the CADF ROMs consume less power and operate at a higher speed than conventional via-ROMs. To our knowledge, this is the first work to deal with the code-dependent read failure induced by crosstalk between BLs for via-ROMs.

Manuscript received June 22, 2005. This paper was recommended by Associate Editor R. Puri.

M.-F. Chang is with the Intellectual Property Library Company, Hsin Chu 300, Taiwan, R.O.C. (e-mail: [email protected]; [email protected]).

L.-Y. Chiou is with the Electrical Engineering Department, National Cheng-Kung University, Tainan 701, Taiwan, R.O.C. (e-mail: [email protected]).

K.-A. Wen is with the Institute of Electronics, National Chiao Tung University, Hsin Chu 300, Taiwan, R.O.C. (e-mail: [email protected]).

Digital Object Identifier 10.1109/TCSII.2006.873640

Fig. 1. Structure of conventional ROMs. The wordline (WL), ground line (VSS), code layer, and BL are implemented by poly, diffusion (OD), via, and metal-2, respectively.

The remainder of this brief is organized as follows. Section II analyzes the behavior of the code-dependent CIRF. Section III presents the proposed CADF. Section IV presents the experimental results. Section V draws conclusions.

II. CROSSTALK-INDUCED READ FAILURE

Crosstalk between bitlines erodes the sensing margin and leads to read failure in high-speed via-ROMs. The code-dependent CIRF limits the coverage of applicable code patterns in via-ROMs.

A. Sensing Margin

In conventional via-ROMs, as shown in Fig. 1, the value of the stored datum is determined by the connection to a via layer. A bit cell with a via that connects its nMOS transistor to its BL stores the datum 0 (0-cell); a cell without such a via stores the datum 1 (1-cell). When a row is activated by the relevant wordline (WL), the 0-cell sinks current from the BL, while the 1-cell does not sink any current. For a 0-cell, the parasitic capacitance on the drain side of the transistor is connected to its BL. For a 1-cell, the parasitic capacitance on the drain side of the transistor is not seen by its BL.

In ROMs, all BLs are precharged to a predetermined voltage prior to the data-sensing phase of a cycle. In the data-sensing phase, a BL is discharged to develop a voltage drop $V_{BL}$ for reading a 0-cell, or remains at the precharged voltage, $V_{PRE}$, for reading a 1-cell. To differentiate a 0-cell from a 1-cell, the sense amplifier needs a reference voltage $V_{REF}$ whose value must be set between $V_{PRE} - V_{BL}$ and $V_{PRE}$. Clearly, the sensing margin for reading a 0-cell is $SM_0 = V_{REF} - (V_{PRE} - V_{BL})$ and the sensing margin for reading a 1-cell is $SM_1 = V_{PRE} - V_{REF}$. The fixed-value (FV) scheme [4] and the half-rate BL-tracking (HRBLT) schemes [5], [6] are the two most popular ways of providing $V_{REF}$ in ROMs. The FV scheme, which holds $V_{REF}$ at a fixed value during the entire data-sensing phase, is employed in this work since it is suitable for experiments. If a coupling noise drop $V_{CP}$ happens on a BL, it may reduce the value of $SM_1$.

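Various code patterns cause different amounts of parasitic capacitance on BLs in via-ROMs [7]. The BL whose bit cells are all 0-cells incurs the largest load in via-ROMs. Given the wordline pulse width $T_{WL}$ and the cell current $I_{cell}$, the voltage drop of a bitline $BL_i$ can be derived as

$$V_{BL,i} = \frac{I_{cell}\, T_{WL}}{C_{BL,i} + C_{C,i-1} + C_{C,i+1}} \tag{1}$$

where $C_{BL,i}$ is the intrinsic load of $BL_i$. $C_{C,i-1}$ and $C_{C,i+1}$, as shown in Fig. 1, are the coupling capacitances between $BL_i$ and its two adjacent BLs, $BL_{i-1}$ and $BL_{i+1}$. For a given $T_{WL}$, the BLs with large $C_{BL}$ have small $V_{BL}$.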
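As a rough numerical illustration of how the BL load sets the voltage drop, the sketch below assumes a first-order charge model (drop = cell current × wordline pulse width / total BL capacitance), clipped at the precharge level. The function name and every capacitance and current value are hypothetical placeholders chosen for illustration, not figures from this brief; only the 2-ns pulse width and the 512-cell BL appear in the text.

```python
def bl_voltage_drop(n_zero_cells, t_wl, i_cell, c_wire, c_cell, c_couple, v_pre):
    """First-order estimate of the voltage drop (V) on a discharging bitline.

    The intrinsic load grows with the number of 0-cells, because only
    0-cells connect their drain capacitance to the bitline.
    """
    c_total = c_wire + n_zero_cells * c_cell + 2 * c_couple
    # The bitline cannot drop by more than its precharge voltage.
    return min(v_pre, i_cell * t_wl / c_total)


# Hypothetical device values (assumptions, not measured data).
params = dict(t_wl=2e-9, i_cell=50e-6, c_wire=50e-15,
              c_cell=1e-15, c_couple=20e-15, v_pre=2.5)

light = bl_voltage_drop(1, **params)     # one 0-cell, 511 1-cells
heavy = bl_voltage_drop(512, **params)   # 512 0-cells
print(f"light load: {light:.3f} V, heavy load: {heavy:.3f} V")
```

With these placeholder values the lightly loaded BL swings several times further than the fully loaded one, which is the load dependence the text describes.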

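To detect all bit cells correctly, the minimum of the BL voltage drop $V_{BL}$, denoted as $V_{BL,\min}$, and the maximum of the coupled voltage drop $V_{CP}$, denoted as $V_{CP,\max}$, deserve our attention. Across code patterns, these voltage drops must satisfy the inequalities

$$V_{BL,\min} > V_{PRE} - V_{REF} \quad \text{and} \quad V_{CP,\max} < V_{PRE} - V_{REF}$$

simultaneously, because the sensing margins for reading a 0-cell and a 1-cell must both be larger than zero. From (1), the wordline pulse width $T_{WL}$ must be long enough to generate a $V_{BL,\min}$ that satisfies the first inequality.

B. Crosstalk Between BLs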

In nominal ROMs, all bit cells on the same row are controlled by the same wordline. Thus, voltage drops $V_{BL}$ develop on both the selected and unselected BLs during the data-sensing phase. The unselected bitlines $BL_{i-1}$ and $BL_{i+1}$, which are the neighboring BLs of $BL_i$ as shown in Fig. 1, have $V_{BL}$ developed by 0-cells when the WL is on. The $V_{BL}$ on $BL_{i-1}$ and $BL_{i+1}$ generates a coupled voltage drop $V_{CP}$ through the coupling capacitances onto the selected bitline $BL_i$. Accordingly, the neighboring BLs are the aggressors and the selected BLs are the victims in crosstalk effects.

Fig. 2(a) shows the simulated waveforms of BLs without the crosstalk effect, with $V_{PRE} = 2.5$ V and $V_{REF} = 2.25$ V. Two of the BLs read 0-cells, one with a light BL load (511 1-cells) and one with a heavy BL load (512 0-cells). The BL that reads a 1-cell without any crosstalk retains $V_{PRE}$ during the data-sensing phase and yields the correct sensing result. Fig. 2(b) shows an example of CIRF. The coupled voltage drop $V_{CP}$ pulls the voltage on the victim BL, which reads a 1-cell, below $V_{REF}$. A 0-cell is then mistakenly detected by the sense amplifier instead of the expected 1-cell for the victim BL.

Coupling capacitance between BLs, BL load, and the amplitude of the aggressor voltage are the key parameters for the crosstalk effect between BLs. Unfortunately, the coupling capacitance, which is determined by the spacing between BLs, becomes larger as the minimum spacing between metal lines shrinks in advanced technology nodes.

Fig. 2. Simulation waveforms of BLs (512 cells) in conventional ROMs: (a) without and (b) with crosstalk. $V_{PRE}$ = 2.5 V and $V_{REF}$ = 2.25 V.

Fig. 3. Simulated $V_{BL}$ and $V_{CP}$ versus the number of 0-cells on a BL. (V = 0 to 2.5 V, V = 2.5 V.)

As discussed earlier, various code patterns on a BL result in various BL loads $C_{BL}$ and voltage drops $V_{BL}$ on BLs for a given wordline pulse width $T_{WL}$. The maximum of $V_{BL}$, denoted as $V_{BL,\max}$, could be a full swing of the precharge voltage $V_{PRE}$ for a BL with minimum load. $V_{BL}$ is only a few hundred millivolts for a BL with maximum load. These various $V_{BL}$ on unselected BLs, acting as the aggressor voltages in crosstalk, generate various levels of coupled voltage drop $V_{CP}$ on their neighboring BLs. Moreover, a long $T_{WL}$ results in a large $V_{BL}$ for a BL. Hence, both $T_{WL}$ and code patterns affect $V_{BL}$ and $V_{CP}$. Furthermore, the crosstalk between BLs also depends on the load $C_{BL}$ of the victim BLs. The coupled voltage on a victim bitline $BL_i$ is derived in (2). $V_{CP}$ on victim BLs is large for a small intrinsic BL load (fewer 0-cells) across various code patterns:

$$V_{CP,i} = \frac{C_{C,i-1}\, V_{BL,i-1} + C_{C,i+1}\, V_{BL,i+1}}{C_{BL,i} + C_{C,i-1} + C_{C,i+1}} \tag{2}$$
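To make the dependence on the victim load concrete, the following sketch assumes a simple capacitive-divider (charge-sharing) model for a floating victim BL with equal coupling capacitance to both neighbors. The model, the function names, and all capacitance values are assumptions for illustration and are not taken from the brief.

```python
def coupled_voltage_drop(v_aggr_left, v_aggr_right, c_victim, c_couple):
    """Charge-sharing estimate of the voltage drop coupled onto a victim bitline.

    v_aggr_left / v_aggr_right: voltage drops on the two neighboring bitlines.
    c_victim: intrinsic load of the victim bitline.
    c_couple: coupling capacitance to each neighbor (assumed equal).
    """
    return c_couple * (v_aggr_left + v_aggr_right) / (c_victim + 2 * c_couple)


# Hypothetical capacitances (assumptions, not measured data).
C_WIRE, C_CELL, C_COUPLE = 50e-15, 1e-15, 20e-15


def victim_load(n_zero_cells):
    """Intrinsic bitline load for a given number of 0-cells."""
    return C_WIRE + n_zero_cells * C_CELL


# Both aggressors swing the full 2.5 V; only the victim's load changes.
light_victim = coupled_voltage_drop(2.5, 2.5, victim_load(0), C_COUPLE)
heavy_victim = coupled_voltage_drop(2.5, 2.5, victim_load(512), C_COUPLE)
print(f"victim with 0 0-cells: {light_victim:.3f} V, with 512: {heavy_victim:.3f} V")
```

Under this model a victim with few 0-cells receives the largest coupled drop, consistent with the trend stated in the text.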

Fig. 3 shows the simulated coupled voltage drop $V_{CP}$ and BL voltage drop $V_{BL}$ on a BL (with 512 cells) versus various BL loads. Since the sensing margins for both 0-cells and 1-cells must be larger than zero for correct sensing, $V_{BL,\min}$ and $V_{CP,\max}$ determine the applicable values of the reference voltage $V_{REF}$. The wordline pulse width employed in Fig. 3 is 2 ns. $V_{BL,\min}$ is the voltage swing of the BL that has the maximum load (512 0-cells). $V_{CP}$ on victim BLs can be smaller than $V_{BL,\min}$ if the aggressor voltage is small. For those victim BLs whose $V_{CP}$ is larger than $V_{BL,\min}$ (see the dilemma area in Fig. 3), no $V_{REF}$ can be found that correctly differentiates 0-cells from 1-cells. In this example, the victim BLs with fewer than 96 0-cells suffer read-1-cell failure if the aggressor voltage is equal to its maximum.
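The dilemma described here can be expressed as a check on whether any reference voltage exists at all. The sketch below is a minimal illustration under the stated sensing rule: a 0-cell is read correctly when the reference lies above the discharged BL level, and a 1-cell is read correctly when the reference lies below the noise-disturbed precharge level. The function name and the example voltages are hypothetical.

```python
def reference_window(v_pre, v_bl_min, v_cp_max):
    """Return the (low, high) range of usable reference voltages, or None.

    v_pre:    precharge voltage of the bitlines.
    v_bl_min: smallest voltage drop developed by any 0-cell read.
    v_cp_max: largest coupled voltage drop on any 1-cell read.
    """
    low = v_pre - v_bl_min    # reference must sit above a discharged 0-cell BL
    high = v_pre - v_cp_max   # and below a 1-cell BL disturbed by crosstalk
    return (low, high) if low < high else None


# Hypothetical cases: coupling smaller than the 0-cell swing leaves a window,
# coupling larger than the 0-cell swing leaves none (the dilemma area).
print(reference_window(2.5, 0.5, 0.2))
print(reference_window(2.5, 0.3, 0.6))
```

The second call returns `None` because the coupled drop exceeds the smallest 0-cell swing, so no single fixed reference can separate the two cases.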

In summary, a BL that has a small load and a large aggressor voltage on its neighbors suffers a large coupled voltage drop and CIRF. Various code patterns generate large fluctuations in BL loads. Because of this large fluctuation in BL loads in via-ROMs, the maximum BL voltage drop is large when a long wordline pulse width is required to satisfy a given reference voltage. Thus, CIRF depends on the data patterns on BLs and limits the coverage of applicable code patterns on a via-ROM.

III. CONTENT-AWARE DESIGN FRAMEWORK

To reduce the aggressor voltage and avoid a minimum BL load for crosstalk suppression in via-ROMs, the fluctuation in BL load must be further reduced. The CADF optimizes the structure and data patterns of via-ROMs to achieve high uniformity in parasitic capacitance and voltage swing on BLs across various code patterns. The CADF consists of the content-aware structure (CAS) and the code-structure programming algorithm (CSPA). The CAS provides the infrastructure for the CSPA.

A. CAS

The CAS comprises hybrid-segmented BLs (HSB), flag tables, and dual-path output drivers.

In HSB, each column (BL) has a base segment (base-BL) and numerous segments (sub-BLs). When a column is accessed, its base-BL and only one of its sub-BLs are accessed. There are $n$ bit cells in a base-BL. A column, excluding its base-BL, is physically divided into numerous segments by local BL switches (LBSs) according to the to-be-stored ROM code and the CSPA. The maximum number of divided segments on each column, $k$, is a predefined parameter based on the memory configuration, area limitation, and crosstalk consideration. The value of $k$ is equal to 2, 4, or 8 in CADF. Each sub-BL originally has $2m$ bit cells, with the exception of the bottom sub-BL. The bottom sub-BL has $(2m - n)$ bit cells and is next to the base-BL. Fig. 4 illustrates the CAS for a ROM macro organized in rows and columns. The value of $m$ is defined as

$$m = \begin{cases} \cdots & \text{if } \cdots \\ \cdots & \text{if } \cdots \end{cases} \tag{3}$$

Herein, the row step ($rs$) specifies the minimum step by which the number of rows increases due to the structure of leaf-cells in ROMs. The value of $rs$ is equal to 4 in this brief. The number of sub-BLs on a BL is derived by the CSPA and varies from column to column if the code patterns on BLs are different. Namely, the cell array consists of BLs that are hybrid segmented.
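The initial segmentation of one column can be sketched as a small helper, under the reading that every sub-BL starts with 2m bit cells, except the bottom sub-BL, which gives up n cells to the base-BL. The function name is hypothetical, and the rule for deriving m from the number of rows is not reproduced here.

```python
def column_layout(k, m, n):
    """Initial segment sizes of one column, listed from the base-BL upward.

    k: maximum number of sub-BLs per column.
    m: half the number of bit cells in a regular sub-BL.
    n: number of bit cells in the base-BL.
    """
    if k < 1 or not 0 < n < 2 * m:
        raise ValueError("expected k >= 1 and 0 < n < 2m")
    # base-BL, bottom sub-BL, then (k - 1) regular sub-BLs
    return [n, 2 * m - n] + [2 * m] * (k - 1)


layout = column_layout(k=8, m=32, n=16)
print(layout, sum(layout))
```

For the parameters of the fabricated CADF ROM (k = 8, m = 32, n = 16) the segments add up to 512 cells per column, which agrees with the 512-cell BLs used in the simulations.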

An LBS, controlled by segment selection signals (SS), comprises two nMOS transistors and one shared contact that connects its top and bottom sub-BLs to a BL. The dummy LBS (DLBS), which has the same width and height as the LBS, has no device but a vertical metal layer to connect its top and bottom sub-BLs. The DLBS is used to replace an LBS in the CSPA.

Fig. 4. CAS of CADF ROM (with k = 4). A simplified array is included with selected/unselected BL.

For each BL, a $(k+1)$-bit table stores the flags that indicate the data-inversion status of each sub-BL and the base-BL. Dual-path output drivers use the output from the flag table to select the normal or inverted output path. If a flag signifies true for an accessed sub-BL, then the output driver selects the inversion path from the output of the sense amplifier, so the codes read from the CADF ROM are still correct.
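The role of the flag and the dual-path output can be illustrated with a behavioral model: data are optionally stored inverted, and the read path undoes the inversion with an exclusive-OR against the flag. This is only a logical sketch with hypothetical function names; it says nothing about the circuit implementation of the output driver.

```python
def program_segment(code_bits, invert):
    """Return the stored cell values and the flag bit for one segment."""
    stored = [1 - b for b in code_bits] if invert else list(code_bits)
    return stored, int(invert)


def read_bit(sensed_bit, flag):
    """Dual-path output: pass the sensed bit through, or invert it if flagged."""
    return sensed_bit ^ flag


code = [0, 0, 0, 1]
stored, flag = program_segment(code, invert=True)
recovered = [read_bit(b, flag) for b in stored]
print(stored, flag, recovered)
```

Whatever the flag value, the bits delivered at the output equal the original ROM code, while the number of 0-cells physically programmed on the segment changes.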

B. CSPA

A four-step algorithm, CSPA, was developed for code programming and sub-BL structuring in CADF. The CSPA reduces fluctuations in BL load while maintaining a minimum amount of BL load. The procedure of the proposed CSPA is explained as follows.

Step 1) The input ROM code is initially programmed into HSB based on the user-defined parameters $k$ and $n$.

Step 2) If the number of bit cells with code 0 on a sub-BL exceeds $m$, then the CSPA assigns 1-cells for the bit cells with code 0 on this sub-BL. The corresponding flag bit of this sub-BL is set to 1. Otherwise, the CSPA assigns 0-cells for the bit cells with code 0. The corresponding flag bit of this sub-BL is set to 0.

Step 3) If the number of 0-cells on consecutive sub-BLs does not exceed $m$, these sub-BLs are merged. The LBSs assigned at Step 1) for these merged sub-BLs are replaced by DLBSs.

Step 4) For each column, if the numbers of 0-cells on a sub-BL and on its base-BL are both smaller than a given threshold, then the CSPA assigns 1-cells for the bit cells with code 0 on the base-BL of this column. The flag bit of this base-BL is set to 1.
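Steps 2) and 3) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the thresholds are passed in as parameters, the merge is done greedily from the bottom sub-BL upward (an assumed order that the text does not specify), and Steps 1) and 4) are omitted. The function names and the example column are hypothetical.

```python
def invert_heavy_segments(sub_bls, threshold):
    """Step 2 (sketch): invert every sub-BL whose code has more than
    `threshold` zeros, and record one flag bit per sub-BL."""
    stored, flags = [], []
    for bits in sub_bls:
        invert = bits.count(0) > threshold
        stored.append([1 - b for b in bits] if invert else list(bits))
        flags.append(int(invert))
    return stored, flags


def merge_light_segments(stored, threshold):
    """Step 3 (sketch): group consecutive sub-BLs while their combined
    number of 0-cells stays within `threshold`. Each returned group of
    indices shares one physical segment (no switch between its members)."""
    groups, current, zeros = [], [], 0
    for idx, bits in enumerate(stored):
        z = bits.count(0)
        if current and zeros + z > threshold:
            groups.append(current)
            current, zeros = [], 0
        current.append(idx)
        zeros += z
    if current:
        groups.append(current)
    return groups


# Hypothetical column with four sub-BLs, listed from the bottom upward.
column = [[0, 0, 1], [0, 0, 0, 1], [1, 1, 1, 0], [0, 0, 0, 0]]
stored, flags = invert_heavy_segments(column, threshold=2)
groups = merge_light_segments(stored, threshold=2)
print(flags, groups)
```

In this example two sub-BLs are stored inverted, and the upper three sub-BLs end up in one merged segment because together they hold only two 0-cells.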

The merge activity in Step 3) makes the number of 0-cells on each sub-BL as close to $m$ as possible, to reduce the fluctuation in accessed BL load across cycles. Step 4) ensures that the minimum load of a BL stays above a nonzero floor, to reduce the crosstalk between BLs. After Step 4), both the maximum and the minimum number of 0-cells on a BL in the CADF ROM are bounded. In conventional ROMs and the inverted ROM [1], by contrast, the minimum number of 0-cells on a BL is zero.

Fig. 5. Example of CSPA, where k = 4, m = 2, and n = 1.

In the example of CSPA depicted in Fig. 5, a BL was initially divided into a base-BL and four sub-BLs by two LBSs. In Step 2), the data in sub-BL[0], sub-BL[1], and sub-BL[3] were inverted because their numbers of code 0 were all larger than $m$. Flag0, Flag1, and Flag3 were set to 1. In Step 3), since the total number of 0-cells on sub-BL[3] through sub-BL[1] did not exceed $m$, these three sub-BLs were merged and only one LBS was used for this column instead of two. The removed LBS was replaced by a DLBS. In Step 4), since the numbers of 0-cells on both sub-BL[0] and the base-BL did not exceed the threshold, the data on the base-BL were inverted and Flag-base was set to 1. Hence, the load on a CADF BL was one or three 0-cells across cycles, significantly smaller than that of the original code (eleven 0-cells) in conventional ROMs.

Therefore, the data patterns and the structure of HSB are “smartly” programmed based on the ROM code. Despite the small load on a BL, the fluctuation in BL load across various code patterns in the CADF ROM is much smaller than that in the conventional ROM.

C. Crosstalk Suppression by CADF

Since CADF ROMs achieve small BL loads, only a short wordline pulse width $T_{WL}$ is needed to generate the required minimum BL voltage drop $V_{BL,\min}$ for a given reference voltage $V_{REF}$. Generally, $V_{BL,\min}$ is equal to a few hundred millivolts in high-speed ROMs. On the other hand, the difference between the maximum BL voltage drop $V_{BL,\max}$ and $V_{BL,\min}$ in CADF ROMs is small due to the high uniformity of BL loads across various code patterns. The smaller the fluctuation of the load on a BL, the smaller the $V_{BL,\max}$ that can be achieved in the CADF ROM. Thus, $V_{BL,\max}$ can be limited to a few hundred millivolts in CADF ROMs, rather than the full swing of the precharge voltage as in conventional ROMs. A large minimum BL load results in a small $V_{BL,\max}$, as derived in (1), and a small coupled voltage drop across code patterns. Therefore, the maximum coupled voltage drop is significantly reduced in CADF ROMs when the minimum BL load is large.

Fig. 6. Simulated maximum coupled voltage drops ($V_{CP,\max}$) on a victim BL (512 cells) with various values for k and n.

Fig. 7. Die photos of a 256-Kb conventional (Std) ROM with X pattern and a 256-Kb CADF ROM (k = 8, n = 16, and m = 32) with X pattern.

As discussed in Section III-B, the minimum number of 0-cells on a BL in the CADF ROM is nonzero, rather than zero as in conventional ROMs. This feature increases the minimum BL load and enhances the immunity to coupling noise in CADF ROMs. The optimized value of $n$ depends on the technology node and the other predefined design parameters.

As both $k$ and $n$ increase, the crosstalk reduction becomes more significant. Fig. 6 shows the simulated maximum coupled voltage drop on a victim BL (512 bit cells) with various values of $k$ and $n$. When one of the two parameters is small, increasing the other effectively reduces the coupled voltage on a victim BL. However, the reduction saturates as that parameter becomes large. Moreover, a large value of one parameter allows a small value of the other for a given tolerated coupled voltage.

With appropriate values of $k$ and $n$, which result in a small aggressor voltage and a nonminimum BL load, the maximum coupled voltage drop on a victim BL can always be smaller than the minimum BL voltage drop in CADF ROMs. A reference voltage can then be found without the dilemma area depicted in Fig. 3. Thus, the pattern-dependent CIRF can be avoided and the code-pattern coverage is increased by CADF.

IV. EXPERIMENTAL RESULTS

A 256-Kb CADF ROM ($k = 8$, $n = 16$, and $m = 32$) and a 256-Kb conventional ROM were fabricated in a 1P5M 0.25-µm CMOS technology, as shown in Fig. 7. The X pattern was applied to both experimental ROMs for investigating the CIRF.

Since the BL loads in the CADF ROM were much smaller than those in the conventional ROM, the maximum BL voltage drop is about 500 mV rather than 2.5 V as in the conventional ROM. Both the access time and the cycle time of the CADF ROM were also shorter than those of the conventional ROM thanks to the short precharge time and wordline pulse width. The access time of the 256-Kb CADF ROM was only 55.6% of that of the conventional ROM.

TABLE I
FUNCTIONALITY TEST OF FABRICATED ROMS WITH X PATTERN

The suppression of crosstalk effects by CADF was demonstrated through functional verification with the X pattern. Nine levels of crosstalk effect (including the minimum and maximum cases) are included in the X pattern, X1–X9. The aggressor BLs read 0-cells and the victim BLs read 1-cells during the testing. The measured results of the nine test patterns for the fabricated conventional (Std) and CADF ROMs are shown in Table I. The conventional ROM failed to sense X1, X4, X7, X8, and X9, while the fabricated CADF ROM passed the whole X pattern.

The measured standby current for the fabricated 256-Kb CADF ROM was only 0.02 µA, compared to 0.36 µA for the 256-Kb conventional ROM with X pattern, a reduction of 94.5%. Therefore, CADF is also suitable for nanometer technologies, where it helps resolve the subthreshold leakage issue because it has fewer leakage paths (0-cells in an activated column) in cell arrays than conventional ROMs.

The comparison of the performance of the 256-Kb CADF ROM with other low-power approaches is shown in Table II. Since different works reported various speed and power performance for various memory capacities, we adopt the power-delay product (PDP) per bit (PDP-bit) for a fair comparison. The maximum PDP-bit of the CADF ROM, 0.06 pJ per bit, was much smaller than those obtained in other reports.
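The normalization behind this metric can be written as a one-line helper, assuming PDP-bit is the product of power and delay divided by the memory capacity in bits. Which power figure and which delay the comparison uses is not stated in this section, so the function name, arguments, and example operating point below are hypothetical and are not measured results.

```python
def pdp_per_bit(power_w, delay_s, capacity_bits):
    """Power-delay product normalized by memory capacity, in joules per bit."""
    if capacity_bits <= 0:
        raise ValueError("capacity_bits must be positive")
    return power_w * delay_s / capacity_bits


KBIT = 1024

# Hypothetical operating point for a 256-Kb macro (not data from the paper).
example = pdp_per_bit(power_w=1e-3, delay_s=10e-9, capacity_bits=256 * KBIT)
print(f"{example:.3e} J/bit")
```

Dividing by capacity lets macros of different sizes be compared on one scale, which is the stated reason for adopting the metric.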

V. CONCLUSION

The code-pattern-dependent crosstalk-induced read failure has been investigated. A CADF is proposed to overcome the CIRF and improve the power and speed performance of via-ROMs. The fabricated 256-Kb CADF ROM demonstrated its effectiveness and achieved 100% code coverage under high-speed operation. The fabricated CADF ROM also reduces power consumption by 86.2% and standby current by 94.5% with a 2.7% area penalty compared to the fabricated conventional ROM. The CADF ROM improved the access time by 55.6%. Furthermore, the CADF had a smaller power-delay product to memory capacity ratio than previous techniques.

TABLE II
COMPARISONS OF EXPERIMENTAL ROMS AND PREVIOUS REPORTS

REFERENCES

[1] E. de Angel and E. E. Swartzlander, Jr., “Survey of low-power techniques for ROMs,” in Proc. IEEE Int. Symp. Low Power Electron. Des., Aug. 1997, pp. 20–20.

[2] C.-R. Chang, J.-S. Wang, and C.-H. Yang, “Low-power and high-speed ROM modules for ASIC applications,” IEEE J. Solid-State Circuits, vol. 36, no. 10, pp. 1523–1523, Oct. 2001.

[3] B. D. Yang and L. S. Kim, “A low-power ROM using charge recycling and charge sharing technique,” IEEE J. Solid-State Circuits, vol. 38, no. 4, pp. 653–653, Apr. 2003.

[4] A. Tuminaro, “A 400 MHz, 144-Kb CMOS ROM macro for an IBM S/390-class microprocessor,” in Proc. IEEE Int. Conf. Comp. Des., Oct. 1997, pp. 255–255.

[5] M.-F. Chang, K.-A. Wen, and D.-M. Kwai, “Supply and substrate noise tolerance using dynamic tracking clusters in configurable memory designs,” in Proc. IEEE Int. Symp. Qual. Electron. Des., Mar. 2004, pp. 302–302.

[6] T. Tsang, “A compilable read-only-memory library for ASIC deep submicron applications,” in Proc. IEEE 11th Int. Conf. VLSI Design, Chennai, India, Jan. 1998, pp. 494–494.

[7] M.-F. Chang, L.-Y. Chiou, and K.-A. Wen, “A low supply noise content-sensitive ROM architecture for SOC,” in Proc. IEEE Asia-Pacific Conf. Circuits Syst., Dec. 2004, pp. 1024–1024.

[8] R. Sasagawa, I. Fukushi, M. Hamaminato, and S. Kawashima, “High-speed cascode sensing scheme for 1.0 V contact-programming mask ROM,” in Proc. IEEE Symp. VLSI Circuits, 1999, pp. 96–96.
