Chapter 6 Power Integrity for Heterogeneous 3D Integration (Case Study)
6.3 Power Delivery for the Processor Memory Stack
6.3.3 TSV planning
The power TSVs of the processor memory stack structure can be categorized into three groups by their supply voltages. The first group, named the major power TSV group, delivers the unregulated 1.1V supply to the local voltage regulators of the multi-core processor, SRAM, and DRAM array cells. The second and the third power TSV groups are used as the global power supply sources of the DRAM peripheral circuits and the front-end circuits, respectively.

Circuits              Current load model     IMAX     ILeakage   (Tr, Tf, T)
Per processor core    Triangular waveform    100mA    30mA       (100ps, 150ps, 500ps)
Per processor block   Triangular waveform    22.5mA   9mA        (1ns, 1.5ns, 5ns)
128Mb DRAM (cell)     Triangular waveform    750mA    250mA      (1.5ns, 1.5ns, 5ns)
128Mb DRAM (peri.)    Triangular waveform    555mA    165mA      (1.5ns, 1.5ns, 5ns)
Front-End circuit     Sinusoidal waveform    200mA    80mA       (N/A, N/A, 400ps)
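The triangular current load model listed for the digital circuits can be sketched as a small function. This is only an illustration under an assumed waveform shape: the current ramps linearly from ILeakage to IMAX over Tr, ramps back down over Tf, and stays at ILeakage for the rest of the period T. The function name and the choice of picoseconds and milliamperes as units are hypothetical.

```python
def triangular_load(t_ps, i_max_ma, i_leak_ma, t_rise_ps, t_fall_ps, period_ps):
    """Return the load current (mA) at time t_ps for a periodic triangular load.

    Assumed shape (not taken from the source): linear rise from leakage to
    peak over t_rise_ps, linear fall over t_fall_ps, leakage for the remainder.
    """
    tau = t_ps % period_ps  # position inside the current period
    swing = i_max_ma - i_leak_ma
    if tau < t_rise_ps:
        return i_leak_ma + swing * tau / t_rise_ps
    if tau < t_rise_ps + t_fall_ps:
        return i_max_ma - swing * (tau - t_rise_ps) / t_fall_ps
    return i_leak_ma


# Per processor core: IMAX = 100mA, ILeakage = 30mA, (Tr, Tf, T) = (100ps, 150ps, 500ps)
core = dict(i_max_ma=100, i_leak_ma=30, t_rise_ps=100, t_fall_ps=150, period_ps=500)
samples = [triangular_load(t, **core) for t in range(0, 500, 50)]
```

Sampling such a function on the simulation time step gives a current source that can be attached to a power grid node.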
To minimize the area occupied by the power TSV groups, an area-efficient power TSV planning method is proposed in Chapter 3. Recalling the equation derived in Section 3.3.4, the supply noise in a T-layer 3D structure can be estimated by the following equation.
(6.1).
In this structure, the power TSVs are assumed to be filled with copper, and their height is set to 100μm. The other related design parameters are shown in Table 6.3. By substituting the parameters into equation (6.1), we can obtain the most appropriate number of pairs and diameter for each power TSV group.
Table 6.3. Parameters and results of TSV planning.
TSV group Current loads Tr Vtolerance
For example, to plan the third power TSV group, which is the global supply source of the front-end circuits, Isupply=200mA and Tr=0.05ns are substituted into equation (6.1). Since analog circuitry is much more sensitive to coupling noise than digital circuitry, the tolerable voltage of the front-end circuits is set to 50mV, whereas the tolerable voltages of the digital circuits are set to 100mV. After multiple iterations with equation (6.1), six pairs with a diameter of 10μm were chosen for the power TSV group used for the front-end circuits.
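As a rough sanity check on the chosen front-end TSV group, the DC resistance of a copper-filled cylindrical TSV can be computed from its geometry. This sketch is not equation (6.1): it covers only the resistive (IR) part of the drop and ignores the inductive term that depends on the switching time. The copper resistivity value and the function names are assumptions introduced here.

```python
import math

RHO_CU_OHM_M = 1.68e-8  # assumed bulk copper resistivity at room temperature


def tsv_dc_resistance(height_m, diameter_m, rho=RHO_CU_OHM_M):
    """DC resistance of one cylindrical TSV: R = rho * h / (pi * r^2)."""
    area = math.pi * (diameter_m / 2.0) ** 2
    return rho * height_m / area


def group_ir_drop(i_supply_a, pairs, height_m, diameter_m):
    """IR drop over a TSV group, counting both the supply and return paths.

    Each path is modeled as `pairs` identical TSVs in parallel.
    """
    r_path = tsv_dc_resistance(height_m, diameter_m) / pairs
    return 2.0 * i_supply_a * r_path


# Front-end group: 200mA, six pairs, 10um diameter, 100um height
r_single = tsv_dc_resistance(100e-6, 10e-6)    # roughly 21 milliohms
drop = group_ir_drop(0.2, 6, 100e-6, 10e-6)    # roughly 1.4 mV
```

Under these assumptions the resistive drop is small compared with the 50mV tolerance, which is in line with the observation that the switching time, rather than the total current, dominates the supply noise.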
Similar results can be obtained for the major TSV group and the TSV group used for the DRAM peripheral circuits. Note that, although the current load of the DRAM peripheral circuits is 2.5 times that of the front-end circuits, the second power TSV group still needs fewer pairs than the third power TSV group. This is because the switching time has a stronger influence on the supply noise than the total current load.
Once the number of pairs and the diameter of each power TSV group have been decided, the three power TSV groups are applied to the processor memory stack structure. Since the architecture is symmetrical with respect to the x-axis, we reserve the power TSV regions either at both sides or in the middle of each tier for TSV placement.
Considering the floorplan of the architecture, the power TSV group used for the DRAM peripheral circuits is placed in the middle of the stack. The power TSV group used for the front-end circuits, on the other hand, is divided into two bundles, which are placed at both sides of the stack. Finally, the major power TSV group is evenly distributed within the remaining power TSV regions.
6.4 Simulation Results
The 3D heterogeneous integration of the processor memory stack is simulated based on UMC 65nm CMOS technology and the TSV model of [6.12]. The nominal supply voltage for the processor, SRAM blocks, and DRAM array cells is 1.0V, while the nominal supply voltages for the DRAM peripheral circuits and the front-end circuits are 1.35V and 1.2V, respectively. A total of 14 power pins surround this structure: 8, 4, and 2 power pins connect to the external 1.0V, 1.35V, and 1.2V power supplies, respectively. Four of the 1.0V power pins are located at the sides and the other four are located in the middle of the stack. All four 1.35V power pins are located in the middle of the stack, and the 1.2V power pins are located at both sides, with one power pin on each side.
The footprint of this structure is assumed to be 2mm x 2mm, and the width and the pitch of the local power lines are 10μm and 100μm, respectively. Thus, the power grid of the processor memory stack has a resolution of 20 x 20 nodes in each layer. By evenly distributing the current profiling models over the corresponding power grid nodes, we can capture the supply noise on the power delivery network and mimic the actual switching circuits.
To support such a multiple-voltage-domain system, the techniques presented in Chapter 3, Chapter 4, and Chapter 5 are combined into a hierarchical power delivery system that delivers low-noise, well-controlled power supplies to the 3D processor memory stack. Following the power design flow discussed in Chapter 3, the noise reduction techniques are applied to the stack sequentially.
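The grid resolution follows directly from the footprint and the power line pitch, and the same numbers give the current assigned to each node when a load is spread evenly. The sketch below assumes, for illustration only, that a load covers the whole layer; the function names are hypothetical.

```python
def grid_nodes_per_side(footprint_um, pitch_um):
    """Number of power grid nodes along one side of the die."""
    return footprint_um // pitch_um


def per_node_current_ma(total_ma, footprint_um=2000, pitch_um=100):
    """Current per node when total_ma is spread evenly over the whole layer.

    Assumption: the load occupies the full footprint; a block that covers
    only part of the layer would be spread over fewer nodes.
    """
    n = grid_nodes_per_side(footprint_um, pitch_um)
    return total_ma / (n * n)


nodes = grid_nodes_per_side(2000, 100)      # 2mm footprint, 100um pitch
front_end_node = per_node_current_ma(200)   # 200mA front-end peak load
```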
In the first step, the active DECAPs are used to reduce the resonant noise caused by the package and the TSVs. Fig. 6.11 shows the simulated power supply waveforms with and without active DECAPs. By using two active DECAPs for each VDD power pin on the bottom layer, the total noise of the 1.0V power pair (VDD and GND) is reduced from 75.86mV to 63.52mV. Similarly, the total noise of the 1.35V power pair is reduced from 75.62mV to 66.07mV, and that of the 1.2V power pair is reduced from 21.81mV to 15.17mV. Here, the noise means the root mean square (RMS) of the difference between a VDD or GND voltage and its nominal voltage. For example, the supply noise on a nominal 1.0V power supply is calculated as |VDDi-1.0|RMS. Thus, the total noise of the 1.0V power pair is calculated as |VDDi-1.0|RMS + |GNDi|RMS.
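The noise metric defined above can be written down in a few lines. This is a minimal sketch that assumes uniformly sampled waveforms given as plain lists of voltages; the function names and the sample values are illustrative and are not taken from the simulations.

```python
import math


def rms_deviation(samples, nominal):
    """RMS of the difference between sampled voltages and a nominal value."""
    return math.sqrt(sum((v - nominal) ** 2 for v in samples) / len(samples))


def total_pair_noise(vdd_samples, gnd_samples, vdd_nominal):
    """|VDD - nominal|RMS + |GND|RMS for one power supply pair."""
    return rms_deviation(vdd_samples, vdd_nominal) + rms_deviation(gnd_samples, 0.0)


# Made-up waveforms: VDD swings +/-30mV around 1.0V, GND swings +/-10mV around 0V
vdd = [1.03, 0.97, 1.03, 0.97]
gnd = [0.01, -0.01, 0.01, -0.01]
noise_v = total_pair_noise(vdd, gnd, 1.0)
```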
[Fig. 6.11 panels, values given as without/with active DECAPs: VDD_1.0V and GND (Processor), |VDD-1.0|RMS + |GND|RMS = 75.86mV/63.52mV; VDD_1.35V and GND (DRAM peri.), |VDD-1.35|RMS + |GND|RMS = 75.62mV/66.07mV; VDD_1.2V and GND (Front-End), |VDD-1.2|RMS + |GND|RMS = 21.81mV/15.17mV]
Fig. 6.11. Simulation waveforms of voltage performance while using active DECAPs (the blue line) and that without active DECAPs (the grey line).
Moreover, the global and the local power networks are decoupled by adaptively biased voltage regulators. Each power domain is powered by a dedicated voltage regulator that supplies the requested voltage. According to the workloads on the local power networks, 8, 4, 8, and 2 voltage regulators are used on layer 1 to layer 4, respectively. Since the output voltage of a voltage regulator is locked to its reference voltage through an error amplifier, the power supplies provided by the local voltage regulators are more stable than those connected to the power TSVs directly, as shown in Fig. 6.12. The noise reductions of the 1.0V, 1.35V, and 1.2V power supplies are 75.71%, 53.76%, and 76.06%, respectively.
Additionally, the active substrate decouplers (ASDs) are used to suppress the substrate noise and coupling noise in the 3D structure. According to the simulation results in Chapter 5, the ASDs should be distributed along the noise propagation path for better noise reduction. Thus, 8 ASDs are distributed around the ground TSVs in each layer to reduce the coupling noise further. Fig. 6.13 shows the simulation waveforms of the three ground supplies. By absorbing the substrate noise current and virtually shorting the substrate to the reference ground, the ASDs keep the ground supplies quiet. The noise reductions of the ground supplies for the processor, DRAM peripheral circuits, and front-end circuits are 59.05%, 22.89%, and 50.40%, respectively.
[Fig. 6.12 panels, time axis 50ns to 200ns, values given as TSV-connected/regulated: VDD_1.0V (Processor), |VDD-1.0|RMS = 34.01mV/8.26mV; VDD_1.35V (DRAM peri.), |VDD-1.35|RMS = 40.23mV/18.6mV; VDD_1.2V (Front-End), |VDD-1.2|RMS = 10.78mV/2.58mV]
Fig. 6.12. Simulation waveforms of voltage performance while using local voltage regulators (the blue line) and that connecting to power TSV directly (the grey line).
[Fig. 6.13 panels, time axis 50ns to 200ns, values given as without/with ASDs: GND (Processor), |GND|RMS = 33.46mV/13.70mV; GND (DRAM peri.), |GND|RMS = 25.73mV/19.84mV; GND (Front-End), |GND|RMS = 6.35mV/3.15mV]
Fig. 6.13. Simulation waveforms of voltage performance while using ASDs to suppress substrate noises (the blue line) and that without ASDs (the grey line).
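The reduction percentages quoted for the voltage regulators and the ASDs can be recomputed from the RMS values annotated in Fig. 6.12 and Fig. 6.13. The sketch below takes the definition of noise reduction to be (before - after) / before, which is an assumption; it reproduces the quoted figures to within rounding in the second decimal place.

```python
def noise_reduction_percent(before_mv, after_mv):
    """Relative noise reduction in percent, assuming (before - after) / before."""
    return 100.0 * (before_mv - after_mv) / before_mv


# (before, after) RMS values in mV, as annotated in Fig. 6.12 and Fig. 6.13
VDD_RMS_MV = {"1.0V": (34.01, 8.26), "1.35V": (40.23, 18.6), "1.2V": (10.78, 2.58)}
GND_RMS_MV = {"processor": (33.46, 13.70), "DRAM peri.": (25.73, 19.84), "front-end": (6.35, 3.15)}

vdd_reduction = {k: noise_reduction_percent(*v) for k, v in VDD_RMS_MV.items()}
gnd_reduction = {k: noise_reduction_percent(*v) for k, v in GND_RMS_MV.items()}

for name, value in {**vdd_reduction, **gnd_reduction}.items():
    print(f"{name}: {value:.2f}%")
```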
The overall comparison of the voltage performance with and without the hierarchical power delivery system is shown in Fig. 6.14. Thanks to the voltage regulators and ASDs, the supply voltages and the ground voltages become more reliable, with smaller voltage fluctuations. The noise on the 1.0V, 1.35V, and 1.2V power supply pairs is reduced by 70.51%, 45.71%, and 71.10%, respectively.
[Fig. 6.14 panels, time axis 50ns to 200ns, values given as without/with the hierarchical power delivery system: VDD_1.0V and GND (Processor), |VDD-1.0|RMS + |GND|RMS = 73.59mV/21.70mV; VDD_1.35V and GND (DRAM peri.), |VDD-1.35|RMS + |GND|RMS = 69.91mV/37.95mV; VDD_1.2V and GND (Front-End), |VDD-1.2|RMS + |GND|RMS = 18.69mV/5.4mV]
Fig. 6.14. Simulation waveforms of voltage performance while using the hierarchical power delivery system (the blue line) and that without hierarchical power delivery system (the grey line).
Fig. 6.15 shows the noise reductions achieved as the hierarchical power delivery system is applied step by step. When only the active DECAPs are used, the noise reductions of the 1.0V, 1.35V, and 1.2V power supply pairs are 16.26%, 12.62%, and 30.44%, respectively. When the voltage regulators are introduced into the 3D structure, the noise reductions of the 1.0V, 1.35V, and 1.2V power supply pairs improve greatly, to 47.38%, 41.29%, and 60.28%, respectively. Moreover, when the ASDs are adopted into the 3D structure, the noise reductions of the 1.0V, 1.35V, and 1.2V power supply pairs improve further, to 70.52%, 45.71%, and 71.10%, respectively. For a fair comparison, the extra decoupling capacitance used in the active DECAPs, voltage regulators, and ASDs is added as an equivalent capacitance to the simulations they are compared against. Taking Fig. 6.15(a) as an example, when the active DECAPs are removed from the processor memory stack, an equivalent capacitance of 400pF is filled into the freed area as an experimental control. Similarly, when the voltage regulators are removed, they are replaced by an equivalent capacitance of 100pF~200pF (depending on the total amount of decoupling capacitance used in each layer), and when the ASDs are removed, they are replaced by an equivalent capacitance of 60pF in each layer. All the equivalent capacitances are evenly distributed within the vacant area. As a result, the voltage regulators have the greatest effect on noise reduction in the hierarchical power delivery system.
Fig. 6.15. Noise reductions of each power supply pair (a) with active DECAPs only, (b) with active DECAPs and voltage regulators, and (c) with active DECAPs, voltage regulators, and ASDs.
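One way to read the step-by-step results is to treat the three percentages quoted for each supply pair as cumulative and to take the difference between consecutive steps as the increment contributed by the newly added technique. This is a simplifying assumption made here for illustration, since the techniques interact and their effects need not add linearly; the names below are hypothetical.

```python
# Noise reduction (%) after step (a), (b), and (c) of Fig. 6.15, as quoted in the text
CUMULATIVE_PERCENT = {
    "1.0V": (16.26, 47.38, 70.52),
    "1.35V": (12.62, 41.29, 45.71),
    "1.2V": (30.44, 60.28, 71.10),
}


def incremental_gains(cumulative):
    """Split cumulative reductions into per-technique increments (assumed additive)."""
    decap, with_regulators, with_asds = cumulative
    return {
        "active DECAPs": decap,
        "voltage regulators": with_regulators - decap,
        "ASDs": with_asds - with_regulators,
    }


gains = {pair: incremental_gains(values) for pair, values in CUMULATIVE_PERCENT.items()}
```

Under this reading the regulator step is the largest increment for the 1.0V and 1.35V pairs, while for the 1.2V pair the DECAP and regulator increments are of similar size.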
Fig. 6.16. Effective supply voltages across the 3D structure. (a) the effective supply voltages for processor (1.0V), and (b) the effective supply voltages for front-end circuits (1.2V)
Furthermore, Fig. 6.16 shows the effective supply voltages (VDDmin - GNDmax) of the power distribution networks across the 3D structure when the hierarchical power delivery system is used. For simplicity, only the simulation data of the bottom and the top layers are shown. The effective supply voltages across the processor tier are in the range of 0.968V~0.976V, as shown in Fig. 6.16(a), so the maximum voltage difference is only 8mV. The effective supply voltages across the front-end circuit tier are in the range of 1.185V~1.188V, as shown in Fig. 6.16(b), so the maximum voltage difference is only 3mV. Here, the magnitude of the effective supply voltage is affected by the locations of the switching circuits and the local voltage regulators. A node that is closer to a local voltage regulator has a smaller voltage drop.
Fig. 6.17 shows the power overhead breakdown of each power component. The total power overhead is 41.27mW, of which the active DECAPs consume 7.99mW, the ASDs consume 13.17mW, and the adaptively biased regulators consume 20.11mW. This power overhead is only 1.11% of the total power consumption (3.7W) of the processor memory stack.
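The effective supply voltage metric and the spread across a tier can be expressed as follows. This is a minimal sketch assuming sampled VDD and GND waveforms per node; the function names and the sample values in the first call are illustrative only.

```python
def effective_supply(vdd_samples, gnd_samples):
    """Worst-case effective supply at one node: VDDmin - GNDmax."""
    return min(vdd_samples) - max(gnd_samples)


def spread_mv(effective_voltages):
    """Maximum difference (mV) between the effective supplies of a set of nodes."""
    return (max(effective_voltages) - min(effective_voltages)) * 1000.0


# Made-up node waveforms (V)
node_eff = effective_supply([0.99, 0.985, 1.0], [0.0, 0.01, 0.005])

# Ranges reported for Fig. 6.16(a) and Fig. 6.16(b)
processor_spread = spread_mv([0.968, 0.976])
front_end_spread = spread_mv([1.185, 1.188])
```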
[Fig. 6.17 chart labels: adaptively biased regulators 20.11mW, ASDs 13.17mW, active DECAPs 7.99mW; total 41.27mW (1.11%)]
Fig. 6.17. Power overhead breakdown of each power component.
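The overhead figures reported for Fig. 6.17 can be checked with simple arithmetic. The sketch below uses only the values quoted in the text; the variable names are hypothetical.

```python
# Power overhead per component (mW), as reported for Fig. 6.17
OVERHEAD_MW = {
    "active DECAPs": 7.99,
    "ASDs": 13.17,
    "adaptively biased regulators": 20.11,
}
TOTAL_STACK_MW = 3700.0  # total power consumption of the stack (3.7W)

total_overhead_mw = sum(OVERHEAD_MW.values())
overhead_percent = 100.0 * total_overhead_mw / TOTAL_STACK_MW

print(f"total overhead: {total_overhead_mw:.2f} mW ({overhead_percent:.2f}% of the stack)")
```

The three components add up to the stated 41.27mW, and the ratio to 3.7W agrees with the reported 1.11% to within rounding.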
6.5 Summary
In this chapter, a case study of power integrity for 3D heterogeneous integration is analyzed. The heterogeneous integration is assumed to be a processor memory stack and is simulated with current profiling models. To support such a system with multiple supply voltages, the techniques presented in Chapter 3, Chapter 4, and Chapter 5 are combined into a hierarchical power delivery system that delivers multiple, low-noise, and well-controlled power supplies to the 3D heterogeneous integration.
Furthermore, the power integrity achieved with the proposed hierarchical power delivery system is compared step by step with that of a general power delivery structure. The voltage regulators have the greatest effect on noise reduction in the hierarchical power delivery system, and the noise on the power supply pairs (VDD+GND) is reduced by up to 71.10%. Moreover, with an appropriate power delivery structure, the voltage difference is only 8mV within the entire processor tier. The power overhead of the hierarchical power delivery system is 1.11% of the total power of the 3D processor memory stack.
Chapter 7 Conclusion and Future Work
7.1 Conclusion
Three-dimensional (3D) integration technology can provide enormous advantages in achieving multi-functional integration, microminiaturizing form factor, improving system speed and reducing power consumption for future generations of ICs. However, stacking multiple dies would face a severe challenge of power integrity due to the increasing current density and parasitic impedance in TSV 3D-ICs.
Moreover, system heterogeneity offered by 3-D circuits has exacerbated the requirement for multiple, wide range, and well-controlled power supplies. In view of these, a hierarchical power delivery system is presented for the power integrity in TSV 3D-ICs.
The proposed hierarchical power delivery system decouples the global and local power networks by voltage regulator modules. The decoupled power delivery structure can reduce the required decoupling capacitance significantly. In addition, an area-efficient TSV planning method for choosing an appropriate diameter and number of power TSVs is proposed to optimize the area occupancy and the voltage drop.
In order to reduce the resonant noise caused by the package and power TSVs, an active switching DECAP is adopted in the hierarchical power delivery system as the global regulator. Furthermore, a wide-bandwidth linear voltage regulator with an adaptive biasing technique is proposed to cover a wide operating frequency range. This adaptively biased regulator enhances the transient response by increasing the bias current under heavy load, while keeping a low quiescent current to maintain high current efficiency under light load. To further examine the voltage fluctuations across the entire system, the placement of the voltage regulator modules and the sizing of the power delivery grids are also discussed.
In addition, a substrate noise suppression technique is presented for TSV 3D-ICs that considers both substrate and TSV coupling noise. This technique reduces noise using ASDs, which utilize a decoupling capacitor to absorb the substrate noise current. To achieve more effective noise reduction, ASD placement is also presented for different 3D structures.
A case study of the power integrity of a heterogeneous TSV 3D integration is also investigated in this thesis. With the proposed hierarchical power delivery system, the supply noise in the case study is reduced by up to 71.10% with only 1.11% power overhead. Accordingly, the hierarchical power delivery system can be easily adopted in a heterogeneous TSV 3D integration with only a small modification of the local power networks. Therefore, the proposed hierarchical power delivery system is very useful for the power integrity of heterogeneous integration in TSV 3D-ICs.
7.2 Future Work
System heterogeneity offered by 3D integration usually requires different supply voltages for different function blocks, ranging from high (3.3V or higher) to ultra-low (sub-threshold operation) voltages. This multiple-voltage requirement can be met by adopting the proposed hierarchical power delivery structure. As shown in Fig. 7.1, the first-layer power TSVs are connected to the power source, which supplies a high voltage. A clean high voltage is then provided to the high-voltage domain through a voltage regulator. Because of its inherent power efficiency limit, a linear regulator is not suitable for a large voltage conversion ratio, and an on-chip switching DC-DC converter is a better option [7.1]. The switching DC-DC buck converter can be positioned at the second layer of the power hierarchy, as shown in Fig. 7.1, to produce a lower voltage for further use. The converted low voltage is then fed to the low-voltage domains. For ultra-low-voltage (sub-threshold) domains, switched-capacitor DC-DC converters [7.2] can be adopted. By combining linear regulators, switching buck converters, and switched-capacitor DC-DC converters, high power efficiency can be achieved for heterogeneous integration over a wide range of conversion ratios.
Fig. 7.1. Hierarchical power delivery system for wide voltage range heterogeneous integrations.
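The efficiency limit of a linear regulator mentioned above can be made concrete. For an ideal linear regulator the input current equals the load current, so the efficiency cannot exceed Vout/Vin; quiescent current lowers it further. The sketch below computes this upper bound; the function name and the example conversion ratios are illustrative choices.

```python
def linear_regulator_max_efficiency(v_in, v_out):
    """Upper bound on linear regulator efficiency: Vout / Vin.

    Ignores quiescent current, so a real regulator is always below this bound.
    """
    if not 0 < v_out <= v_in:
        raise ValueError("a linear regulator needs 0 < v_out <= v_in")
    return v_out / v_in


# Small conversion ratio: 1.1V unregulated input to a 1.0V local supply
eff_small_step = linear_regulator_max_efficiency(1.1, 1.0)   # about 0.91

# Large conversion ratio: 3.3V input to a 1.0V domain
eff_large_step = linear_regulator_max_efficiency(3.3, 1.0)   # about 0.30
```

The second case wastes roughly 70% of the input power as heat, which is why a switching or switched-capacitor converter is preferred when the conversion ratio is large.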
In addition to power integrity, heat dissipation is also a severe challenge in 3D-ICs because of the increased power density and the poor thermal conductivity of the bonding materials. Excessively high temperature can significantly degrade interconnect and device reliability and performance. To control hot spots, temperature sensors can be integrated into the 3D integration. With the thermal feedback from the temperature sensors, the temperature of the circuit can be kept below a safe upper bound by lowering the system operating frequency. Such temperature-power management can therefore be adopted for TSV 3D-ICs, as shown in Fig. 7.2.
[Fig. 7.2 blocks: Temperature Sensor, Voltage Regulator, Clock Generator, Operation Table, System Workload]
Fig. 7.2. Temperature-power management diagram.
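The thermal feedback idea can be sketched as a single control step that lowers the clock frequency when a sensor reading reaches the upper bound and restores it once the chip has cooled. Every number below (temperature limit, hysteresis, frequency step, and frequency range) is a hypothetical placeholder and not a value from this work; the function name is also an assumption.

```python
def next_frequency_mhz(temp_c, freq_mhz, t_limit_c=85.0, hysteresis_c=5.0,
                       step_mhz=100.0, f_min_mhz=200.0, f_max_mhz=1000.0):
    """One step of a simple temperature-driven frequency controller.

    All thresholds and step sizes are hypothetical defaults for illustration.
    """
    if temp_c >= t_limit_c:
        # Too hot: slow down, but never below the minimum frequency.
        return max(f_min_mhz, freq_mhz - step_mhz)
    if temp_c <= t_limit_c - hysteresis_c:
        # Comfortably cool: speed back up, but never above the maximum.
        return min(f_max_mhz, freq_mhz + step_mhz)
    # Inside the hysteresis band: hold the current frequency.
    return freq_mhz


freq = 1000.0
for reading_c in (70.0, 86.0, 90.0, 83.0, 78.0):
    freq = next_frequency_mhz(reading_c, freq)
```

The hysteresis band keeps the controller from toggling the frequency on every sample when the temperature hovers near the limit.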
References
Chapter 1
[1.1] Yole Development. (2007). 3DIC & TSV Report [Online].
http://www.yole.fr/pagesan/products/reprot/sample/3dic.pdf
[1.2] W. Davis, J. Wilson, S. Mick, J. Xu, H. Hua, C. Mineo, A. Sule, M. Steer, and P. Franzon, “Demystifying 3-D ICs: The pros and cons of going vertical,” IEEE Design & Test of Computers, vol. 22, no. 6, pp. 498-510, Nov. 2005.
[1.3] N. H. Khan, S. M. Alam, and S. Hassoun, “Power delivery design for 3-D ICs using different TSV technologies,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 19, no. 4, pp. 647-658, April 2011.
[1.4] P. Jain, D. Jiao, X. Wang, and C. H. Kim, “Measurement, analysis and improvement of supply noise in 3D ICs,” accepted by IEEE VLSI Circuits Symposium, 2011.
[1.5] X. Meng and R. Saleh, “An Improved Active Decoupling Capacitor for Hot-Spot Supply Noise Reduction in ASIC Designs,” IEEE Journal of Solid-State Circuits, vol. 44, no. 2, pp. 584-593, Feb. 2009.
Chapter 2
[2.1] E. Beyne, “The rise of the 3rd dimension for system integration,” IEEE International Interconnect Technology Conference, 2006, pp. 1-5.
[2.2] R. R. Tummala, V. Sundaram, R. Chatterjee, P. M. Raj, N. Kumbhat, V. Sukumaran, V. Sridharan, A. Choudury, Q. Chen, and T. Bandyopadhyay, “Trend from ICs to 3D ICs to 3D Systems,” in Proc. IEEE Custom Integrated Circuits Conference, pp. 439-444, Sept. 2009.
[2.3] Yole Development. (2008). 3DIC & TSV Report [Online].
http://www.yole.fr/pagesan/products/report_sample/3dic.pdf
[2.4] M. S. Bakir, C. King, D. Sekar, H. Thacker, B. Dan, G. Huang, A. Naeemi, and J. D. Meindl, “3D heterogeneous integrated systems: liquid cooling, power delivery, and implementation,” in Proc. IEEE Custom Integrated Circuits Conference, 2008, pp. 663-670.
[2.5] T. Whipple, T. Kukal, K. Felton, and V. Gerousis, “IC-package co-design and analysis for 3D-IC designs,” IEEE International Conference on 3D System Integration, 2009, pp. 1-6.
[2.6] V. F. Pavlidis and E. G. Friedman, “Interconnect-based design methodologies for three-dimensional integrated circuits,” in Proceedings of the IEEE, vol. 97, no. 1, pp. 123-140, Jan. 2009.
[2.7] M. Motoyoshi, “Through-silicon via (TSV),” in Proceedings of the IEEE, vol. 97, no. 1, pp. 43-48, Jan. 2009.
[2.8] M. Koyanagi, T. Fukushima, and T. Tanaka, “High-density through silicon vias for 3-D LSIs,” in Proceedings of the IEEE, vol. 97, no. 1, pp. 49-59, Jan. 2009.
[2.9] P. Marchal, B. Bougard, G. Katti, M. Stucchi, W. Dehaene, A. Papanikolaou, D. Verkest, B. Swinnen, and E. Beyne, “3-D technology assessment: path-finding the technology/design sweet-spot,” in Proceedings of the IEEE, vol. 97, no. 1, pp. 96-107, Jan. 2009.
[2.10] K. N. Chen and C. S. Tan, “Integration schemes and enabling technologies for three-dimensional integrated circuits,” IET Computers & Digital Techniques, vol. 5, no. 3, pp. 160-168, May 2011.
[2.11] G. Van der Plas et al., “Design issues and considerations for low-cost 3-D TSV IC technology,” IEEE Journal of Solid-State Circuits, vol. 46, no. 1, pp. 293-307, Jan. 2011.
[2.12] J.-F. Li and C.-W. Wu, “Is 3D integration an opportunity or just a hype,” Asia and South Pacific Design Automation Conference (ASP-DAC), 2010, pp. 541-543.
[2.13] T. Zhang, R. Micheloni, G. Zhang, Z.-R. Huang, and J. J. Lu, “3-D data storage, power delivery, and RF/optical transceiver-case studies of 3-D integration from system design perspectives,” in Proceedings of the IEEE, vol. 97, no. 1, pp. 161-174, Jan. 2009.
[2.14] Semiconductor Industry Association. International technology roadmap for semiconductors (ITRS), 2004. http://public.itrs.net/.
[2.15] J. Sun, J. Lu, D. Giuliano, T. P. Chow, and R. J. Gutmann, “3D power delivery for microprocessors and high-performance ASICs,” in IEEE Applied Power Electronics Conference (APEC), Feb. 2007, pp. 127-133.
[2.16] N. Na, T. Budell, C. Chiu, E. Tremble, and I. Wernple, “The effects of on-chip and package decoupling capacitors and an efficient ASIC decoupling methodology,” in Proceedings of the IEEE Electronic Components and Technology Conference, 2007, pp. 556-567.
[2.17] E. Hailu, D. Boerstler, K. Miki, J. Qi, M. Wang, and M. Riley, “A circuit for