118-AddCmp-Asilomar







35th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, November 4-7, 2001

Application of Logical Effort Techniques for Speed Optimization and Analysis of Representative Adders*

Hoang Dao, Vojin G. Oklobdzija
ACSEL Lab, Department of Electrical and Computer Engineering
University of California, Davis 95616
(hqdao,vojin)@ece.ucdavis.edu

* Supported under SRC Contract No. 2001-HJ-931

Abstract

This paper presents the transistor-level analysis of contemporary 64-bit adders. The logical effort technique was applied to provide a more descriptive presentation of the delay and circuit architecture. It also enabled optimization of gate sizes for optimal performance. The selected adders were the dynamic carry-lookahead adder (DCLA), static carry-select adder (SCSA), dynamic Kogge-Stone adder (DKSA) and Ling/conditional-sum adder (DLCNSA). The results matched well with simulation using 0.18µm, 1.8V CMOS. Adders with fewer levels in the critical path showed superior performance. In particular, for dynamic adders, a 0.6-FO4 per-gate delay improvement was observed.

1. Introduction

Adder delay is critical in the design of high-performance processors. Unfortunately, it is normally presented in terms of gate delays or simulation results. The former format is not efficient because delay depends on gate types and the number of inputs; the latter does not help to relate the result to the adder architecture and is difficult to compare. In our analysis, the logical effort method is used to express the delay.

The logical effort (LE) analysis [1] is efficient in delay estimation and connects delay to adder architecture. It models the gate delay using the gate characteristics and its loading, and compares the gate delay to τ, the delay of a parasitic-free fanout-1 (FO1) inverter. This delay is normally known by designers for a given technology, so the delay estimation by logical effort can be fairly accurate. Furthermore, it also accounts for the effect of circuit architecture on delay, via branching and gate loading.

Four contemporary adders were chosen for analysis: dynamic carry-lookahead adder, modified static carry-select adder [2], dynamic Kogge-Stone adder [3] and dynamic Ling/conditional-sum adder [4].

2. Optimization conditions

All adders were optimized under the following conditions: maximum input size of 20µm, maximal allowable transistor size of 20µm and an identical load of a 30µm-equivalent inverter. These conditions were set to get reasonable transistor sizes for layout and an acceptable load to the adder.

The adders were optimized according to the critical paths, estimated from the adder architecture. Delay effort in other paths was derived from the critical one. The optimization process was repeated until all transistor sizes converged.

The wiring capacitance was included and estimated from the width of the most congested cell. It was measured from the preliminary layout of the cell. This width was kept minimal to reduce the impact of the wiring capacitance.

3. Logical effort analysis of adders

The SCSA had 9 gates, all static (Figs. 2-3). The group-2 generate/propagate structure was chosen. Unlike [2], the complementary propagates were used to avoid extra inverter delays in the critical path. Both critical paths were to the sum MSB, via either the generate path from the operand LSB or the propagate path from the operand 35th LSB. An extra inverter was added in both paths during the optimization to reduce the gate size and enable less optimal delay effort. The per-gate effort delay was 3.7τ.

The DCLA had 14 gates: 7 dynamic and 7 high-skew (Figs. 4-5). (High-skew gates have faster pull-up transitions than pull-down ones; the reverse is true for low-skew gates.) The Group-4 propagate and generate scheme was used.

The DKSA had 6 gates: 3 dynamic and 3 high-skew (Figs. 6-7). It used a redundant scheme for Group-4 propagates/generates.

The DLCNSA had 9 gates: 4 dynamic, 4 high-skew and 1 low-skew (Figs. 8-9). It combined the Ling pseudo-carry/propagate to generate long carries and the conditional sum for local carries. The critical path was the path through the long carry to the sum MSB.

For the DCLA and the DKSA, two critical-path cases arose. In one adder the critical path was from the operand LSB via the generate path to the sum MSB. In the other there were two possible critical paths to the sum MSB: the generate path via the operand LSB and the propagate path via the operand 6th LSB; logical effort analysis was done assuming both paths were equal. The per-gate effort delays of the dynamic adders were all below 3τ: between 1τ and 2τ for one adder and between 2τ and 3τ for the other two.

Table 1 summarized the result of the logical effort analysis. The delays were in terms of the parasitic-free FO1 delay τ and the FO4 delay of an inverter. In addition, the calculated delays did not clearly show a linear relationship to the number of gates in the critical path.

Table 1. Logical effort delay

64-b Adder       # Gates                        LE (tau)   # FO4
Dynamic Kogge    6 (3 dyn, 3 hi-sk)             30.1       6.2
Dynamic Ling     9 (4 dyn, 4 hi-sk, 1 lo-sk)    43.9       9.0
Dynamic CLA      14 (7 dyn, 7 hi-sk)            54.3       11.1
Static CSA       9 (all static)                 60.7       12.4
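The per-gate effort delay quoted for each adder follows from the standard logical effort relations of [1]. The sketch below is illustrative only and is not the authors' sizing flow: a gate delay in units of τ is modelled as d = g·h + p, and a path of N stages has its minimum delay when every stage bears the same effort F^(1/N), where F is the product of the logical efforts, branching efforts and the path electrical effort. The function names and the three-stage example path are hypothetical; the NAND2 logical effort of 4/3 and the parasitic delays of 1 and 2 are textbook static-CMOS values assumed here, not figures from this paper.

```python
import math


def stage_delay(g, h, p):
    """Delay of one gate in units of tau: effort delay g*h plus parasitic delay p."""
    return g * h + p


def min_path_delay(logical_efforts, branching, electrical_effort, parasitics):
    """Return (minimum path delay in tau, per-gate effort delay in tau).

    The minimum is reached when all stages carry the same effort,
    equal to the N-th root of the path effort F = G * B * H.
    """
    n = len(logical_efforts)
    path_effort = (math.prod(logical_efforts)
                   * math.prod(branching)
                   * electrical_effort)
    stage_effort = path_effort ** (1.0 / n)
    return n * stage_effort + sum(parasitics), stage_effort


# Hypothetical path: inverter -> 2-input NAND (branching of 2) -> inverter,
# driving a load 8 times the input capacitance. All values are assumptions.
g = [1.0, 4.0 / 3.0, 1.0]
b = [1.0, 2.0, 1.0]
p = [1.0, 2.0, 1.0]
delay, f = min_path_delay(g, b, electrical_effort=8.0, parasitics=p)
print(f"per-gate effort delay: {f:.2f} tau, path delay: {delay:.2f} tau")
```

Calling `stage_delay(1.0, 4.0, p_inv)` gives the FO4 inverter delay in units of τ, the second delay unit used in this paper; the inverter parasitic `p_inv` is process-dependent and is taken as 1 above only for illustration.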
4. Results

The results were confirmed using HSPICE simulation for the 0.18µm, 1.8V CMOS technology at room temperature. The worst-case delay of the critical path was measured. The result was presented in Table 2 and, normalized to the delay of the DKSA, in Fig. 1. The HSPICE results were consistent with the logical effort analysis.

Table 2. Comparison with the simulation result

64-b Adder   LE (ps)   LE (norm)   SPICE (ps)   SPICE (norm)   Err (%)
D-KSA        482       1.00        581          1.00           17.0
D-LCNSA      702       1.46        742          1.28           5.4
D-CLA        868       1.80        1064         1.83           18.4
S-CSA        971       2.01        1059         1.82           8.3

Fig. 1. Normalized delay: LE vs. HSPICE. (Delay normalized to DKSA against the number of gates; series LE (norm) and SPICE (norm) for D-KSA, D-LCNSA, D-CLA and S-CSA.)
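The normalized columns and the error column of Table 2 can be re-derived from the picosecond values. The sketch below is a consistency check, not part of the original analysis. It assumes the error is the shortfall of the logical effort estimate relative to the HSPICE delay, a definition that is inferred here because it reproduces the tabulated percentages. The gate counts are those given for each adder in Section 3; the variable and function names are hypothetical.

```python
# Delays in ps from Table 2: (logical effort estimate, HSPICE result)
delays_ps = {
    "D-KSA": (482, 581),
    "D-LCNSA": (702, 742),
    "D-CLA": (868, 1064),
    "S-CSA": (971, 1059),
}
# Number of gates in the critical path of each adder
gates = {"D-KSA": 6, "D-LCNSA": 9, "D-CLA": 14, "S-CSA": 9}


def normalize(delays, ref="D-KSA"):
    """Normalize LE and HSPICE delays to those of the reference adder."""
    le_ref, spice_ref = delays[ref]
    return {name: (round(le / le_ref, 2), round(spice / spice_ref, 2))
            for name, (le, spice) in delays.items()}


def error_percent(delays):
    """Assumed definition: (HSPICE - LE) / HSPICE, in percent."""
    return {name: round(100.0 * (spice - le) / spice, 1)
            for name, (le, spice) in delays.items()}


norm = normalize(delays_ps)
err = error_percent(delays_ps)

# Normalized LE delay added per extra gate between the 6-gate DKSA
# and the 14-gate DCLA
per_gate = (norm["D-CLA"][0] - norm["D-KSA"][0]) / (gates["D-CLA"] - gates["D-KSA"])

for name in delays_ps:
    print(name, norm[name], err[name])
print(f"normalized delay per extra gate (DKSA to DCLA): {per_gate:.2f}")
```

Under these assumptions the per-gate increment between the DKSA and the DCLA comes out at about 0.10 of the DKSA delay, in line with the per-gate difference the paper reports for dynamic adders.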
5. Conclusion

The comparison of adder performance was presented. The circuit optimization was done using the logical effort technique. The adder performance was dependent on, but not proportional to, the number of gates in the critical path. The adders with fewer stages yielded smaller delay. In addition, for dynamic adders, the delay difference was close to 0.10x per gate, which corresponded to 0.60 FO4 or 2.9τ. The static adder also appeared to be 1.5x worse in delay.

References:

[1] I. Sutherland, B. Sproull, D. Harris, “Logical Effort: Designing Fast CMOS Circuits,” Morgan Kaufmann Publisher, c1999.
[2] A. Farooqui, V. G. Oklobdzija, F. Chehrazi, “Multiplexer-Based Adder for Media Signal Processing,” International Symposium on VLSI Technology, Systems, and Application, pp. 100-103, 1999.
[3] J. Park, H. C. Ngo, J. A. Silberman, S. H. Dhong, “470ps 64-Bit Parallel Binary Adder,” Symposium on VLSI Circuits Digest of Technical Papers, pp. 192-193, 2000.
[4] S. Naffziger, “A Sub-Nanosecond 0.5µm 64b Adder Design,” ISSCC Digest of Technical Papers, pp. 362-363, 1996.
[5] A. Weinberger, J. L. Smith, “A Logic for High-Speed Addition,” National Bureau of Standards Circulars 591, pp. 3-12, 1958.
[6] B. D. Lee and V. G. Oklobdzija, “Improved CLA Scheme With Optimized Delay,” Journal of VLSI Signal Processing, Vol. 3, No. 4, pp. 265-274, 1991.
[7] X. Xu, V. G. Oklobdzija, “Application of Logical Effort on Design of Arithmetic Blocks in VLSI CMOS Technology,” 35th Asilomar Conference on Signals, Systems, and Computers, November 2001.

Figs. 2-3. Carry-select adder: diagram and circuit.
Figs. 4-5. Carry-lookahead adder: diagram and circuits.
Figs. 6-7. Kogge-Stone adder: diagram and circuits.
Figs. 8-9. Ling/conditional-sum adder: diagram and circuits. (Diagram stages: operands; 1-b kill, generate and propagate; 4-b pseudo-carry and propagate; quadrant pseudo-carry and propagate; dual local carry and long carry; final sum; result.)