PD Introduction

What is PD?
• Mechanical activity towards preparation of artwork for mask preparation
• Electrical activity towards making the design meet its constraints at silicon level

Knowledge requirements for PD
• Semiconductor knowledge (minimal)
• SoC architecture (the more the better)
• SoC design (knowledge of constraints)
• Technology library (mandatory, don't play with strange things)
• Methodology (good understanding of why this way and why not the other way)
• EDA tool (mandatory to work hands-on)
• Scripting (to reduce pain)
• System design (to understand the application requirement)

Full-custom vs. semi-custom
– Integrity: full custom
– Area: full custom
– Power: full custom
– Performance: full custom
– Yield: full custom
– Cost: full custom for large volume
– Development cycle: semi custom
– Development cost: semi custom
– Reusability: semi custom
Myth: all digital designs are semi custom and all analog designs are full custom.

Figure: full custom inverter
Figure: abstract view of the inverter used for semi custom

PD Project cycle
ASIC / SoC requirement → feasibility → plan → implementation → verification → tapeout → fabrication → package assembly → tester characterization → validation → respin (if required)

ASIC / SoC PD requirements
At the top level:
• Size
• Power
• Speed
• Cost (equally important)
At lower levels: many more, which contribute to the above parameters

PD Feasibility analysis
At an introductory level:
• Feasibility analysis is needed to avoid surprises halfway into the design
• It allows better planning and prediction of time, resources and success rate upfront
Scope of feasibility analysis:
• Can the constraints be met?
• Can the chip be designed?
• Will the chip work?
• What are the risks involved?
Will revisit feasibility analysis in detail.

PD Plan
• Project management comprises
– Requirement to meet
– Time to complete
– Cost of the project, including resources and tools
• Requirements can be:
– RTL to GDS / netlist to GDS
– DFT: BS / internal scan / MBIST / LBIST
– Macros: PLLs, memories, special IOs & cells
– Methodology: technology related, SI, EM/IR
– Design based: split power, multi Vt
– Package: BGA / QFP / flip chip, std / custom
Will revisit planning in detail.

PD Implementation
Full chip / subchip / top level / chip closure?
(die-size) (die-shape) (timing) (IR) (EM) (powerplan) (floorplan) (IO) (CTS) (routing) (SI) (rail) (DRC/LVS) (STA) (dyn. sim) (XRC) (sign-off) (tapeout) (metal slot) (metal fill) (PO/OD fill) (GDSII) (seal & scribe) (constraints analysis)
What more? Where to start and where to end?

PD verification
• What to verify?
– Will the chip work?
– Will the chip perform?
– Will the chip be reliable?
– Can the chip be manufactured?
– Will all the required functions be met?
– Will the chip be testable?

PD design closure
• What is design closure?
– Timing closure
– Power closure
– DRC/LVS cleanup
– SI closure

PD chip closure
What is chip closure?
– Dummy metal fill
– Poly / oxide fill
– Wirebond check
– Seal ring and seal scribe
– Stress release pattern
– Scribe and scribe lane
Ready to upload the GDSII to the fab.

PD tools
What are PD tools for?
– Implementation
– Analysis
– Verification
Major tool vendors
– Cadence
– Synopsys
– Mentor Graphics
– Magma
Tools are task simplifiers. They are not problem solvers; they can themselves be a source of problems.

PD Methodology
• What is methodology?
• What is flow?
• Need for flow flush

PD Requirement analysis

Customer input (general input)
• Name of the design
• Application domain
• New design / respin design
• If it is a respin design:
  a. Reasons for failure of the earlier design
  b. Nature of enhancements in the respin design
• Synthesis tool used
• Design tool and sign-off tool to be used:
  a. Physical design
  b. Physical verification
  c. STA
  d. SI
  e. Power analysis
• Targeted package
• Name of the foundry
• Data input for the design (RTL / netlist / GDSII / ECO):
  a. RTL
  b. Netlist with DFT
  c. Netlist without DFT
  d. Optimization
  f. IO list and pad order

Customer input (general input) …
• Formal verification reports
• Provision of test setup for physical design
• Time frame for the design
• Flat or hierarchical design
• On-site / off-site design
• Sign-off criteria
• Technical point of contact (name / telephone)

Customer input (design details)
• Explanation of the design, including architecture
• Estimated no. of logic gates in the design
• Total gate count of the design
• No. of IO pins in the design
• Max I/P frequency of the design
• Max O/P frequency of the design
• Max internal frequency of the design
• No. of clocks in the design
• High fan-out nets that require buffer tree synthesis
• Gated clock used or not
• Synchronous reset / asynchronous reset used in the design
• Any latch used in the design (specify)
• Wirebond or flip chip
• Timing report of synthesis
• Design constraint format
• Tool used for dumping the SDC
• Usage of through constraints (-through) in the SDC
• Need for recovery / removal

Customer input (library details)
• Name of the library vendor
• Process technology to be used (e.g. 0.18u, 0.13u)
• Target library to be used (e.g. LV / generic / multi Vt library)
• Core and IO voltage to be used
• No. of metal layers to be used and routing guidelines
• Type of pads to be used

Customer input (macro details)
• On-chip memory requirements (provide details):
  a. SP RAMs (size, mux factor, no. of instances)
  b. DP RAMs (size, mux factor, no. of instances)
  c. Register files (size, mux factor, no. of instances)
• Analog IP block used (provide details)
• LEF, DEF, .lib, .db, constraints and GDSII for hard macros
• Soft macros list
• Intended package

Customer input (floorplan details)
• Die size requirement
• Die aspect ratio
• Floorplan details with pad order and location
• Inline / staggered pad or flip chip design
• Floorplan guidelines (any specific requirements)
• Data flow diagram
• Intended package

Customer input (power plan details)
• SSOs in the IOs
• Estimated core power
• Estimated IO power
• IO types and voltages
• Need for multi Vt
• Anticipated total power dissipation
• Need for multiple power domains
• Need for clock gating

Customer input (clock tree synthesis details)
• Propagated clock details
• Clock domains with multicycle paths
• Budgeted clock duty cycle variation
• Budgeted clock jitter margin
• Clock tree diagram for the design
• Inter and intra clock phase matching constraints
• Clock domains with half cycle paths
• Budgeted clock skew in the design
• No. of clocks in the design
• Required setup margin
• Target insertion delay
• Clock gating details
• Clock details
• Derived clocks details
• Clock frequencies

DFT input
• JTAG pin list (muxed / non-muxed)
• No. of scan chains
• Length of each scan chain
• No. of MBIST controllers
• MBIST controller details (groupings)
• Max scan clock frequency
• Estimated switching activity during scan test
• Clock gating details
• Test SDC
• Common functional and test mode paths with constraints
• Size of DFT logic:
  a. MBIST logic
  b. LBIST logic
  c. JTAG controller
• Scan chain report with pipeline logic for scan tracing
• Required setup margin
• Required hold margin

Technology input
• Runsets for the sign-off tool:
  a. DRC
  b. LVS
  c. Antenna
  d. Metal slotting
  e. Metal filling
  f. Poly & oxide filling
  g. Wire bond DRC
  h. Sample GDSII of seal ring with CSR
• Standard cell library: full view, Synopsys Milkyway
• Design rules for the technology / process
• IO pad library
• Library for special IO pads
• Bond pads
• De-cap and endcap cells
• Macros
• Filler cells and pad fillers

Points to be gathered from tech files
From the std cell library:
  a. Number of library entities
  b. Clock buffers and inverters
  c. Rise / fall time match in the clock components
  d. Gate density for the technology
  e. DC and AC characteristics of the standard cells
  f. Power rating of the standard cells
  g. Noise margins
  h. Characteristics at min-max conditions
  i. Intrinsic delay statistics of non-sequential components
  j. Clock-to-Q delays for the FFs
  k. Max fanout, max trans and max cap parameters
From the IO pad library:
  a. Types of IO pad entities
  b. Operating voltage
  c. Operating current
  d. Dimension
  e. Pitch of the pads
  f. Pad delay characteristics
From the design rules:
  a. Current density of different layers
  b. Sheet resistance
  c. Electrical rules
  d. Poly and oxide related rules
  f. Dummy metal filling
  g. Metal slotting
  h. Wire bond rules
  i. Scribe and seal ring

Memory input
Sl. No | Memory instance name | Memory (type / size / mux factor) | X dimension | Y dimension | M Area (um2) | Bit cells | Power* | Clock 1 | Clock 2 | BIST group

Checklist for memories
• Technology library of the memory tallies with the std cell library for the design
• .lib, .db, .tf, .itf, .tlf, FRAM and GDSII libraries are available
• The dimensions mentioned are correct
• Power numbers specified are correct
• A document on the data flow among the memory blocks is made available for the purpose of floorplanning
• BIST logic for the memory logic is incorporated
• Refer to the memory data sheet for the layer details, power ring details, and the placement and routing blockages for the memory macros
• Check the foundry MT form for the required memory details and generate them upfront
• Check the maximum dimensions and the feasibility of fitting in the die area

Macro input
Sl. No | Macro instance name | Macro type | X dimension | Y dimension | Area (um2) | Logic gates | Power | No. of clocks | Clk. freq.
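The memory and macro tables above record X and Y dimensions and area, and the memory checklist asks for a check of maximum dimensions and feasibility of fitting in the die area. A minimal sketch of such a first-order check follows. The `Macro` record, the `fits_in_die` helper, the 50% macro-area budget and the sample numbers are all illustrative assumptions, not values from any foundry, library or EDA tool.

```python
from dataclasses import dataclass


@dataclass
class Macro:
    name: str    # hypothetical instance name
    x_um: float  # X dimension in um
    y_um: float  # Y dimension in um

    @property
    def area_um2(self) -> float:
        return self.x_um * self.y_um


def fits_in_die(macros, die_x_um, die_y_um, max_macro_fraction=0.5):
    """First-order feasibility check, not a placement.

    Returns (ok, problems). A macro is flagged if it cannot fit inside
    the die either as drawn or rotated by 90 degrees. The total macro
    area is flagged if it exceeds max_macro_fraction of the die area;
    the 0.5 default is an assumed budget, not a rule.
    """
    problems = []
    for m in macros:
        fits_as_drawn = m.x_um <= die_x_um and m.y_um <= die_y_um
        fits_rotated = m.y_um <= die_x_um and m.x_um <= die_y_um
        if not (fits_as_drawn or fits_rotated):
            problems.append(f"{m.name}: exceeds die dimensions")
    total = sum(m.area_um2 for m in macros)
    budget = max_macro_fraction * die_x_um * die_y_um
    if total > budget:
        problems.append(
            f"total macro area {total:.0f} um2 exceeds budget {budget:.0f} um2"
        )
    return (not problems, problems)


# Made-up example: the second macro is longer than either die edge.
macros = [Macro("ram_a", 400.0, 300.0), Macro("ram_b", 1200.0, 200.0)]
ok, problems = fits_in_die(macros, die_x_um=1000.0, die_y_um=1000.0)
```

A check like this only rules out clearly infeasible cases early; whether the macros can actually be placed with adequate channels and blockages still has to be settled during floorplanning.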
Checklist for macros
• Technology library of the macro tallies with the std cell library for the design
• .lib, .db, .tf, .itf, .tlf, FRAM and GDSII libraries are available
• The dimensions mentioned are correct
• Power numbers specified are correct
• A document on the data flow among the macro blocks is made available for the purpose of floorplanning
• Refer to the macro data sheet for the layer details, power ring details, and the placement and routing blockages for the macros
• Check the maximum dimensions and the feasibility of fitting in the die area
• In the layout viewer, open the FRAM view and check the connectivity pins
• In the GDS view, check the metal layers, blockages and the power plan used internally by the macro
• Check the clocks, clock constraints, and matching or latency requirements, if any, at the macro input
• Run the DRC, LVS and antenna runsets on the macros in stand-alone mode and check that they pass before going to the full design

IO input
Sl. No | Signal group | Pad instance name | Direction | SSO group | Toggle rate | Pad order | Pin / ball map

Checklist for IO pads
• Check that the IO pads used belong to the technology files used
• Check the pad size and the minimum pitch to be used for the pads
• Check for special pads, such as analog pads, which normally may not be part of the free library
• Check the requirement for power cut diodes and power-on control cells in the design
• Check the type and availability of the bond pads to be used in the design (staggered / in-line)
• Check the pre-driver and post-driver power requirements
• Check the power-on sequence for the core and IO powers
• Check the availability of proper power pads for the IO and core power
• Analyze the current surges that could occur due to simultaneous switching of the SSO group signals
• Check the availability of pad fillers in the technology library for forming the pad ring

Clock input
Sl. No | Clock name | Frequency | Classification* | Insertion delay | Jitter | Duty cycle | Skew limit | Fanout | Max trans | Max cap | Use of clock edge | Sync. points | Inter domains

Checklist for clock
• Obtain a clock tree diagram indicating the different logic blocks driven by the clocks
• Obtain the number of sequential elements used in those blocks and estimate the clock tree size
• When the tree size is very large, there is a likelihood of clock duty cycle distortion as well as large clock path delay and clock skew
• Check that the design library is suitably edited, with clock tree components separated from std. cell components
• For critical clocks with a tight duty cycle requirement, make sure the rise and fall times of the clock buffers / inverters are matched
• Check for inter-clock domains and the corresponding false path declarations in the SDC
• Look for multicycle path (MCP) declarations in the SDC and the associated clock domains
• Check that the MCPs are declared for both setup and hold; the MCP for hold is one less than the MCP for setup
• Check whether recovery / removal timing closure is required for the clock domains and make a note of it for the CTS design flow
• Check whether preset / clear paths also need timing closure and set the flow accordingly
• Check the test / scan clocks and their SDCs
• Analyze the common paths for functional and test modes and the impact of the MCP declarations on them
• In derived clocks, check for the Q-to-D feedback path and the impact of hold time closure on phase distortion
• Check whether clock gating is used for any clock domain and set the CTS flow appropriately
• Check for propagated clocks in the design and set the flow for the respective CTS accordingly
• Check for async resets in the design

Feasibility Analysis

Technology information

Floorplanning

PD- Floorplanning
• FP is the critical part of PD
• A high quality FP ensures accurate circuit timing and performance
• A poor FP results in timing failure, routing congestion, higher power, larger area, large IR drop and SI issues

PD- Floor plan …
• Floorplanning involves decisions on:
– pin / pad location
– hard macro placement
– placement and routing blockages
– location and area of the soft macros and their pin locations
– number of power pads and their locations

Floor plan tips
• While fixing the location of a pin or pad, always consider the surrounding environment with which the block or chip interacts. This avoids routing congestion and also benefits circuit timing
• Provide a sufficient number of power / ground pads on each side of the chip for effective power distribution. In deciding the number of power / ground pads, the power report and the IR drop in the design should also be considered

Floor plan tips …
• Flyline analysis should be done while placing the macros
• Orientation of the macros is an important part of floorplanning

Floor plan tips …
• Avoid spreading standard cells over several areas and creating small placement traps, with many pockets and isolated regions between the macros that can trap a standard cell and limit routing access
• A physical design engineer must focus on having a homogeneous standard cell area with aligned macros

Floor plan tips …
• Create standard cell placement blockages at the corners of the macros, because these areas are more prone to routing congestion
• Also create standard cell placement blockages in long, thin channels between macros
• Avoid uneven routing resources in the design by using a proper aspect ratio (width / height) for the chip
• For designs that have horizontal overflow, to increase utilization, cell row separation is increased, which in turn helps increase horizontal routing resources

Floor plan tips …
• In hierarchical design, cluster-based implementation makes it possible to place the standard cells of a given module in a predefined region
• Analog blocks are more susceptible to noise, and signal routes going over such blocks cause signal integrity issues; routing blockages on all layers are to be defined for analog blocks
• Time and effort put into floorplanning save iterations and make the design cycle faster

More FP tips …
• At any level, avoid routing that goes against the preferred routing direction for that level
• When creating metal rings around cores and blocks, remember to allow room for routing access to pins

More FP tips …
• When placing blocks, avoid creating four-way intersections in top-level channels
• T intersections create much less congestion. This consideration can be critical to leaving the necessary space for routing channels, depending on how much over-the-cell routing is possible
• Using flylines can help determine optimized placement and orientation

More FP tips
• For placing block-level pins:
– First determine the correct layer for the pins
– Spread out the pins to reduce congestion
– Avoid placing pins in corners, where routing access is limited
– Use multiple pin layers for less congestion
– Never place cells within the perimeter of hard macros
– To keep from blocking access to signal pins, avoid placing cells under power straps unless the straps are on metal layers higher than metal2
– Use density constraints or placement-blockage arrays to reduce congestion
– Avoid creating any blockage that increases congestion
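The tips above rely on flyline analysis to choose macro placement and orientation. One simple way to reason about this is to sum the Manhattan length of each flyline from a macro pin to the point it connects to, and compare candidate orientations. The sketch below does that under simplifying assumptions: only two orientations are modelled, the names `R0` and `MY`, the helper functions and the coordinates are illustrative, and real tools consider congestion and timing as well as length.

```python
def pin_position(origin, size, offset, orientation):
    """Absolute pin position for a macro whose lower-left corner is at origin.

    Only two orientations are modelled here: "R0" (as drawn) and "MY"
    (mirrored about the vertical axis, so the x offset flips).
    """
    ox, oy = origin
    width, _height = size
    px, py = offset
    if orientation == "R0":
        return (ox + px, oy + py)
    if orientation == "MY":
        return (ox + width - px, oy + py)
    raise ValueError(f"unsupported orientation: {orientation}")


def flyline_length(origin, size, connections, orientation):
    """Sum of Manhattan distances from each macro pin to its target point."""
    total = 0.0
    for offset, target in connections:
        x, y = pin_position(origin, size, offset, orientation)
        total += abs(x - target[0]) + abs(y - target[1])
    return total


def best_orientation(origin, size, connections, candidates=("R0", "MY")):
    """Pick the candidate orientation with the shortest total flyline length."""
    return min(
        candidates,
        key=lambda o: flyline_length(origin, size, connections, o),
    )


# Made-up example: a 100 x 50 macro at (0, 0) with two pins on its left
# edge, both connecting to logic located to the right of the macro.
connections = [
    ((0.0, 10.0), (300.0, 10.0)),  # (pin offset in macro, target point)
    ((0.0, 40.0), (300.0, 40.0)),
]
r0_len = flyline_length((0.0, 0.0), (100.0, 50.0), connections, "R0")
my_len = flyline_length((0.0, 0.0), (100.0, 50.0), connections, "MY")
choice = best_orientation((0.0, 0.0), (100.0, 50.0), connections)
```

In this example mirroring the macro turns its pins towards the logic they drive, so the mirrored orientation gives the shorter flylines.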
More FP tips …
– Supply power and ground to areas where they might be useful for placing buffers or repeaters during post-placement timing-convergence optimization, and for top-level buffers
– Consider grouping multiple instances of any logical hierarchical element to form one hierarchical physical element
– Look for logical modules in the RTL design representation that can be grouped into hierarchical blocks
– Also group small blocks into one larger block
– It is easier to floorplan with same-sized blocks. Try to work with midsized blocks. A design partitioned into six to 12 roughly equal-sized blocks is a reasonable candidate for floorplanning
– Depending on the package design, you usually want to start the floorplan with I/Os at the periphery

More tips on FP …
• Consider parts of the design that are not typical standard cells:
– memories
– analog circuitry (PLLs)
– logic that works with a double-speed clock
– blocks that require a different voltage
– exceptionally large blocks
– unusual design-specific instances (flash)
• Place these elements first to ensure that their special needs are accommodated

More tips on FP …
• If two or more large blocks or other features make a reasonable floorplan impossible, you may have to increase the die size or rearrange the I/Os
• If any of the large blocks is soft IP, consider repartitioning that block into smaller pieces
• Arrange the rest of the blocks in the remaining space based on their I/Os and power consumption
• Avoid placing blocks that consume a lot of power near the center
• For average libraries, the usage is around 70%
• A high percentage of registers or hard IP increases the percentage
• Large numbers of multiplexers or other small, pin-dense cells decrease the percentage
• Run an initial synthesis to find out how big the blocks are

POWERPLAN

Power planning

Power plan guidelines
• Core power
– Decide on the butting of the std. cell rails based on the availability of routing resources
– The core ring and mesh / strap widths and separation are planned based on the power estimate and the metal layer chosen for the power network
– Identify the high power macros and high speed blocks in the design
– Enhance the power supply for the high power circuits by creating an additional ring around them
– Do a pre-placement power analysis and adjust the mesh widths and separation, or add meshes, wherever larger IR drop is anticipated
– Decide whether placement blockage is necessary below the power network

Power-plan guidelines …
• Place filler cells before cell placement for the rails to get formed correctly
• Do a power-only DRC after completing the power plan, before doing placement
• Do metal slotting for the thick power network prior to placement
• Plan for power cut diodes wherever isolation is required between two power domains (e.g. analog and digital)
• Plan for power-on sequence cells as required in the design

Power-plan guidelines
• Over-design of the power network results in a suboptimal die size
• Placement blockages under the power mesh result in congestion
• Select higher metal layers, which have higher current density, for the power network
• Higher metal layers used for power also act as a heat sink
• If the power network is inadequate, IR and EM violations can be expected

Powerplan guidelines …
• Fill any open spaces with power-mesh metal
• Make sure the extra metal does not push signal wires closer together, which would increase capacitance, power consumption and signal-integrity problems
• Floorplanning and power planning constitute an integrated process

Power planning guidelines …
• If possible, metal width should be limited to avoid the need for metal slotting
• Power and ground rings should be created around any hard macro to enable orientation independence and eliminate the need for the chip's power structure to conform to the macro's power structure
• Once the power rings have been established, power and ground must be routed to the standard cell rows
• The lowest horizontal metal layer should be used for these additional rails
• Insert filler cells temporarily to get a complete grid. After insertion of the rails, the filler cells are removed
• The rail spacing is consistent with the standard cell height, but the designer must specify the rail width
• Straps and trunks distribute power across the chip and are the most important means of addressing specific IR drop issues
• Designers must determine the appropriate spacing, width and layer of these straps and trunks
• It is better to use many thin routes rather than fewer wide routes, especially in the lowest metal layers, to improve overall routability

Placement

PD Placement flows

PD placement

Placement guidelines
• To meet tight design specifications under schedule and design resource constraints, the decision whether to implement a specific block in a semi-custom flow or an ASIC flow has to be made at the early definition stages
• Each datapath is first synthesized in a standard ASIC flow to check timing and power feasibility
• Only where the results do not meet the design target is it necessary to implement the block in the semi-custom design flow

Placement …
• It is easier to analyze and work at block level for complex designs, even if a hierarchical flow is not required
• It helps in identifying issues and bottlenecks at the block level
• It eases the design by allowing constraint relaxation where margins are available
• Flat implementation gives better results than hierarchical
• For block-level analysis, block-level constraints need to be generated separately

Placement …
• Even for flat designs, look at the RTL hierarchy for grouping datapaths
• Where constraints are stringent, consider creating cluster groups and regions for their placement
• For boundary scan cells, create space close to the IO pads for their placement. This helps reduce routing congestion
• For high fanout inputs, also consider buffering at the input pad

Placement …
• Do a zero-RC analysis upfront to check the consistency of the timing results with the synthesis results
• Perform a scan trace and compare it with the DFT report to check the consistency of the scan chain report
• If scan tracing gets stuck, get it resolved with help from the DFT team
• Detach the scan chain even before the pre-placement stage
• With the pre-placement results, analyze the trans, cap and fanout numbers and set them optimally for the flow
• Consider grouping the MBIST controller logic close to the corresponding memory macros
• During in-place optimization runs, study the convergence with respect to the optimization options
• Suitably order the sequence of optimization options and the number of iterations required. This yields better results and reduces run time
• Mark the optimized placements as don't touch when working on subsequent optimizations
• It is good practice to run a DRC/LVS check after each optimization

Placement …
• Set the setup margins to a small positive value (a few ps). This takes care of CTS degradation (non-ideal clock)
• Alternatively, apply some clock uncertainty to the clocks to account for CTS needs
• Enabling hold timing checks during placement is a waste of time
• Check the placement against both functional and DFT constraints
• Choose an appropriate report summary so that report generation time and report size are minimized
• Try to achieve timing closure at the placement stage; legal placement of the cells happens during placement
• Non-convergence at the placement stage calls for a floorplan change
• No significant timing improvement is expected in the subsequent stages
• Too large a performance shortfall may call for a re-look at the design or consideration of a better library

PD Journey
• Fanout-based delay modeling is no longer valid
• Layout-based delay modeling is needed
• The latest trend is physical virtual prototyping

Power management

Multi-power domain design

Power reduction

Power optimized library
• The VIP PowerSaver library includes cells specifically optimized for high-performance, low-voltage operation, as well as the level shifters and isolation gates that allow the designer to create electrically independent power islands capable of operating at different voltage levels and frequencies. The current library contains over 700 cells for the TSMC CL013G (130 nm) process, characterized for operation at 0.8, 1.0 and 1.2 volts. Libraries for additional processes will become available over time.

Front-end and Back-end
The more front-end teams consider the constraints imposed by implementation-level physical effects, the fewer iterations are likely to be required to achieve closure.

Considerations for Die-Size
Placement optimization simultaneously adjusts block placement, aspect ratio, rotation and mirroring to achieve a realistic die size estimate.
A process shrink may exceed the maximum power density limits.
Average power dissipation is at least as important as maximum power dissipation for the overall competitiveness of a processor product.

Methodology evolution

Utilization Area
Note that the utilization ratio is the total cell area over the bounding-box area.
On the one hand, the layout area has to leave enough room for physical synthesis to buffer interconnects, size drivers and restructure logic, and for CTS and routing to route the design; not leaving enough room will cause either an impact on timing or even a failure to complete the route.
On the other hand, the layout area should be minimized to minimize the die size and thus the cost of the chip.
Therefore it is hard for the user to come up with the right utilization ratios.
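The definition above (total cell area over bounding-box area) can be turned into a small calculation. The following is a minimal sketch, assuming a rectangular bounding box and taking aspect ratio as width / height; the function names and the sample areas are illustrative, and the 0.7 target simply echoes the "around 70%" usage figure quoted in the floorplanning tips.

```python
import math


def utilization(cell_areas_um2, bbox_width_um, bbox_height_um):
    """Utilization ratio = total cell area / bounding-box area."""
    return sum(cell_areas_um2) / (bbox_width_um * bbox_height_um)


def bbox_for_target(total_cell_area_um2, target_utilization, aspect_ratio=1.0):
    """Bounding box (width, height) that yields the target utilization.

    aspect_ratio is width / height. This is only an area estimate; it
    says nothing about whether the design will route or meet timing.
    """
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    if aspect_ratio <= 0.0:
        raise ValueError("aspect_ratio must be positive")
    area = total_cell_area_um2 / target_utilization
    height = math.sqrt(area / aspect_ratio)
    width = aspect_ratio * height
    return width, height


# Made-up example: 700,000 um2 of cells in a 1000 um x 1000 um box.
ratio = utilization([300_000.0, 400_000.0], 1000.0, 1000.0)
width, height = bbox_for_target(700_000.0, target_utilization=0.7)
wide_w, wide_h = bbox_for_target(700_000.0, 0.7, aspect_ratio=2.0)
```

Lowering the target utilization in `bbox_for_target` grows the box and leaves more room for buffering, CTS and routing, while raising it shrinks the die; the trade-off described above is exactly the choice of that one number.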