This tutorial will discuss the various views that make up a standard-cell library and then illustrate how to use a set of Synopsys and Cadence ASIC tools to map an RTL design down to these standard cells and ultimately to silicon. The tutorial will discuss the key tools used for synthesis, place-and-route, simulation, and power analysis. This tutorial requires entering commands manually for each of the tools to enable students to gain a better understanding of the detailed steps involved in this process. The next tutorial will illustrate how this process can be automated to facilitate rapid design-space exploration. This tutorial assumes you have already completed the tutorials on Linux, Git, PyMTL3, and Verilog.
The following diagram illustrates the five primary tools we will be using in ECE 5745 along with a few smaller secondary tools. Notice that the ASIC tools all require various views from the standard-cell library.
We use Synopsys VCS to compile and run both 4-state RTL and gate-level simulations. These simulations help us build confidence in our design as we push it through the different stages of the flow. From these simulations, we also generate waveforms in .vcd (Value Change Dump) format, and we use vcd2saif to convert these waveforms into per-net average activity factors stored in .saif format. These activity factors will be used for power analysis. Gate-level simulation is an extremely valuable tool for ensuring that the tools did not optimize away something that affects the correctness of the design, and it also provides an avenue for obtaining a more accurate power analysis than RTL simulation. Although Static Timing Analysis (STA) is the better timing check because it analyzes all paths, gate-level simulation also serves as a backup check for setup and hold time violations (chip designers must be paranoid!).
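To make the idea of a per-net activity summary concrete, here is a minimal Python sketch. It is not how vcd2saif is implemented; it only assumes that a net's waveform is available as a sorted list of (time, value) changes, and it computes the time spent at 0, the time spent at 1, and the toggle count. The function name `net_activity` and the waveform data are hypothetical.

```python
def net_activity(changes, end_time):
    """Summarize one net's waveform (illustrative sketch only).

    changes:  list of (time, value) pairs sorted by time, value in {0, 1};
              the first entry gives the initial value.
    end_time: time at which the simulation window ends.
    Returns (t0, t1, tc): time at 0, time at 1, and toggle count.
    """
    t0 = t1 = tc = 0
    for i, (time, value) in enumerate(changes):
        # Each value holds until the next change (or the end of the window).
        next_time = changes[i + 1][0] if i + 1 < len(changes) else end_time
        if value == 0:
            t0 += next_time - time
        else:
            t1 += next_time - time
        # Count a toggle whenever the value differs from the previous one.
        if i > 0 and value != changes[i - 1][1]:
            tc += 1
    return t0, t1, tc


# Hypothetical waveform for a single net over a 100-time-unit window.
waveform = [(0, 0), (10, 1), (30, 0), (70, 1)]
t0, t1, tc = net_activity(waveform, end_time=100)
toggles_per_time_unit = tc / 100
print(t0, t1, tc, toggles_per_time_unit)
```

Dividing the toggle count by the length of the window (or by the number of clock cycles in it) gives an average activity factor for the net.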
We use Synopsys Design Compiler (DC) to synthesize our design, which means to transform the Verilog RTL model into a Verilog gate-level netlist where all of the gates are selected from the standard-cell library. We need to provide Synopsys DC with abstract logical and timing views of the standard-cell library in .db format. In addition to the Verilog gate-level netlist, Synopsys DC can also generate a .ddc file which contains information about the gate-level netlist and timing, and this .ddc file can be inspected using Synopsys Design Vision (DV).
We use Cadence Innovus to place-and-route our design, which means to place all of the gates in the gate-level netlist into rows on the chip and then to generate the metal wires that connect all of the gates together. We need to provide Cadence Innovus with the same abstract logical and timing views used in Synopsys DC, but we also need to provide Cadence Innovus with technology information in .lef and .captable format and abstract physical views of the standard-cell library also in .lef format. Cadence Innovus will generate an updated Verilog gate-level netlist, a .spef file which contains parasitic resistance/capacitance information about all nets in the design, and a .gds file which contains the final layout. The .gds file can be inspected using the open-source KLayout GDS viewer. Cadence Innovus also generates reports which can be used to accurately characterize area and timing.
We use Synopsys PrimeTime (PT) to perform power analysis of our design. We need to provide Synopsys PT with the same abstract logical, timing, and power views used in Synopsys DC and Cadence Innovus, but in addition we need to provide switching activity information for every net in the design (which comes from the .saif file), and capacitance information for every net in the design (which comes from the .spef file). Synopsys PT puts the switching activity, capacitance, clock frequency, and voltage together to estimate the power consumption of every net and thus every module in the design, and these estimates are captured in various reports.
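As a rough illustration of how these four quantities combine, the sketch below uses the textbook switching-power relation P = alpha * C * Vdd^2 * f, where alpha is taken to be the expected number of energy-drawing (0 to 1) transitions per clock cycle. This is only the switching component and is not a description of what Synopsys PT computes internally; internal and leakage power are ignored here. The net names, activity factors, capacitances, supply voltage, and clock frequency are all hypothetical values.

```python
def switching_power_watts(alpha, cap_farads, vdd_volts, freq_hz):
    """Textbook switching power for one net: alpha * C * Vdd^2 * f."""
    return alpha * cap_farads * vdd_volts ** 2 * freq_hz


# Hypothetical per-net data: name -> (activity factor, capacitance in farads).
nets = {
    "net_a": (0.10, 2.0e-15),
    "net_b": (0.50, 1.0e-15),
}

VDD_VOLTS = 1.1   # assumed supply voltage
FREQ_HZ = 1.0e9   # assumed clock frequency

# Estimate each net separately, then sum to get a module-level estimate.
per_net_watts = {
    name: switching_power_watts(alpha, cap, VDD_VOLTS, FREQ_HZ)
    for name, (alpha, cap) in nets.items()
}
total_watts = sum(per_net_watts.values())

for name, watts in per_net_watts.items():
    print(f"{name}: {watts * 1e6:.3f} uW")
print(f"total: {total_watts * 1e6:.3f} uW")
```

The sum over nets mirrors the statement above that per-net estimates roll up into per-module estimates.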
Extensive documentation is provided by Synopsys and Cadence for these ASIC tools. We have organized this documentation and made it available to you on the public course webpage. The username/password was distributed during lecture.
A standard-cell designer will use the PDK to implement the standard-cell library. A standard-cell library is a collection of combinational and sequential logic gates that adhere to a standardized set of logical, electrical, and physical policies. For example, all standard cells are usually the same height, include pins that align to a predetermined vertical and horizontal grid, include power/ground rails and nwells in predetermined locations, and support a predetermined number of drive strengths. A standard-cell designer will usually create a high-level behavioral specification (in Verilog), circuit schematics (in SPICE), and the actual layout (in .gds format) for each logic gate. The Synopsys and Cadence tools do not actually use these low-level implementations, since they are too detailed. Instead these tools use abstract views of the standard cells, which capture logical functionality, timing, geometry, and power usage at a much higher level.
Just like with a PDK, gaining access to a real standard-cell library is difficult. It requires gaining access to the PDK first, negotiating with a company which makes standard cells, and usually signing more non-disclosure agreements. In this course, we will be using the Nangate 45nm standard-cell library which is based on the open FreePDK45 PDK.
For students with a circuits background, there should be no surprises here, and for those students with less circuits background we will cover basic static CMOS gate design later in the course. Essentially, this schematic includes three NMOS transistors arranged in series in the pull-down network, and three PMOS transistors arranged in parallel in the pull-up network. The PMOS transistors are larger than the NMOS transistors (see W= parameter) because the mobility of holes is less than the mobility of electrons.
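For readers who prefer code to schematics, the series/parallel structure described above can be modeled at the switch level in a few lines of Python. This is only a logical sketch (transistors are treated as ideal switches, with no sizing or timing), and the function names are made up for illustration. It checks that exactly one of the two networks conducts for every input combination, which is what makes the gate a static CMOS NAND.

```python
from itertools import product


def pull_down_conducts(a1, a2, a3):
    # Three NMOS in series: a path to ground exists only if all inputs are 1.
    return bool(a1 and a2 and a3)


def pull_up_conducts(a1, a2, a3):
    # Three PMOS in parallel: a path to power exists if any input is 0.
    return (not a1) or (not a2) or (not a3)


def nand3_output(a1, a2, a3):
    up = pull_up_conducts(a1, a2, a3)
    down = pull_down_conducts(a1, a2, a3)
    # Static CMOS: the networks are complementary, so exactly one conducts.
    assert up != down
    return 1 if up else 0


# Enumerate all eight input combinations.
truth_table = {
    inputs: nand3_output(*inputs) for inputs in product((0, 1), repeat=3)
}
for inputs, output in truth_table.items():
    print(inputs, "->", output)
```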
Note that we are using the .lyp file which is a predefined layer color scheme that makes it easier to view GDS files. To view the 3-input NAND cell, find the NAND3X1 cell in the left-hand cell list, and then choose Display > Show as New Top from the menu. Here is a picture of the layout for this cell.
Diffusion is green, polysilicon is red, contacts are solid dark blue, metal 1 (M1) is blue, and the nwell is the large gray rectangle over the top half of the cell. All standard cells will be the same height and have the nwell in the same place. Notice the three NMOS transistors arranged in series in the pull-down network, and three PMOS transistors arranged in parallel in the pull-up network. The power rail is the horizontal strip of M1 at the top, and the ground rail is the horizontal strip of M1 at the bottom. All standard cells will have the power and ground rails in the same place so they will connect via abutment if these cells are arranged in a row. Although it is difficult to see, the three input pins and one output pin are labeled squares of M1, and these pins are arranged to be on a predetermined grid.
Note that the Verilog implementation of the 3-input NAND cell looks nothing like the Verilog we used in ECE 4750. This cell is implemented using Verilog primitive gates (e.g., not, and) and it includes a specify block which is used for advanced gate-level simulation with back-annotated delays.
This is just a small subset of the information included in the .lib file for this cell. We will talk more about the details of such .lib files later in the course, but you can see that the .lib file contains information about area, leakage power, capacitance of each input pin, logical functionality, and timing. Units for all data are provided at the top of the .lib file. In this snippet you can see that the area of the cell is 1.064 square microns and the leakage power is 18.1nW. The capacitance for the input pin A1 is 1.59fF, although there is additional data that captures how the capacitance changes depending on whether the input is rising or falling. The output pin ZN implements the logic equation !((A1 & A2) & A3) (i.e., a three-input NAND gate). Data within the .lib file is often represented using one- or two-dimensional lookup tables (i.e., a values table). You can see two such tables in the above snippet.
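To give a feel for how a two-dimensional lookup table can be read at a point that falls between its grid entries, here is a minimal Python sketch using bilinear interpolation. Bilinear interpolation is one common choice and is used here purely as an illustration; the exact method a given tool applies is not specified in this tutorial. The axis names, axis values, and table entries below are hypothetical and are not taken from the Nangate library.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Index i of the segment [axis[i], axis[i+1]] used for x (clamped)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lookup_2d(index_1, index_2, values, x1, x2):
    """Bilinear interpolation in a table where values[i][j] corresponds to
    (index_1[i], index_2[j]). Points outside the table are extrapolated
    linearly from the nearest segment."""
    i = _bracket(index_1, x1)
    j = _bracket(index_2, x2)
    # Fractional position of the query point within the bracketing segment.
    f1 = (x1 - index_1[i]) / (index_1[i + 1] - index_1[i])
    f2 = (x2 - index_2[j]) / (index_2[j + 1] - index_2[j])
    # Interpolate along the second axis, then along the first.
    low = values[i][j] * (1 - f2) + values[i][j + 1] * f2
    high = values[i + 1][j] * (1 - f2) + values[i + 1][j + 1] * f2
    return low * (1 - f1) + high * f1


# Hypothetical delay table: rows are input transition times, columns are
# output load capacitances. All numbers are made up for illustration.
input_transition = [0.0, 0.1, 0.2]
output_load = [1.0, 2.0, 4.0]
delay = [
    [0.010, 0.020, 0.040],
    [0.020, 0.030, 0.050],
    [0.030, 0.040, 0.060],
]

between_points = lookup_2d(input_transition, output_load, delay, 0.05, 1.5)
on_grid_point = lookup_2d(input_transition, output_load, delay, 0.1, 2.0)
print(between_points, on_grid_point)
```

A one-dimensional table can be handled the same way with a single axis and ordinary linear interpolation.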
This file contains information about the minimum dimensions of wires on M1 and the resistance of these wires. It also contains a table of wire capacitances with different rows for different wire widths and spacings. The ASIC tools can use this kind of technology information to optimize and analyze the design.
Finally, a standard-cell library will always include a databook, which is a document that describes the details of every cell in the library. Take a few minutes to browse through the Nangate standard-cell library databook located on the class Canvas page here:
Our goal in this tutorial is to generate layout for the sort unit from the PyMTL3 tutorial using the ASIC tools. As a reminder, the sort unit takes as input four integers and a valid bit and outputs those same four integers in increasing order with the valid bit. The sort unit is implemented using a three-stage pipelined, bitonic sorting network and the datapath is shown below.
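As a purely functional reference (plain Python, not the PyMTL3 RTL, with no pipeline registers or valid bit), a four-input sorting network built from min/max comparators can be sketched as follows. The specific comparator arrangement is an assumption: it is a standard three-stage, five-comparator network, and the actual datapath may wire its min/max units differently. The names `min_max` and `sort4` are hypothetical.

```python
from itertools import permutations


def min_max(a, b):
    """Return (smaller, larger), the behavior of one min/max comparator."""
    return (a, b) if a <= b else (b, a)


def sort4(in0, in1, in2, in3):
    """Sort four values with three stages of comparators (assumed wiring)."""
    # Stage 1: sort each adjacent pair.
    a0, a1 = min_max(in0, in1)
    a2, a3 = min_max(in2, in3)
    # Stage 2: compare across the pairs; b0 is the overall minimum and
    # b3 is the overall maximum after this stage.
    b0, b2 = min_max(a0, a2)
    b1, b3 = min_max(a1, a3)
    # Stage 3: order the two remaining middle values.
    c1, c2 = min_max(b1, b2)
    return [b0, c1, c2, b3]


# Check the network against Python's built-in sort for every input ordering.
all_sorted = all(
    sort4(*p) == sorted(p) for p in permutations((3, 1, 4, 2))
)
print(sort4(4, 2, 3, 1), all_sorted)
```

Each stage in the sketch corresponds to one pipeline stage in a three-stage implementation, with registers between stages omitted here.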
You can just copy over your implementation of the MinMaxUnit from when you completed the PyMTL3 tutorial. If you have not completed the PyMTL3 tutorial then you might want to go back and do that now. Basically the MinMaxUnit should look like this:
Take a moment to open up the translated Verilog which should be in a file named SortUnitStructRTL__nbits_8__pickled.v. Try to see how both the structural composition and the behavioral modeling translate into Verilog. Here is an example of the translation for the MinMaxUnit. Notice how PyMTL3 will output the source Python embedded as a comment in the corresponding translated Verilog.
The Verilog module name includes a suffix to make it unique for a specific set of parameters. Although we hope students will not need to actually open up this translated Verilog, it is occasionally necessary. For example, PyMTL3 is not perfect and can translate incorrectly, which might require looking at the Verilog to see where it went wrong. Other steps in the ASIC flow might refer to an error in the translated Verilog, which will also require looking at the Verilog to figure out why the other steps are going wrong. While we try to make things as automated as possible, students will eventually need to dig in and debug some of these steps themselves.