CMOS Design Methodologies
The Design Problem
Source: sematech97
A growing gap between design complexity and design productivity
Design Methodology
• Design process traverses iteratively between three abstractions:
behavior, structure, and geometry
• More and more automation for each of these steps
Design Analysis and Verification
• Accounts for largest fraction of design time
• More efficient when done at higher levels of
abstraction - selection of correct analysis
level can save multiple orders of magnitude
in verification time
• Two major approaches:
– Simulation
– Verification
Circuit Simulation
Digital data treated as analog signal
Both time and data treated as analog quantities
Also complicated by presence of non-linear elements (relaxed in timing simulation)
[Figure: CMOS inverter schematic (VDD, In, Out; terminals Gn,p, Dn,p, Sn, Sp, Bn, Bp) and its simulated transient response, Vin and Vout (V) versus t (nsec), with tpHL marked]
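A minimal sketch of the time-stepping idea behind circuit simulation, assuming the inverter pull-down can be approximated by a linear resistor R discharging a load capacitor C. A real circuit simulator solves non-linear device equations at every step; R, C, VDD and the step size below are illustrative values, not taken from the slides.

```python
import math

def simulate_rc_discharge(r_ohm, c_farad, vdd, dt, t_end):
    """Forward-Euler integration of C dV/dt = -V/R, starting from V = vdd."""
    times, volts = [0.0], [vdd]
    steps = int(round(t_end / dt))
    for k in range(1, steps + 1):
        v = volts[-1]
        volts.append(v + dt * (-v / (r_ohm * c_farad)))
        times.append(k * dt)
    return times, volts

def t_phl(times, volts, vdd):
    """Time at which the output first falls to vdd/2 (linear interpolation)."""
    half = vdd / 2.0
    for i in range(1, len(volts)):
        if volts[i] <= half:
            v0, v1 = volts[i - 1], volts[i]
            frac = (v0 - half) / (v0 - v1)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None

# Hypothetical values: 10 kOhm equivalent resistance, 20 fF load, 5 V supply.
R, C, VDD = 10e3, 20e-15, 5.0
times, volts = simulate_rc_discharge(R, C, VDD, dt=1e-13, t_end=1e-9)
delay = t_phl(times, volts, VDD)
analytic = math.log(2) * R * C  # closed-form 50% delay of a single RC stage
```

The simulated delay should approach the closed-form value ln(2)·RC as the time step shrinks, which is one way to sanity-check the chosen step size.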
Circuit versus Switch-Level Simulation
[Figure: circuit-level versus switch-level simulated waveforms for CIN, OUT[2] and OUT[3]; voltage from –1.0 to 5.0 versus time (nsec) from 0 to 20]
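A minimal sketch of switch-level evaluation, assuming each transistor is an ideal on/off switch and ignoring timing, resistance and charge storage. The netlist format and the node names (VDD, GND, out, n1) are hypothetical, chosen only for this illustration.

```python
def conducts(kind, gate_value):
    """An nMOS switch conducts when its gate is 1, a pMOS when its gate is 0."""
    return gate_value == 1 if kind == "n" else gate_value == 0

def reachable(start, switches, inputs):
    """Set of nodes connected to `start` through conducting switches."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for kind, gate, a, b in switches:
            if not conducts(kind, inputs[gate]):
                continue
            other = b if node == a else a if node == b else None
            if other is not None and other not in seen:
                seen.add(other)
                stack.append(other)
    return seen

def evaluate(out, switches, inputs):
    """Return 1, 0, 'Z' (floating) or 'X' (fight) for node `out`."""
    up = out in reachable("VDD", switches, inputs)
    down = out in reachable("GND", switches, inputs)
    if up and down:
        return "X"
    if up:
        return 1
    if down:
        return 0
    return "Z"

# Two-input static CMOS NAND: (type, gate signal, terminal, terminal)
NAND2 = [
    ("p", "A", "VDD", "out"),
    ("p", "B", "VDD", "out"),
    ("n", "A", "out", "n1"),
    ("n", "B", "n1", "GND"),
]
truth_table = {(a, b): evaluate("out", NAND2, {"A": a, "B": b})
               for a in (0, 1) for b in (0, 1)}
```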
Design analysis and simulation
• SPICE - exact but time consuming
• discrete time steps
• circuit models
• timing simulation with partitioning and relaxation method
Gate level simulation
• faster than switch level
• functional simulation
• VHDL description used
Structural Description of Accumulator
entity accumulator is
  port ( -- definition of input and output terminals
    DI  : in bit_vector(15 downto 0); -- a vector 16 bits wide
    DO  : inout bit_vector(15 downto 0);
    CLK : in bit
  );
end accumulator;
architecture structure of accumulator is
  component reg -- definition of register ports
    port (
      DI  : in bit_vector(15 downto 0);
      DO  : out bit_vector(15 downto 0);
      CLK : in bit
    );
  end component;
  component add -- definition of adder ports
    port (
      IN0  : in bit_vector(15 downto 0);
      IN1  : in bit_vector(15 downto 0);
      OUT0 : out bit_vector(15 downto 0)
    );
  end component;
  -- definition of accumulator structure
  signal X : bit_vector(15 downto 0);
begin
  add1 : add
    port map (DI, DO, X); -- defines port connectivity
  reg1 : reg
    port map (X, DO, CLK);
end structure;
Design defined as composition of
register and full-adder cells (“netlist”)
Data represented as {0,1,Z}
Time discretized and progresses with
unit steps
Description language: VHDL
Other options: schematics, Verilog
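A minimal sketch of the same structure in executable form, assuming the adder wraps around at 16 bits and the register captures its input on each rising clock edge. The class and method names are hypothetical; the slides only define the VHDL ports and connectivity.

```python
MASK = 0xFFFF  # 16-bit data path, matching bit_vector(15 downto 0)

def add(in0, in1):
    """Adder cell: OUT0 = IN0 + IN1, truncated to 16 bits (assumed wrap-around)."""
    return (in0 + in1) & MASK

class Reg:
    """Register cell: DO takes the value of DI on a rising clock edge."""
    def __init__(self):
        self.do = 0

    def rising_edge(self, di):
        self.do = di & MASK

class Accumulator:
    """Composition of one adder (add1) and one register (reg1)."""
    def __init__(self):
        self.reg1 = Reg()

    @property
    def do(self):
        return self.reg1.do

    def rising_edge(self, di):
        x = add(di, self.do)       # signal X: adder output
        self.reg1.rising_edge(x)   # register feeds DO back to the adder
        return self.do

acc = Accumulator()
outputs = [acc.rising_edge(v) for v in (3, 4, 0xFFFF)]
```

Time advances one clock edge per call, which mirrors the unit-step time model described above.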
Behavioral Description of Accumulator
entity accumulator is
  port (
    DI  : in integer;
    DO  : inout integer := 0;
    CLK : in bit
  );
end accumulator;
architecture behavior of accumulator is
begin
  process (CLK)
    variable X : integer := 0; -- intermediate variable
  begin
    if CLK = '1' then
      X := DO + DI;
      DO <= X;
    end if;
  end process;
end behavior;
Design described as set of input-output
relations, regardless of chosen
implementation
Data described at higher abstraction
level (“integer”)
Behavioral simulation of accumulator
Discrete time
Integer data
(Synopsys Waves display tool)
Design verification
Electrical verification
• checking number of inversions between two C2MOS gates
• checking pull-up and pull-down ratio in pseudo-NMOS gates
• checking minimum driver size to maintain rise and fall times
• checking charge sharing to satisfy noise margins
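A minimal sketch of one such electrical check, assuming charge sharing between a precharged output capacitance and a single internal node that starts discharged. The capacitance values and the allowed drop are hypothetical.

```python
def charge_sharing_voltage(c_load, c_internal, vdd, v_internal=0.0):
    """Final voltage after the precharged load shares charge with an internal node."""
    total_charge = c_load * vdd + c_internal * v_internal
    return total_charge / (c_load + c_internal)

def passes_noise_margin(c_load, c_internal, vdd, max_drop):
    """True when the voltage drop caused by charge sharing stays within max_drop."""
    drop = vdd - charge_sharing_voltage(c_load, c_internal, vdd)
    return drop <= max_drop

# Hypothetical values: 50 fF load, 10 fF internal node, 5 V supply.
v_final = charge_sharing_voltage(50e-15, 10e-15, 5.0)
ok = passes_noise_margin(50e-15, 10e-15, 5.0, max_drop=1.0)
```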
Design verification
Timing verification
• SPICE simulation time is too long
• RC delay estimated using the Penfield-Rubinstein-Horowitz method
• identification of critical path (avoid false paths)
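A minimal sketch of RC delay estimation, assuming a simple RC chain and using the Elmore delay, the first-order time constant on which the Penfield-Rubinstein-Horowitz bounds are built. This is not the full bound computation, and the element values are hypothetical.

```python
def elmore_delay_chain(resistances, capacitances):
    """Elmore delay at the far end of an RC chain.

    Each capacitor contributes C_i times the total resistance between
    the driver and node i.
    """
    delay, r_upstream = 0.0, 0.0
    for r, c in zip(resistances, capacitances):
        r_upstream += r
        delay += r_upstream * c
    return delay

# Hypothetical three-segment wire: 100 ohm and 10 fF per segment.
tau = elmore_delay_chain([100.0] * 3, [10e-15] * 3)
```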
Timing Verification
Critical path
Enumerates and rank-orders critical timing paths
No simulation needed!
(Synopsys-Epic Pathmill)
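A minimal sketch of static path analysis, assuming the circuit is a directed acyclic graph with one fixed delay per gate. It finds the topologically longest path and does not detect false paths; the graph and delay values are hypothetical, and this is not a description of how Pathmill works.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def critical_path(delays, fanin):
    """Return (arrival time, node list) of the longest path.

    delays: node -> gate delay
    fanin:  node -> list of predecessor nodes (missing key means primary input)
    """
    arrival, prev = {}, {}
    graph = {node: fanin.get(node, []) for node in delays}
    for node in TopologicalSorter(graph).static_order():
        preds = fanin.get(node, [])
        best = max(preds, key=lambda p: arrival[p], default=None)
        arrival[node] = delays[node] + (arrival[best] if best is not None else 0)
        prev[node] = best
    end = max(arrival, key=arrival.get)
    path = [end]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return arrival[end], path[::-1]

delays = {"a": 1, "b": 2, "c": 3, "d": 1}
fanin = {"c": ["a", "b"], "d": ["c", "a"]}
worst_delay, worst_path = critical_path(delays, fanin)
```

No input vectors are applied, which is why this style of analysis needs no simulation.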
Design verification
Formal verification
• components described behaviorally
• circuit model obtained from component models
• resulting circuit behavior compared with design specifications
• no generally acceptable verifier exists
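A minimal sketch of the comparison step, assuming the circuit is small enough to enumerate every input combination. The multiplexer specification and its NAND-only implementation are hypothetical examples.

```python
from itertools import product

def spec_mux(a, b, s):
    """Behavioral specification: output is b when s is 1, else a."""
    return b if s else a

def nand(x, y):
    return 1 - (x & y)

def impl_mux(a, b, s):
    """Circuit model composed from NAND component models."""
    not_s = nand(s, s)
    return nand(nand(a, not_s), nand(b, s))

def equivalent(spec, impl, n_inputs):
    """Compare two functions on all 2**n_inputs input vectors.

    Returns (True, None) or (False, first counterexample).
    """
    for bits in product((0, 1), repeat=n_inputs):
        if spec(*bits) != impl(*bits):
            return False, bits
    return True, None

result = equivalent(spec_mux, impl_mux, 3)
```

Exhaustive enumeration grows exponentially with the number of inputs, so it only illustrates the idea on very small blocks.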
Implementation approaches
Custom circuit design
• labor intensive
• long time-to-market
• cost amortized over a large volume
• reuse as a library cell
• was popular in early designs
• layout editor, DRC, circuit extraction
Layout editor
1. Polygon based (Magic)
2. Symbolic layout
• transistor symbols
• relative positioning
• compaction
• stick diagram description
• design rules automatically satisfied
• automatic pitch matching
Custom Design –
Layout Editor
Magic Layout Editor
(UC Berkeley)
Symbolic Layout
[Figure: stick diagram of inverter, with VDD, GND, In and Out labels]
• Dimensionless layout entities
• Only topology is important
• Final layout generated by
“compaction” program
Design rule checking
• on-line DRC
- rules checked and errors flagged
during layout
• batch DRC
- post design verification
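A minimal sketch of a batch-style check for one rule, minimum spacing between shapes on the same layer. It assumes axis-aligned rectangles given as (x1, y1, x2, y2) and treats touching or overlapping shapes as connected; the coordinates and the rule value are hypothetical.

```python
def spacing(r1, r2):
    """Euclidean gap between two axis-aligned rectangles (0 if they touch or overlap)."""
    dx = max(r1[0] - r2[2], r2[0] - r1[2], 0)
    dy = max(r1[1] - r2[3], r2[1] - r1[3], 0)
    return (dx * dx + dy * dy) ** 0.5

def check_min_spacing(rects, min_space):
    """Return (i, j, gap) for every pair of separate shapes closer than min_space."""
    violations = []
    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            gap = spacing(rects[i], rects[j])
            if 0 < gap < min_space:
                violations.append((i, j, gap))
    return violations

metal1 = [(0, 0, 2, 2), (3, 0, 5, 2), (10, 0, 12, 2)]
errors = check_min_spacing(metal1, min_space=2)
```

An on-line checker would run the same test incrementally on each newly drawn shape instead of on all pairs.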
Circuit extraction
Circuit schematic derived from layout
transistors are built with proper geometry
parasitic capacitances and resistances evaluated
extraction of inductance requires 3D analysis
Cell-based design
• reduced cost
• reduced time
• reduced integration density
• reduced performance
Cell-based design
• standard cell
• compiled cells
• module generators
• macrocell place and route
Standard cell
• library contains basic logic cells
- inverter, AND/NAND, OR/NOR, XOR/XNOR, flip-flop
- AOI, MUX, adder, comparator, counter, decoder, encoder
• fan-in and fan-out specified
• schematic uses cells from library
• layout automatically generated
Standard cell
• cells have equal heights
• cell rows separated by routing channels
Standard cell
[Figure: standard cell design layout and description]
Standard cell
• large design cost amortized over a large number of
designs
• large number of different cells with different fan-ins
• large fan-out for cells to be used in different designs
• synthesis tools made standard cell design popular
• standard cell design outperforms PLA in area and speed
• standard cell design benefits from multi-level logic synthesis
Compiled cell
• cell layout generated on the fly
• transistor or gate level netlist used with transistor
size specified
• layout densities approach that of human designers
Compiled cell
[Figure: circuit schematics with transistor sizing, and the generated layout with automatic pitch matching]
Module generators
• logic level cells not efficient for subcircuit design
- shifters, adders, multipliers, data paths, PLAs,
counters, memories
• Macrocell generators
- use design parameters like number of bits
• data path compilers
- use bit slice modules and repeat them N times
- generate interconnections between modules
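A minimal sketch of a parameterized generator, assuming a ripple-carry adder built by repeating one full-adder bit slice N times and wiring each carry-out to the next carry-in. The cell name, pin names and netlist format are hypothetical.

```python
def generate_adder(n_bits):
    """Return a list of bit-slice instances forming an n_bits ripple-carry adder."""
    instances = []
    for i in range(n_bits):
        instances.append({
            "name": f"fa{i}",
            "cell": "full_adder",  # hypothetical library cell
            "pins": {
                "a": f"a[{i}]",
                "b": f"b[{i}]",
                "cin": "cin" if i == 0 else f"c[{i}]",
                "sum": f"s[{i}]",
                "cout": "cout" if i == n_bits - 1 else f"c[{i + 1}]",
            },
        })
    return instances

adder4 = generate_adder(4)
```

The only design parameter is the number of bits, which matches the macrocell-generator idea of producing a block from a few high-level parameters.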
Datapath compilers
Feedthroughs used to improve routing
[Figure: datapath compiler results]
Macrocell place and route
• channel routing
- metal 2 horizontal segments
- metal 1 vertical segments
• over the block routing
(3-6 metal layers used)
Array-based design implementation
To avoid the slow fabrication process, which takes 3-4 weeks:
• mask programmable arrays
• fuse based FPGAs
• nonvolatile FPGAs
• RAM based FPGAs
Mask programmable arrays
• gate-array
- similar to standard cell
• sea-of-gates
- routed over the cells (high density)
- wires added to make logic gates
• challenge in design is to utilize the
maximum cell capacity
• utilization < 75% for random logic design
Macrocell Design Methodology
Floorplan: defines overall topology of design, relative placement of modules, and global routes of busses, supplies, and clocks
[Figure: floorplan showing macrocells, interconnect bus, and routing channel]
Macrocell-Based Design Example
[Figure: video-encoder chip containing SRAM blocks, data paths, and standard cells [Brodersen92]]
Gate Array — Sea-of-gates
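[Figure: gate-array layout with rows of uncommitted cells separated by routing channels, VDD and GND rails, polysilicon, metal, and possible contact sites; an uncommitted cell is shown next to a committed cell (4-input NOR with inputs In1 to In4 and output Out)]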
Sea-of-gate Primitive Cells
[Figure: sea-of-gates primitive cells with PMOS and NMOS transistors, one using oxide-isolation and one using gate-isolation]
Sea-of-gates
[Figure: sea-of-gates chip with random logic and memory subsystem regions]
LSI Logic LEA300K (0.6 μm CMOS)
Prewired Arrays
Categories of prewired arrays (or field-programmable devices):
• Fuse-based (program-once)
• Non-volatile EPROM based
• RAM based
Programmable Logic Devices
PLA
PROM
PAL
Fuse-based FPGA’s
Actel: sea-of-gates and standard cell approach
Fuse-based FPGA’s
Example: XOR gate obtained by setting A=1, B=0, C=0, D=1, SA=SB=In1, S0=S1=In2
Fuse-based FPGA’s
Anti-fuse provides a short (low resistance) when blown
Nonvolatile FPGA’s
• programming similar to PROM
• erasable programmable logic devices - EPLD
• electrically erasable - EEPLD
• design partitioned into macrocells
• flip-flops used to make sequential circuits
• software used to program interconnections to optimize use of hardware
• input specified from schematics, truth tables, state graphs, VHDL code
EPLD Block Diagram
Primary inputs
Macrocell
Courtesy Altera Corp.
RAM based (volatile) FPGA’s
• programming is fast and can be repeated
many times
• no high voltage needed
• integration density is high
• information lost when the power goes off
XILINX FPGA’s
• configurable logic blocks (CLBs) used
• five-input, two-output combinational blocks
• two D flip-flops are edge or level triggered
• functionality and multiplexers controlled by RAM
• RAM can be used as look-up table or a register file
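A minimal sketch of how a small RAM can implement an arbitrary combinational function as a look-up table. It assumes one output bit and five inputs; the class is hypothetical and does not model an actual Xilinx CLB.

```python
class LookupTable:
    """k-input, 1-output look-up table backed by a 2**k entry RAM."""

    def __init__(self, n_inputs, func):
        self.n_inputs = n_inputs
        # "Programming" the device: store the function's truth table in RAM.
        self.ram = [func(*self._bits(addr)) & 1 for addr in range(2 ** n_inputs)]

    def _bits(self, addr):
        """Split a RAM address into input bits, least significant bit first."""
        return [(addr >> i) & 1 for i in range(self.n_inputs)]

    def read(self, *inputs):
        """Evaluate the function by using the inputs as the RAM address."""
        addr = sum(bit << i for i, bit in enumerate(inputs))
        return self.ram[addr]

def parity(*bits):
    return sum(bits) % 2

lut = LookupTable(5, parity)
value = lut.read(1, 0, 1, 1, 0)
```

Reprogramming the block means rewriting the RAM contents, which is why the same hardware can realize any function of its inputs.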
XILINX FPGA’s
• each cell connected to 4 neighbors
• routing channels provide local or global
connections
• switching matrices(RAM controlled) are
used for switching between channels
XILINX FPGA’s (XC4025)
• 32 × 32 CLBs
• 25000 gates
• 422 kbits of RAM
• operates at 250 MHz
• 32 kbit adder uses 62 CLBs
Design synthesis
Circuit synthesis
• derivation of the transistor schematics from logic functions
- complementary CMOS
- pass transistor
- dynamic
- DCVSL (differential cascode voltage switch logic)
• transistor sizing
- performance modeling using RC equivalent circuits
- layout generation
• synthesis not popular due to designers' reluctance
Logic synthesis
• state transition diagrams, FSM, schematics,
Boolean equations, truth tables, and HDL used
• synthesis
- combinational or sequential
- multi level, PLA, or FPGA
• logic optimization for
- area, speed, power
- technology mapping
Logic optimization
• Espresso - two level minimization tool (UCB)
• state minimization and state encoding
• MIS - multilevel logic synthesis (UCB)
Example:
$S = A \oplus B \oplus C_i$
$C_o = AB + AC_i + BC_i$
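A minimal sketch that checks the adder example against arithmetic, assuming the sum is the three-input XOR of A, B and Ci and the carry is the majority function written above.

```python
from itertools import product

def full_adder(a, b, ci):
    """Return (S, Co) for one-bit inputs a, b and carry-in ci."""
    s = a ^ b ^ ci
    co = (a & b) | (a & ci) | (b & ci)
    return s, co

def matches_arithmetic():
    """True when S + 2*Co equals a + b + ci for all eight input combinations."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        if s + 2 * co != a + b + ci:
            return False
    return True

adder_ok = matches_arithmetic()
```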
Logic optimization
Multilevel implementation of adder generated by MIS II (cell library from University of Mississippi)
Architecture synthesis
• behavioral or high level synthesis
• optimizing translation e.g. pipelining
• Cathedral and HYPER tools
• HYPER tutorial and synthesis example: http://infopad.eecs.berkeley.edu/~hyper
Architecture synthesis example
Architecture synthesis
Vertical and Orthogonal CMOS
COSMOS
Savas Kaya
– Stack two MOSFETs under a common gate
– Improve only hole mobility by using strained SiGe channel
• pMOS transconductance equal to nMOS
– Reduce parasitics due to wiring and isolating the sub-nets
[Figure: conventional CMOS compared with COSMOS (Complementary Orthogonal Stacked MOS)]
Technology Base
• Strained Si/SiGe layers
– Built-in strain traps more carriers and increases mobility
• Equal and high electron and hole mobilities (Jung et al., p.460, EDL’03)
• SOI (silicon-on-insulator) substrates
– active areas on buried oxide (BOX) layer
– Reduces unwanted DC leakage and AC parasitics
Mizuno et al., p.988, TED’03
Cheng et al., p.L48, SST’04
COSMOS Structure
• Single common gate: mid-gap metal or poly-SiGe
• Ultra-thin channels: 2-6nm to control threshold/leakage
– Strained Si1-xGex for holes (x0.3)
– Strained or relaxed Si for electrons
• Substrate: SOI
COSMOS Structure - 3D View I
• Single gate stack: mid-gap metal or poly-SiGe
– Must be engineered for a symmetric threshold
[Figure: 3D device view, dimensions in units of μm]
COSMOS Structure - 3D View II
• Conventional self-aligned contacts
– Doped S/D contacts: p- (blue) or n- (red) type
• Inter-dependence between gate dimensions:
$\left(\frac{W}{L}\right)_{\mathrm{nMOS}} = \left(\frac{L}{W}\right)_{\mathrm{pMOS}}$
COSMOS Gate Control
• A single gate to control both channels
– High-mobility strained Si1-xGex (x0.3) buried hole channel
• High Ge% eliminates parallel conduction and improves mobility
• Lowers the threshold voltage VT
– Electrons are in a surface channel
– Requires fine tuning for symmetric operation
3D Characteristics: 40nm Device
• Symmetric operation
– No QM corrections
• Lower VT
– Features in sub-threshold operation
• Related to p-i-n parasitic diode included in 3D
COSMOS Inverter
• No additional processing
– Just isolate COSMOS layers and establish proper contacts
– Significantly shorter output metallization
Top view
Peel-off top views
3D TCAD Verification
• Inverter operation verified in 3D
[Figure: 40nm COSMOS NOT gate driving CL=1fF load]
Applications
• Low power static CMOS:
– Should outperform conventional devices in terms of speed
• Multiple input circuit example: NOR gate
• Area-tight designs:
– FPGA, sensing/testing, μpower etc.?