-
Full complementary static CMOS gates may be undesirable because:
-
Their area overhead may be too large.
-
Their speed may be too slow.
-
The function may not be feasible as a full complementary structure (e.g. PLA).
-
Smaller faster gates can be implemented at the cost of:
-
Increased design time.
-
Increased operational complexity.
-
Decreased operational margin.
-
Full complementary gates can be designed as ratioless circuits:
-
A fixed ratio in size between pull-up and pull-down structures is not required for proper operation.
-
Unlike those we will consider now.
-
The gain ratio of the n-driver transistors to the p-transistor load ($\beta_{driver}/\beta_{load}$) is important to ensure correct operation.
-
Accomplished by ratioing the n and p transistor sizes.
-
Pull-up time improved by virtue of the active switch (p-transistor can be much larger).
-
Pull-down time increased due to the ground switch.
-
What is wrong with cascading these structures?
-
(Hint: Consider the delay in the discharge of the left-most n-logic block at the start of the evaluate phase).
-
These structures can be cascaded.
-
In a cascaded set of logic blocks, each stage evaluates and causes the next stage to evaluate (in the same way a line of dominos fall).
-
Other forms of CMOS logic include:
-
BiCMOS Logic
-
Clocked CMOS Logic (C²MOS).
-
NP Domino Logic (Zipper CMOS).
-
Cascade Voltage Switch Logic (CVSL).
-
Source Follower Pull-up Logic (SFPL).
-
(See Weste and Eshraghian for details.)
-
Where should one use what gate?
-
Complementary: Best option for most cases. Safe, fast, no DC power.
-
Pseudo-nMOS: Large fan-in NOR gates, e.g. PLAs, ROMs. DC power.
-
Transmission gate: Speed advantage, good for complex boolean functions.
-
CMOS domino logic: Low-power, high speed. Requires simulation!
-
Majority of VLSI systems are Finite State machines and Pipelined machines:
-
One of the most important decisions made at the start of a design is the selection of a clocking strategy.
-
It affects:
-
How many transistors are used per storage element.
-
How many clock signals need to be routed throughout the chip.
-
Topics:
-
Latch, Master-Slave Flip-flop and Edge-Triggered Flip-flop designs.
-
Setup and Hold time and clock race conditions.
-
CMOS Static and Dynamic Flip-flops.
-
Single phase clocking, clock skew/slew.
-
Two-phase clocking techniques.
-
Clock generation techniques.
-
The ambiguity of having a non-allowed mode caused by trigger pulses going active simultaneously can be avoided by adding two feedback lines:
-
Note that if both J and K are high and the clock pulses, the output is complemented.
-
However, doing so enables the other input and the FF oscillates.
-
This places some stringent constraints on the clock pulse width (e.g. it must be shorter than the propagation delay through the FF).
-
Synchronous circuit:
-
Changes in the output logic states of all FFs in a design are synchronized with the clock signal, $\phi$.
-
Note that the:
-
T FF (toggle FF) is a special case of the JK with J and K tied together.
-
D FF (delay FF) is a special case with J and K connected with complementary values of the D input.
-
Here the D FF generates a delayed version of the input signal synchronized with the clock.
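The relationship between the JK, T and D flip-flops can be checked with a small behavioral model. This is only a sketch: it uses the standard JK characteristic equation Q+ = J·Q' + K'·Q, models one ideal clock pulse per call, and ignores all timing. The function names are illustrative and not taken from the notes.

```python
def jk_next(q: int, j: int, k: int) -> int:
    """Next state of an ideal JK FF after one clock pulse: Q+ = J*Q' + K'*Q."""
    return (j & (1 - q)) | ((1 - k) & q)


def t_next(q: int, t: int) -> int:
    """T FF: a JK FF with J and K tied together."""
    return jk_next(q, t, t)


def d_next(q: int, d: int) -> int:
    """D FF: a JK FF with J = D and K = complement of D."""
    return jk_next(q, d, 1 - d)


# Enumerate the JK behavior: hold, reset, set, toggle.
jk_table = {(q, j, k): jk_next(q, j, k)
            for q in (0, 1) for j in (0, 1) for k in (0, 1)}
```

With J = K = 1 the model complements Q on every pulse, and `d_next` always returns D regardless of the previous state, which is the "delayed version of the input" behavior described above.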
-
These FFs are also called latches.
-
A FF is a latch if the gate is transparent while the clock is high (low).
-
Any changes in the input are reflected in the output after a nominal delay.
-
The transparent nature can cause race problems:
-
Problem with master-slave approach:
-
The circuit is sensitive to changes in the input signals as long as $\phi$ is high.
-
If the inputs do not remain constant when the clock is high, the master follows D, which, for example, consumes power.
-
The fix is to allow the state of the FF to change only at the rising (falling) edge of the clock.
-
The modification applied to the JK FF is shown below.
-
Note that the inputs must be stable for some time before the clock goes low.
-
This is also true for the master-slave D FF, but the constraints are different.
-
Let's first define some terms.
-
Timing diagram showing the terms used to define the proper operation of a Flip-flop.
-
$T_c$: Clock cycle time.
-
$T_s$: Setup time: the amount of time before the clock edge that the D input has to be stable.
-
$T_h$: Hold time: the data has to be held for this period while the clock travels to the point of storage.
-
$T_q$: Clock-to-Q delay: the delay from the positive clock edge to the new value of Q.
-
Depending on the design, one or both of $T_s$ and $T_h$ may have to be non-zero.
-
For example, the master-slave D FF is likely to require a longer setup time than the edge-triggered D FF.
-
An edge-triggered FF prevents the "master" from following the D input, so the FF's internal delay does not affect setup time.
-
The hold time interval starts with the beginning of the clock transition.
-
Clock skew and slew and other design details of the FF affect the hold time.
-
Toggle Flip-Flop with Asynchronous Clear:
-
Two possible strategies to implement clocked systems:
-
Latches are a more economical implementation strategy but are transparent on half of the clock cycle, and cannot be used in feedback systems.
-
Also, the following constraint must be met for latches:
-
$T_d < T_c/2 - T_q - T_s$
-
where $T_d$ is the worst-case propagation delay, $T_c$ is the clock cycle time, $T_q$ is the clock-to-Q time of latch A, and $T_s$ is the setup time for latch B.
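The latch constraint T_d < T_c/2 - T_q - T_s can be turned into a quick budget check. The sketch below is illustrative only: the function names and the numeric values in the usage lines are assumptions, and all times must be given in the same unit.

```python
def max_logic_delay_latch(t_c: float, t_q: float, t_s: float) -> float:
    """Upper bound on the logic delay T_d between latch A and latch B.

    Implements T_d < T_c/2 - T_q - T_s from the notes.
    """
    return t_c / 2 - t_q - t_s


def latch_timing_ok(t_d: float, t_c: float, t_q: float, t_s: float) -> bool:
    """True if the worst-case logic delay t_d satisfies the latch constraint."""
    return t_d < max_logic_delay_latch(t_c, t_q, t_s)


# Hypothetical numbers in ns: 10 ns cycle, 1 ns clock-to-Q, 0.5 ns setup.
budget_ns = max_logic_delay_latch(t_c=10.0, t_q=1.0, t_s=0.5)  # 3.5 ns
```

With these assumed values the combinational logic between the two latches has a 3.5 ns budget; a 4 ns path would violate the constraint.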
-
Occurs when the data input to the register does not obey the setup and hold-time constraints.
-
Delays in the clock line to Reg B (hold-time violation).
-
New data stored instead of previous data:
-
Delays in the combinational logic that are larger than the clock cycle time (setup violation).
-
Data arrives late at Reg B, old data retained instead of latching new data.
-
As you can see, designers have to walk a temporal 'tight-rope', e.g., they have to minimize clock skew while considering worst and best case delays through combinational logic.
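The two failure modes above can be expressed as a pair of inequalities. The sketch below uses the common textbook form for edge-triggered registers, which is an assumption here since the notes do not state it: setup holds if T_q + T_d(max) + T_s <= T_c + skew, and hold holds if T_q + T_d(min) >= T_h + skew, where skew is the extra clock delay to Reg B relative to Reg A. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RegTiming:
    """Register timing parameters, all in the same time unit."""
    t_c: float  # clock cycle time
    t_q: float  # clock-to-Q delay of the launching register (Reg A)
    t_s: float  # setup time of the capturing register (Reg B)
    t_h: float  # hold time of the capturing register (Reg B)


def setup_ok(r: RegTiming, t_d_max: float, skew: float = 0.0) -> bool:
    """Worst-case (slowest) logic path must arrive a setup time before the next edge."""
    return r.t_q + t_d_max + r.t_s <= r.t_c + skew


def hold_ok(r: RegTiming, t_d_min: float, skew: float = 0.0) -> bool:
    """Best-case (fastest) logic path must not overwrite data Reg B is still capturing."""
    return r.t_q + t_d_min >= r.t_h + skew


# Hypothetical values in ns.
reg = RegTiming(t_c=10.0, t_q=1.0, t_s=0.5, t_h=0.5)
```

In this model, `hold_ok(reg, 0.2)` passes with no skew, but `hold_ok(reg, 0.2, skew=1.0)` fails: delaying the clock to Reg B lets the new data race through, which is the hold-time violation described above. Likewise `setup_ok(reg, 9.0)` fails because the logic delay no longer fits in the cycle.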
-
The full complementary version of the master-slave FF requires 38 transistors!
-
Positive feedback is not the only means to implement a memory function.
-
A capacitor can act as a memory element as well.
-
In this case, a periodic refresh is required (in the millisecond range) due to leakage (hence the word dynamic).
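A rough feel for the refresh interval follows from charge conservation: a node of capacitance C that may droop by at most ΔV under a constant leakage current I holds its value for about t = C·ΔV / I. The sketch below applies that relation; the capacitance, allowed droop and leakage current are hypothetical values chosen for illustration, not figures from the notes.

```python
def retention_time(c_node: float, delta_v: float, i_leak: float) -> float:
    """Approximate time (s) for a leakage current to discharge a node by delta_v.

    Assumes a constant leakage current: t = C * dV / I.
    """
    return c_node * delta_v / i_leak


# Hypothetical values: 20 fF storage node, 1 V allowed droop, 10 pA leakage.
t_hold = retention_time(c_node=20e-15, delta_v=1.0, i_leak=10e-12)  # about 2 ms
```

With these assumed numbers the node must be refreshed roughly every 2 ms, consistent with the millisecond range quoted above.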
-
Consider the following "cheaper" (1/2 transmission gate) positive level-sensitive latch as a step toward deriving a dynamic FF:
-
A master-slave FF is created by cascading two of these latches and reversing the clocks.
-
The problem with this latch is that $\phi_1$ and its complement $\overline{\phi_1}$ might overlap, which may cause two types of failures:
-
Node A can become undefined as it is driven by both D and B when $\phi_1$ and $\overline{\phi_1}$ are both high.
-
D can propagate through both the master and slave if both $\phi_1$ and $\overline{\phi_1}$ are high simultaneously for a long enough period.
-
Clock skew causes conflicts and transparency.
-
Clock slew (slow rise and fall times) can also cause transparency:
-
Clock skew is a dominant problem in current high performance designs.
-
Pseudostatic FF: The fix is to use two non-overlapping clocks $\phi_1$ and $\phi_2$:
-
A large $t_{\phi 12}$ allows proper operation even in the presence of clock skew.
-
Note that node A floats (dynamic) during the time period $t_{\phi 12}$ but is driven during $t_{\phi 1}$ and $t_{\phi 2}$.
-
Hence, the name pseudostatic.
-
This version is simpler (6 transistors) and is often used in pipelined datapaths for microprocessors and signal processors.
-
Disadvantage: two non-overlapping clocks are required (four if transmission gates are used).
-
These implementations MUST be simulated at all process corners (under worst-case conditions).
-
C²MOS: A clever method which is insensitive to clock skew:
-
C²MOS is insensitive to overlap as long as the rise and fall times of the clock edges (clock slew) are sufficiently small:
-
Races are just not possible since the overlaps activate either the pull-up or the pull-down networks but never both simultaneously.
-
The inverters force 0-1 and 1-0 propagation modes only.
-
However, if the rise and fall times of the clock are slow, there exists a time slot in which both n- and p-transistors are conducting simultaneously.
-
Correct operation requires the rise/fall times be smaller than about 5 times the propagation delay through the FF.
-
This is not hard to meet in practical designs, making C²MOS especially attractive in high speed designs where avoiding clock overlap is hard.
-
Lots of other possible latch configurations, static and dynamic -- see Weste and Eshraghian.
-
Clock skew is minimized but area cost is severe.
-
Transistors in the inverter and pass gate should be similar in size.
-
Keep them small and use buffers to drive the load.
-
Note: The routing load MUST also be balanced on each of the clk lines.
-
Four-phase clocking strategies are discussed in Weste and Eshraghian.
-
Modern designs tend to minimize the number of clock phases used because of the difficulty of generating and distributing multiple clocks.
-
Single-phase schemes are used for complex, high-speed CMOS circuits.
-
Assume all the registers in a large CMOS design result in a capacitive load of 2000 pF. What is the peak current and average dynamic power?
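The question can be worked through with two standard relations: peak current I = C·V/t_r while the clock node charges, and average dynamic power P = C·V²/T_c. The notes give only the 2000 pF load, so the supply voltage, cycle time and rise time below are assumed values picked for illustration, and the function names are hypothetical.

```python
def peak_clock_current(c_load: float, v_dd: float, t_rise: float) -> float:
    """Peak current (A) needed to charge c_load to v_dd in t_rise: I = C * dV/dt."""
    return c_load * v_dd / t_rise


def average_dynamic_power(c_load: float, v_dd: float, t_cycle: float) -> float:
    """Average dynamic power (W): P = C * Vdd^2 * f, with f = 1 / t_cycle."""
    return c_load * v_dd ** 2 / t_cycle


C_LOAD = 2000e-12   # 2000 pF, from the question
V_DD = 5.0          # assumed supply voltage (V)
T_CYCLE = 10e-9     # assumed clock cycle time (s), i.e. 100 MHz
T_RISE = 1e-9       # assumed clock rise/fall time (s)

i_peak = peak_clock_current(C_LOAD, V_DD, T_RISE)      # about 10 A
p_avg = average_dynamic_power(C_LOAD, V_DD, T_CYCLE)   # about 5 W
```

Under these assumptions the clock driver must source on the order of 10 A at each edge and dissipates about 5 W, which is why clock generation and distribution need careful design.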
-
Two techniques:
-
A single large buffer (cascaded inverters): Use when the design has a large number of diverse modules, e.g. a microprocessor.
-
A distributed clock tree: Use when the design is highly structured and repetitive, e.g. a datapath.