-
Test application time is a function of the number of FFs scanned.
-
Test time is reduced if more than one chain is operated in parallel.
-
This is particularly effective for BIST designs, because the pseudo-random test sets are extremely long.
-
Also, since some circuits use multiple clocks to control different parts of the circuit, it is easier to add scan to the different sections separately.
-
For externally applied tests (not BIST), multiple chains require more than one pin, unless they are multiplexed with several PIs.
-
Multiple chains pose other challenges such as balancing the number of elements in each chain, while minimizing interconnect length.
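The effect of parallel chains on test time can be sketched in a few lines. This is a minimal illustration, not a tool model: it assumes perfectly balanced chains, one capture clock per pattern, and a final shift-out of the last response (the same cost model used in the 20-FF data path example later in these notes). The function name `scan_test_cycles` is hypothetical.

```python
from math import ceil


def scan_test_cycles(n_patterns: int, n_ffs: int, n_chains: int = 1) -> int:
    """Clock cycles needed to apply n_patterns through balanced parallel chains."""
    # With balanced chains, the shift length is set by the longest chain.
    longest = ceil(n_ffs / n_chains)
    # Each pattern costs `longest` shift clocks plus one capture clock;
    # a final `longest` clocks shift out the last response.
    return n_patterns * (longest + 1) + longest


for chains in (1, 2, 4):
    print(chains, scan_test_cycles(300, 20, chains))
```

Unbalanced chains would be modeled by replacing `ceil(n_ffs / n_chains)` with the length of the longest actual chain, which is why balancing matters.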
-
Complexity/area is increased through the use of scan FFs.
-
Scan FF area overhead is usually reasonable since the fraction of FFs per combinational logic is small.
-
The most significant overhead is in routing.
-
Research has focused on scan ordering heuristics that minimize the area of the extra interconnect.
-
Additional I/Os needed.
-
For LSSD, SI, SO, SE, φ2 and φ3 are needed.
-
SI and SO may be multiplexed with a PI and a PO.
-
Test time per pattern is increased (due to the serial nature of scan).
-
Scan can degrade performance.
-
MUX scan clearly adds delay.
-
Two-port FFs or latches minimize the additional delay, as indicated previously.
-
Test generation costs can be significantly reduced, which can also lead to higher fault coverage.
-
Increased power dissipation under "At-Speed" testing.
-
External scan is applied at a slower rate than the operating frequency.
-
Scan combined with BIST, however, is often run at-speed, and the heat generated by clocking all FFs simultaneously can be excessive.
-
Some designs are not amenable to scan.
-
In order to reduce some of the penalties associated with full scan, only a subset of the FFs need to be made scannable.
-
Note that the circuit in this case remains sequential and sequential test methods are still required.
-
The main trade-off is between the number and identity of the FFs to make scannable and fault coverage, where the "best" strategy is defined by:
-
One that requires the minimal effort for sequential ATPG.
-
One that minimizes area overhead.
-
One that minimizes interconnect.
-
One that minimizes additional delays.
-
Easing test pattern generation seems to be the most important factor.
-
The complexity of sequential ATPG is a function of the number of FFs in feedback loops or the number of FFs between two scanned FFs.
-
Strategies for the selection of the appropriate FFs to make scannable are based on:
-
Testability analysis.
-
Usually not useful by itself since TMs are not necessarily good predictors of TPG difficulty.
-
Structural analysis.
-
Test pattern generation.
-
One strategy to determine the value of making a FF scannable is to compute the fault coverage after making just that FF scannable.
-
This yields the best results but is computationally expensive.
-
A feasible strategy is to combine TMs with the other two strategies (structural analysis and test pattern generation).
-
Structural analysis creates a directed graph model of the CUT.
-
Here, the FFs are the vertices and the combinational logic gates are the edges between the FFs.
-
A cyclic graph S is represented as G(V, E), where vertex v_i represents a FF and edge (v_i, v_j) is a connection between the output of one FF and the input of another.
-
A path is a sequence of edges from v_i to v_n.
-
The sequential depth of a circuit is the length of the longest cycle in the circuit.
-
Circuits with larger sequential depths increase the difficulty of TPG.
-
Starting with a graph, the idea is to select the minimum # of FFs that change the graph into an acyclic graph.
-
This minimum set of FFs is called the Minimal Feedback Vertex Set (MFVS).
-
Under these conditions, it is possible to use a combinational ATPG with minor changes.
-
Determining the MFVS is NP-complete and therefore, heuristics are used.
-
The adjacency matrix is given as (left: the full graph; right: the same graph with the outgoing edges of FF4 removed):
| | FF1 | FF2 | FF3 | FF4 | Tot | FF1 | FF2 | FF3 | FF4 | Tot |
|---|---|---|---|---|---|---|---|---|---|---|
| FF1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 |
| FF2 | 0 | 0 | 1 | 1 | 2 | 0 | 0 | 1 | 1 | 2 |
| FF3 | 0 | 1 | 0 | 1 | 2 | 0 | 1 | 0 | 1 | 2 |
| FF4 | 1 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 0 |
| Tot | 1 | 2 | 1 | 3 | | 0 | 2 | 1 | 2 | |
-
Element a_ij is 1 if FF_i feeds FF_j.
-
One heuristic to determine how to break the cycles is to sum the indegree and outdegree of each vertex (its column total plus its row total).
-
This provides a measure of the # of possible loops through the vertex.
-
Here, FF4 has the largest estimate, 5, followed by FF2 with 4.
-
Scanning FF4 breaks all loops except the FF2-FF3 loop of depth 2.
-
To obtain an acyclic graph, also scan FF2 or FF3.
-
Proximity information may be used to determine other choices.
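The degree-sum heuristic can be tried directly on the 4-FF example. The sketch below is only an illustration under stated assumptions: scanning a FF is modeled by deleting its vertex from the graph, ties are broken by lowest FF index, and the helper names (`degree_scores`, `has_cycle`, `greedy_scan_selection`) are hypothetical. It is a greedy heuristic and does not guarantee a minimum feedback vertex set.

```python
FFS = ["FF1", "FF2", "FF3", "FF4"]
# A[i][j] == 1 if FFS[i] feeds FFS[j] (the full graph of the example).
A = [
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]


def degree_scores(adj, active):
    """Outdegree + indegree of each active vertex, counting only active vertices."""
    return {
        v: sum(adj[v][u] for u in active) + sum(adj[u][v] for u in active)
        for v in active
    }


def has_cycle(adj, active):
    """Strip vertices with no incoming edge; anything left lies on or after a cycle."""
    remaining = set(active)
    changed = True
    while changed:
        changed = False
        for v in list(remaining):
            if not any(adj[u][v] for u in remaining):
                remaining.remove(v)
                changed = True
    return bool(remaining)


def greedy_scan_selection(adj):
    """Scan the highest-scoring FF repeatedly until the remaining graph is acyclic."""
    active = set(range(len(adj)))
    scanned = []
    while has_cycle(adj, active):
        scores = degree_scores(adj, active)
        best = max(sorted(active), key=lambda v: scores[v])  # ties: lowest index
        scanned.append(best)
        active.remove(best)
    return scanned


scores = degree_scores(A, set(range(len(A))))
print({FFS[v]: s for v, s in sorted(scores.items())})
print([FFS[v] for v in greedy_scan_selection(A)])
```

On this graph the first pick is FF4 (score 5) and the second is FF2, which agrees with the choice of FF4 plus FF2 or FF3 described above.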
-
Consider the one-clock MUX version of scan.
-
After test application, the system is clocked once to latch the response.
-
During scan out, all FFs are clocked and the non-scanned FFs will latch new values changing the circuit state.
-
In order to retain the non-scan FF state after test application, one can:
-
Use separate clocks for the scan and non-scan FFs.
-
Add a hold mode to the non-scan FFs.
-
Scan-set can also be used, with no special provisions for partial scan.
-
The order of the FFs affects two of the cost measures, interconnect length and test application time.
-
Scan ordering is usually done on the netlist after synthesis, before place and route.
-
It is also possible to place the FFs first and then find the best way to "stitch" them together.
-
Note that the optimal strategy for test application is often in contradiction with the optimal routing strategy.
-
Multiple chains are the best means of dealing with this, but they cost extra pins or multiplexers.
-
In datapath circuits, registers are often used as input or output only.
-
Scheduling multiple test sessions can be an advantage since it minimizes test application time.
-
For example, consider a data path with 20 FFs:
-
Assume it's possible to merge the two tests for a total of 300 test vectors.
-
The total number of clocks is 300*(20 + 1) + 20 = 6320.
-
(+1 is used to latch the internal values).
-
However, R1 and R2 are drivers for K1 and K2, R4 and R5 are receivers, while R3 is a driver/receiver.
-
Let's analyze the left configuration.
-
The first 50 (common) patterns are shifted into the 14 FFs of R1, R2 and R3 while the results of the previous pattern are shifted out of R3 -> R5.
-
Test time for this portion is 50(14 + 1) = 750 cycles.
-
The remaining 250 patterns are to be applied through R2, which requires 12 cycles for each pattern.
-
Again, the results of the previous patterns are scanned out of R3 and R5.
-
Test time for this portion is 250(12 + 1) = 3250 cycles.
-
Total test time is 750 + 3250 + 8 = 4008 cycles, a 37% reduction in time with a corresponding heat dissipation benefit.
-
The right configuration gives 50(18 + 1) + 250(4 + 1) + 4 = 2204 cycles.
-
This is an improvement of 45% over the previous configuration and 65% over the original.
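The cycle counts and percentages in this example can be checked with a short script. It is only an arithmetic check of the numbers quoted above: the shift lengths (20, 14, 12, 18, 4) and the trailing shift-out terms (20, 8, 4) are taken from the text as given, and the helper names `session_cycles` and `reduction` are hypothetical.

```python
def session_cycles(n_patterns: int, shift_length: int) -> int:
    """Each pattern needs `shift_length` shift clocks plus one capture clock."""
    return n_patterns * (shift_length + 1)


def reduction(before: int, after: int) -> float:
    """Percentage reduction in clock cycles going from `before` to `after`."""
    return 100.0 * (before - after) / before


original = session_cycles(300, 20) + 20                      # single 20-FF chain
left = session_cycles(50, 14) + session_cycles(250, 12) + 8  # left configuration
right = session_cycles(50, 18) + session_cycles(250, 4) + 4  # right configuration

print(original, left, right)
print(round(reduction(original, left)),   # left vs. original
      round(reduction(left, right)),      # right vs. left
      round(reduction(original, right)))  # right vs. original
```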
-
It is important to minimize interconnect length, particularly with interconnect delay on the order of gate delays and interconnect occupying a large share of chip area.
-
Determining FF order can NOT be done only at the logic level.
-
It must be considered at the physical level as well.
-
In addition to the serial FF connections, other wires to minimize include SE and Clk.
-
Note that the cost function consists of both interconnect wire lengths and ordering constraints to minimize test application time, as given above.
-
At the physical level (after placement), the problem can be modeled as finding the shortest route such that each FF is visited once and only once.
-
Let's use graph theory to find a solution.
-
Let G(V, E, W) represent the CUT where:
-
v_i in V represents a FF.
-
(v_i, v_j) in E is a connection from v_i to v_j.
-
w_ij in W is the length of the edge (v_i, v_j).
-
The elements of the adjacency matrix are the weights w_ij.
-
Kruskal's greedy algorithm is used to find the minimal spanning tree.
-
Here we start with the smallest weight, w_ij, which connects v_i and v_j.
-
Vertices v_i and v_j are removed from the set of unvisited vertices.
-
The smallest weighted edge from v_i or v_j to an unvisited vertex v_k is added.
-
Repeat the process using v_i and v_k until all vertices are visited.
-
Note that the algorithm starts with a matrix in which all distances between FFs have been computed in advance (n^2 distances).
-
Heuristics (e.g. neighborhoods) can be used to reduce the # of elements.
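The greedy steps above can be sketched as follows. This is one possible reading of the procedure, not a definitive implementation: it seeds the chain with the globally shortest edge and then keeps attaching the nearest unvisited FF to either end of the chain, which yields a single path rather than a general spanning tree. Manhattan distance between placed FFs, the placement coordinates, and the function names are all assumptions made for illustration. The result is not guaranteed to be the shortest possible route.

```python
def manhattan(p, q):
    """Manhattan distance between two (x, y) placement coordinates."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def greedy_scan_order(positions):
    """Return a scan order (list of FF names) built by greedy end extension."""
    names = sorted(positions)
    if len(names) < 2:
        return names
    # Full n x n weight matrix w[a][b], computed in advance.
    w = {a: {b: manhattan(positions[a], positions[b]) for b in names} for a in names}
    # Seed the chain with the globally shortest edge.
    pairs = ((x, y) for i, x in enumerate(names) for y in names[i + 1:])
    a, b = min(pairs, key=lambda e: w[e[0]][e[1]])
    chain = [a, b]
    unvisited = set(names) - {a, b}
    while unvisited:
        # Cheapest edge from either end of the chain to an unvisited FF.
        candidates = ((end, v) for end in (chain[0], chain[-1]) for v in sorted(unvisited))
        end, nxt = min(candidates, key=lambda ev: w[ev[0]][ev[1]])
        if end == chain[0]:
            chain.insert(0, nxt)
        else:
            chain.append(nxt)
        unvisited.remove(nxt)
    return chain


def chain_length(chain, positions):
    """Total wire length of the stitched chain."""
    return sum(manhattan(positions[a], positions[b]) for a, b in zip(chain, chain[1:]))


# Hypothetical placement of four FFs on a grid.
placement = {"FF1": (0, 0), "FF2": (1, 0), "FF3": (4, 0), "FF4": (1, 5)}
order = greedy_scan_order(placement)
print(order, chain_length(order, placement))
```

A neighborhood heuristic would replace the full `w` matrix with distances to only the nearest few FFs of each vertex, trading some route quality for fewer than n^2 entries.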