Chapter 4 Adaptive Encoding Scheme Using Embedded Memory for Low-Cost and

4.3 Adaptive Encoding

4.3.2 Decoder Machine Design and Its Operation

To decode a test pattern, a decoder machine is required. The test architecture for Adaptive Encoding is shown in Figure 4.4 (a). Our implementation requires only one scan-in signal, which facilitates the use of a low-cost ATE. The encoded test data is shifted serially through this scan-in pin to the decoder. The scan clock for the scan chains, Sclk, is controlled by the decoder. Figure 4.4 (b) shows that a test pattern is stored in the embedded memory; the decoder machine configures it into the next pattern and loads it onto the scan chains. The main actions that the decoder machine performs for each test pattern are:

• Step 1: Loads three header fields (the number of packets, the width of the difference address, and the width of the data length) into the decoder machine;

• Step 2: Loads and uses packets to configure the next test pattern by communicating with the memory;

• Step 3: Loads the test pattern in the memory onto the scan chains.

Figure 4.4 The three main steps that the decoder machine performs: (a) Step 1: loads three header fields; (b) Step 2: configures the test pattern; (c) Step 3: loads the test pattern

In Step 1, in addition to the two header fields mentioned in Section 4.3.1, there is a third header field, which indicates how many packets are to be loaded into the decoder machine for each test pattern. In Step 2, encoded packets are sent to the decoder machine to be configured into blocks in the memory. To calculate the actual address from the difference address, an adder of $\lceil \log_2 N \rceil$ bits is used to accumulate the address for each packet. The adder is initialized to zero before each pattern is decoded. Once the actual address has been calculated for a packet, the decoder machine fetches a block from the memory and modifies it. This action has three sub-steps. First, the decoder machine gets the address from the adder and divides it into two parts: the block number, say b, and the offset, say k. The block number is the physical block number in the memory, and the offset is the starting location within the block from which bits are replaced. After the block number is sent to the memory, the selected block is read and then updated with the data in the buffer, as described later.
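The address handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: N is taken to be the number of addressable bits of one pattern (this section does not define it), the function names are hypothetical, and the sketch accumulates only the difference addresses, since the exact packet format belongs to Section 4.3.1 and is not modelled here.

```python
def adder_width(n_bits):
    """Width of the address adder, ceil(log2(N)), computed without floats."""
    return (n_bits - 1).bit_length()


def accumulate(diff_addresses, n_bits):
    """Accumulate difference addresses into absolute addresses.

    The adder starts at zero for every pattern and wraps at its bit width.
    """
    mask = (1 << adder_width(n_bits)) - 1
    adder = 0
    absolute = []
    for d in diff_addresses:
        adder = (adder + d) & mask
        absolute.append(adder)
    return absolute


def split_address(address, m):
    """Split an absolute address into (block number b, offset k) for block size m."""
    return divmod(address, m)


# Hypothetical example: 16 addressable bits, blocks of m = 4 bits.
addresses = accumulate([2, 5, 3], 16)
print(addresses)                                   # [2, 7, 10]
print([split_address(a, 4) for a in addresses])    # [(0, 2), (1, 3), (2, 2)]
```

When m is a power of two, the `divmod` reduces to taking the upper bits of the address as b and the lower bits as k, which is presumably what dividing the address "in two parts" means in hardware.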

Figure 4.5 shows the detailed architecture of the decoder machine. Input data is decoded as follows. At the beginning of Step 2, the machine resets all flip-flops of the buffer, sets load to 1, and sends the offset to the offset decoder. Data is then loaded into the specific flip-flop selected by the offset decoder, while the other flip-flops behave like a shift register. Once the buffer is full or ready, the decoder sets load to 0 and selects the target block from the memory. The selected block loaded from memory is then configured by XOR gates with the DIFF pattern in the buffer. Finally, the decoder machine writes the modified block back to the memory. The decoder machine runs at the system clock to perform these actions; changing and writing a block takes two system cycles, which fit within one test clock cycle.

Therefore, the ATE can send packets to the decoder machine continuously, without additional memory space to store replacement words. If the data is too long to fit in the buffer, the decoder increments the address counter and goes on to select the next block from memory to be modified, without interruption.
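A behavioural sketch of this read, XOR, and write-back sequence may help. It is a software model under assumptions, not the hardware of Figure 4.5: the memory is represented as a Python list of m-bit blocks, the packet data as a list of DIFF bits, and the name `apply_packet` is hypothetical.

```python
def apply_packet(memory, m, address, data):
    """XOR the DIFF bits in `data` into `memory`, starting at bit `address`.

    `memory` is a list of blocks, each a list of m bits. Data that does not
    fit in the current block spills into the following block(s).
    """
    b, k = divmod(address, m)          # block number and offset
    pos = 0
    while pos < len(data):
        buffer = [0] * m               # buffer flip-flops are reset
        chunk = data[pos:pos + (m - k)]
        buffer[k:k + len(chunk)] = chunk           # load data from offset k on
        block = memory[b]                          # read the selected block
        memory[b] = [x ^ y for x, y in zip(block, buffer)]  # XOR, write back
        pos += len(chunk)
        b += 1                         # continue with the next block
        k = 0
    return memory


# Hypothetical example: two 4-bit blocks, three DIFF bits starting at bit 2.
mem = [[0, 0, 0, 0], [1, 1, 1, 1]]
apply_packet(mem, 4, 2, [1, 0, 1])
print(mem)   # [[0, 0, 1, 0], [0, 1, 1, 1]]
```

Because bits outside the loaded range stay 0 in the buffer, the XOR leaves the rest of each block unchanged, which is why no separate storage for replacement words is needed.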

Figure 4.5 also shows the relationship between the address and the memory organization. The address at the adder’s output is $\lceil \log_2 N \rceil$ bits wide, and the buffer has m flip-flops, which equals the block size of the memory. The number of scan chains can differ from m, but here we assume it is m for simplicity. The two fields of a packet, difference address and data, are drawn separately to explain how the decoder recognizes them; in fact, both arrive through the scan-in pin. With the offset and the offset decoder, the machine achieves random access within the selected block and supports encoded blocks of variable size.

After all packets for one test pattern are decoded, the decoding process enters the final step. The machine loads the configured test pattern from the memory onto the scan chains and shifts the test result out at the same time. During this step, the ATE should stop sending data because the decoder is busy loading a test pattern. In our scheme, the time at which Step 2 finishes can be known in advance by analyzing the packets, so the synchronization problem is avoided by inserting a vector repeat filling instruction [90] at the time Step 3 starts. Although vector repeat instructions in the ATE do not come for free, the number needed equals the number of test patterns and is negligible.

Figure 4.5 Relationship between decoder’s address and memory blocks, and implementation of the memory buffer to support updating memory block in a random access fashion

Among the three steps to decode and shift a pattern, Step 2 dominates the test time; therefore, we describe this step in more detail using the example in Figure 4.6, where a DIFF pattern carried by two packets is decoded. In Figure 4.6(a), after three cycles, the three bits of the difference address have been loaded, and the decoder calculates the address of the first to-be-flipped bit. Once the address is obtained, the block is fetched to be modified (Figure 4.6(b)). In Figure 4.6(c), after two bits of data for the first block are loaded, the decoder modifies the block and writes it back to the memory. In Figure 4.6(d), the decoder automatically fetches the next block to be modified, since the loading of data is not yet done. After the next block is also modified, it is written back in Figure 4.6(e). Next, for the second packet, the decoder calculates the new address of the first to-be-flipped bit (Figure 4.6(f)). Figure 4.6(g) and Figure 4.6(h) proceed as for the first packet.

DIFF: 0010 1000 0000 1000

Figure 4.6 An example to demonstrate the decoding process in Step 2

4.4 “Two-Phase Test” and “Test Vector Reordering” Techniques for Improvement on Test Compression

The compression achieved by a correlation-related compression method is affected by the total number of bit flips between test patterns. The test compression efficiency of the proposed scheme can be further improved by incorporating the techniques of “two-phase test” and “pattern reordering” [86]. The “two-phase test” generates test patterns in two phases: patterns are generated randomly in the first phase and deterministically, targeting specific faults, in the second phase. In the first phase, random patterns detect easy-to-detect faults. To generate a random pattern with our decoder machine, m bits are first generated at random. These m bits are loaded into the decoding buffer and then shifted from the buffer into the scan chains repeatedly until the scan chains are full. Within one scan chain, all scan cells therefore hold the same value. For example, if a decoding buffer is connected to four scan chains, each scan chain has eight scan cells, and the four generated bits are 1101, then the scan cells of the first, second and fourth scan chains all hold 1 while those of the third scan chain hold 0. The decoding buffer needs eight cycles to fill the four scan chains.

Therefore, each random pattern needs only m bits as a seed, which are repeatedly shifted into the CUT. This saves scan-in power, since the same bit is shifted into a scan chain in every cycle.
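The first-phase pattern generation can be modelled with a short Python sketch. It assumes one scan chain per buffer bit, as in the example above; the function names and the use of Python's `random` module for the seed are illustrative choices, not part of the scheme.

```python
import random


def random_seed(m, rng=random):
    """Generate m random seed bits for the decoding buffer."""
    return [rng.randint(0, 1) for _ in range(m)]


def fill_scan_chains(seed_bits, chain_length):
    """Shift the same seed into the scan chains once per cycle.

    Returns the chain contents and the number of shift cycles used.
    """
    chains = [[] for _ in seed_bits]
    cycles = 0
    for _ in range(chain_length):
        for chain, bit in zip(chains, seed_bits):
            chain.append(bit)          # chain i always receives seed bit i
        cycles += 1
    return chains, cycles


# The example from the text: seed 1101, four chains of eight cells each.
chains, cycles = fill_scan_chains([1, 1, 0, 1], 8)
print(cycles)      # 8
print(chains[2])   # [0, 0, 0, 0, 0, 0, 0, 0]
```

Since each chain receives a constant value, no transitions occur on its scan-in input during shifting, which is the source of the scan-in power saving.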

Although the bit dependence between scan cells limits the fault coverage of these random patterns, the patterns generated deterministically in the second phase increase the fault coverage.

Sequences of test patterns in different orders result in different numbers of bit flips. In the second phase, in addition to increasing the fault coverage, a test vector reordering technique is used to reduce the number of bit flips. The problem of finding a good order of test patterns that reduces the number of bit flips can be formulated as a Minimum Bit Flip Problem (MBFP) [86]. It is solved as follows:

Suppose N test patterns are to be reordered. A graph is built with N nodes, which represent the N test patterns. Each directed edge between two nodes represents an order in which the two patterns are applied. For example, in Figure 4.7 (a), edge E1 represents node A (pattern 1x00) being applied before node B (pattern 1111), while edge E2 represents the reverse order. Each edge is associated with a cost, which is the number of bit flips when the patterns change from one to the other. For test patterns without don’t-care bits, the number of bit flips is the same for both edges, so only undirected edges are needed, as shown in Figure 4.7 (b). However, test compression methods often compress test patterns with don’t-care bits; the number of bit flips then depends on the order in which the patterns are applied, and directed edges are used in the graph. For all patterns to be ordered, a complete graph is built in which every pair of nodes is connected. The test vector reordering problem can then be formulated as an Asymmetric Traveling Salesman Problem (ATSP), and a heuristic algorithm is used to solve it.
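The text does not say which heuristic is used for the ATSP. As a stand-in, the sketch below applies a plain nearest-neighbour rule to a directed cost matrix; it is a swapped-in illustration, not the algorithm of this work. The matrix values, the fixed starting pattern, and the treatment of the order as an open path rather than a closed tour are all assumptions.

```python
def nearest_neighbour_order(cost, start=0):
    """Greedy ordering over a directed cost matrix.

    cost[i][j] is the cost of applying pattern j right after pattern i,
    so cost[i][j] and cost[j][i] may differ. Returns (order, total cost).
    """
    n = len(cost)
    order = [start]
    visited = {start}
    total = 0
    while len(order) < n:
        cur = order[-1]
        # Pick the cheapest outgoing edge to a pattern not yet scheduled.
        nxt = min((j for j in range(n) if j not in visited),
                  key=lambda j: cost[cur][j])
        total += cost[cur][nxt]
        order.append(nxt)
        visited.add(nxt)
    return order, total


# Hypothetical asymmetric costs for three patterns.
cost = [
    [0, 3, 1],
    [2, 0, 4],
    [5, 1, 0],
]
print(nearest_neighbour_order(cost))   # ([0, 2, 1], 2)
```

A greedy rule like this gives no optimality guarantee; it only shows how the directed edge costs drive the choice of the next pattern.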

Figure 4.7 Graphs to model test sequence ordering: (a) directed edges for partially specified patterns, with A = 1x00, B = 1111, E1 (cost = 2 or 3) and E2 (cost = 2); (b) undirected edges for fully specified patterns, with A = 1000, B = 1110 and E1 (cost = 2)

In Figure 4.7, for patterns with don’t-care bits, a cost model is used to decide the bit-flip cost between two nodes. For example, the cost of edge E2 in Figure 4.7(a) is 2, since the third and fourth bits of node B need to change from 1 to 0. For E1, however, the cost may be 2 if the “x” of A is mapped to 1, or 3 if it is mapped to 0. Four cost models may be used [86], and different cost models lead to different results when solving the ATSP. Here we use an improved Estimated cost model of [86] to model the costs of edges.
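The asymmetry of the two edge costs can be reproduced with a small sketch that returns the lowest and highest possible number of bit flips for a directed edge. It assumes that a don’t-care bit in the later pattern simply keeps the value already in the scan cell and therefore never costs a flip, which is consistent with the cost of 2 given for E2; the function name is hypothetical.

```python
def flip_cost_range(src, dst):
    """Return (min, max) bit flips when pattern `dst` is applied after `src`.

    Patterns are strings over '0', '1' and 'x' (don't care).
    """
    low = high = 0
    for s, d in zip(src, dst):
        if d == 'x':
            continue        # the later pattern accepts whatever is there
        if s == 'x':
            high += 1       # depends on how the earlier 'x' was filled
        elif s != d:
            low += 1        # a specified bit has to change
            high += 1
    return low, high


print(flip_cost_range("1x00", "1111"))   # E1: (2, 3)
print(flip_cost_range("1111", "1x00"))   # E2: (2, 2)
print(flip_cost_range("1000", "1110"))   # fully specified: (2, 2)
```

A cost model then has to pick one value from this range for each edge, which is where the different models of [86] diverge.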

The idea of the Estimated cost model is to associate with each bit position of the patterns the probability that the bit is logic “1”. For example, consider four patterns: 010, x11, 1x0 and 1xx. At the first bit position there are two 1s and one 0, so the probability of a 1 is 2/3 (“x” is not counted); similarly, the probability is 2/2 = 1 for the second bit and 1/3 for the third bit. When we assign values to the “x”s of these patterns to obtain their DIFF patterns, we assign the first bit of pattern x11 to “1”, since the probability is 2/3. Similarly, the second bits of patterns 1x0 and 1xx are assigned “1”, and the third bit of pattern 1xx is assigned “0”. In this way, all don’t-care bits are assigned and a DIFF pattern can be calculated. The improved Estimated cost model then calculates the encoded data volume of the DIFF pattern and assigns it as the cost of the corresponding edge in the graph.
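The probability-based assignment in this example can be checked with the following sketch. The threshold rule (fill with 1 when the probability is at least 1/2, otherwise 0) and the handling of columns with no specified bit are assumptions, since the text only gives cases well away from 1/2. The encoded data volume used by the improved model is not computed here; only the don’t-care filling and the resulting DIFF are shown.

```python
from fractions import Fraction


def one_probabilities(patterns):
    """Per bit position, the fraction of specified bits that are '1'."""
    probs = []
    for column in zip(*patterns):
        specified = [c for c in column if c != 'x']
        if specified:
            probs.append(Fraction(specified.count('1'), len(specified)))
        else:
            probs.append(None)     # no specified bit in this column
    return probs


def fill_dont_cares(patterns, probs):
    """Replace each 'x' by the more probable value (assumed tie rule: 1)."""
    filled = []
    for p in patterns:
        bits = []
        for c, pr in zip(p, probs):
            if c != 'x':
                bits.append(c)
            else:
                bits.append('1' if pr is not None and pr >= Fraction(1, 2) else '0')
        filled.append(''.join(bits))
    return filled


def diff(a, b):
    """DIFF pattern: '1' wherever two fully specified patterns differ."""
    return ''.join('1' if x != y else '0' for x, y in zip(a, b))


patterns = ["010", "x11", "1x0", "1xx"]
probs = one_probabilities(patterns)
print(probs)                        # [Fraction(2, 3), Fraction(1, 1), Fraction(1, 3)]
filled = fill_dont_cares(patterns, probs)
print(filled)                       # ['010', '111', '110', '110']
print(diff(filled[0], filled[1]))   # 101
```

Using `Fraction` keeps the probabilities exact, so the comparison against 1/2 does not depend on floating-point rounding.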
