

3.2 Hspice Simulation Result

The simulation result of T1 and Rij in Fig. 2.7 is shown in Fig. 3.5. When the input voltage is between 0.9 V and 2.1 V, the transfer curve in Fig. 3.5 is linear. If the input voltage of T1 is smaller than 0.9 V or larger than 2.1 V, the output voltage saturates. This is why, as described in chapter 2, the voltage level 2.1 V (0.9 V) is defined as +1 (-1). Fig. 3.6 shows the simulation result of T2D. Because the output current of T2D is an absolute current, the output current flows in the same direction whether the input voltage of T2D is larger or smaller than 1.5 V. The transfer curve of T2D is linear when the input voltage is between 0.9 V and 2.1 V. The simulation result of COMP is shown in Fig. 3.7. In Fig. 3.7, the input current IMss is swept and Iaw is kept constant. Fig. 3.7 has three rows. The first row is the overall view of the .DC simulation. To observe the dead zone of the COMP, the second row of Fig. 3.7 shows the transfer curve zoomed in. In Fig. 3.7, the first and second rows are the transfer curve of Vout in Fig. 2.13, and the third row is the transfer curve of Vout in Fig. 2.13. Fig. 3.7 shows that the dead zone of the comparator is about 10 nA.
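The transfer behavior described above can be summarized in a small behavioral model. This is only an illustrative sketch (the function name and unity gain are assumptions, not taken from the Hspice netlist): the stage is linear between 0.9 V and 2.1 V and saturates outside that range, so 2.1 V maps to the +1 state and 0.9 V to the -1 state.

```python
# Behavioral sketch of the T1 + Rij transfer curve (illustrative only).
V_WHITE, V_BLACK, V_MID = 0.9, 2.1, 1.5  # voltage levels defined in chapter 2

def t1_state(vin):
    """Map an input voltage to the normalized cell state in [-1, +1]."""
    vin = max(V_WHITE, min(V_BLACK, vin))        # saturation outside the linear range
    return (vin - V_MID) / (V_BLACK - V_MID)     # 2.1 V -> +1, 0.9 V -> -1

print(t1_state(2.1))   # 1.0 (pure black)
print(t1_state(1.5))   # 0.0 (midpoint)
print(t1_state(0.5))   # saturates at the pure-white state, about -1
```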

Fig. 3.5 Transfer curve of the V-I converter T1 and Rij

Fig. 3.6 Transfer curve of the V-I converter T2D

Fig. 3.7 .DC Simulation result of comparator

Fig. 3.8 and Fig. 3.9 show the simulation results of the unit gain buffer in Fig. 2.20 and Fig. 2.21. Fig. 3.8 shows the frequency response of the OP in Fig. 2.21, and Fig. 3.9 shows the difference between Vin and Vout of the unit gain buffer in Fig. 2.18. Table 3.2 lists the specifications of the OP in Fig. 2.21.

Fig. 3.8 Frequency response of the OP used as the unit gain buffer

Fig. 3.9 The voltage difference between Vin and Vout of unit gain buffer

Table 3.2 Specifications of the OP used as the unit gain buffer

DC gain               37.2 dB

3 dB frequency        24 kHz

Unity gain frequency  1.8 MHz

Load capacitor        20 pF

Bias current          800 µA

The whole-chip recognition process is also simulated in Hspice. Because there are 81 pixels, it is not feasible to show the learning and recognition process of every pixel, so several pixels are shown as examples. All of the pixels were checked, and all of them are recovered.

Fig. 3.10~Fig. 3.13 show the whole-chip learning and recognition processes of four pixels.

In Fig. 3.10~Fig. 3.13, the circuit learns patterns in the "learning period", and "pattern transferring" is used to transfer the learning patterns stored in the shift registers. The timing "counter" means the counter is counting how many ratio weights are preserved. In "noisy pattern read in", the noisy pattern that is supposed to be recognized is input into the circuit. After "noisy pattern read in", the recognition process starts.

It is described in chapter 2 that the pure black voltage level is defined as 2.1 V and the pure white voltage level is defined as 0.9 V. Fig. 3.10 is the operation process of the pixel in the second row and the fourth column, P(2,4), and Fig. 3.11 is the operation process of P(2,2). P(2,2) is a white pixel with noise, and P(2,4) is a white pixel without noise. When "noisy pattern read in" starts, the voltage level of P(2,2) is between 0.9 V and 2.1 V; thus it is a gray pixel. When the recognition period begins, the voltage level of P(2,4) is pulled below 0.9 V, so P(2,4) is recognized and recovered. P(2,2) is also pulled below 0.9 V after the recognition period, so P(2,2) is recognized too. Fig. 3.12 shows the operation process of P(3,8), and Fig. 3.13 shows the operation process of P(3,2). P(3,8) is a black pixel without noise, and P(3,2) is a black pixel with noise. Similarly, when "noisy pattern read in" starts, the voltage level of P(3,2) is between 0.9 V and 2.1 V. That means P(3,2) is a gray pixel at this time. After the recognition period, this pixel is pulled above 2.1 V, which shows it is recovered to a pure black pixel. Similarly, P(3,8) is pulled above 2.1 V too, and it is recognized.
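The pixel-state convention used in these waveforms can be restated as a tiny helper (a hypothetical function, not part of the chip's readout): recognition succeeds when the state voltage is pulled past the 0.9 V or 2.1 V rail, while anything in between reads as gray.

```python
# Hypothetical classifier for the voltage convention of chapter 2.
def pixel_color(v):
    """Classify a cell state voltage as white, black, or (unrecovered) gray."""
    if v <= 0.9:        # at or below the pure-white level
        return "white"
    if v >= 2.1:        # at or above the pure-black level
        return "black"
    return "gray"       # between the rails: not yet recovered

print(pixel_color(1.6))    # gray  (e.g. P(3,2) during "noisy pattern read in")
print(pixel_color(2.15))   # black (e.g. P(3,2) after the recognition period)
```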

Fig. 3.10 Recognizing process of the white pixel without noise P(2,4) (Hspice)

Fig. 3.11 Recognizing process of the white pixel with noise P(2,2) (Hspice)

Fig. 3.12 Recognizing process of the black pixel without noise P(3,8) (Hspice)

Fig. 3.13 Recognizing process of the black pixel with noise P(3,2) (Hspice)

CHAPTER 4

LAYOUT DESCRIPTIONS AND EXPERIMENTAL RESULTS

4.1 Layout and Experimental Environment Setup

Fig. 4.1 and Fig. 4.2 show the layout of the chip. Fig. 4.1 shows the layout of one cell and two ratio memories; the central part of Fig. 4.1 is the cell, and the left and right sides are the ratio memories. The area of one cell and two RMs is 400 µm x 250 µm. Fig. 4.2 shows the whole-chip layout. In Fig. 4.2, the TSMC standard pads, which include the ESD device, pre-driver and post-driver, are used. The die area is 4.56 mm x 3.49 mm. Fig. 4.3 is the package diagram; the package is an 84-pin LCC84. The die photo is shown in Fig. 4.4. Table 4.1 summarizes the performance and compares the RMCNN w/o EO with the RMCNN with elapsed operation [18].

The area per pixel of the RMCNN w/o EO is smaller than that of the RMCNN with elapsed operation, but the whole-chip area of the RMCNN w/o EO is larger. Because the large TSMC standard pads are adopted in the RMCNN w/o EO, the whole-chip area is larger even though the area per pixel is smaller.

The measurement environment is shown in Fig. 4.5. The controlling signals and some input signals are generated by the pattern generator of an HP/Agilent 16702A. The clock in the pattern generator is 12.5 MHz and the signal rise (fall) time is about 4.5 ns. The output waveform is shown on a TDS 3054B oscilloscope. The power supply is 3 V.

Fig. 4.1 Layout of one pixel (two RM and one cell)

3.49 mm

4.56 mm

Fig. 4.2 Layout of the whole chip (pad included)

Fig. 4.3 The package diagram

Fig. 4.4 The die photo of the 9x9 RMCNN without elapsed operation

Table 4.1 Summary of the RMCNN w/o EO compared with the RMCNN with elapsed operation

                                   RMCNN with EO                       RMCNN w/o EO
Technology                         0.35 µm 1P4M Mixed-Signal Process   0.35 µm 2P4M Mixed-Signal Process
Resolution                         9 x 9 cells                         9 x 9 cells
No. of RM blocks                   144 RMs                             144 RMs
1 pixel                            1 cell + 2 RMs                      1 cell + 2 RMs
Single pixel area                  350 µm x 350 µm                     400 µm x 250 µm
CNN array size (pads included)     3800 µm x 3900 µm                   4560 µm x 3900 µm
Power supply                       3 V                                 3 V
Total quiescent power dissipation  120 mW                              87 mW
Minimum readout time of a pixel    1 µs                                100 ns
Elapsed operation                  Required                            Not required

Fig. 4.5 The environment of measurement

This circuit is controlled by many controlling signals. Fig. 4.6 shows the timing relationship of these controlling signals, and the circuit figures in chapter 2 explain how they control the circuit. The signals clk1 and clk2 determine the architecture of the circuit. If clk1 is high, the circuit takes the learning architecture shown in Fig. 2.4. If clk2 is high, the circuit takes the recognition architecture shown in Fig. 2.6. Thus the signals clk1 and clk2 can't be high at the same time; otherwise the circuit can't operate correctly.

In Fig. 4.6, the learning period is marked where clk1 is high. Similarly, the recognition period is marked where clk2 is high. Signal R is used to reset the outputs of some sub-circuits. Signal DFF drives the negative-edge-triggered D flip-flops in Fig. 2.22. The signals newp and pin appear in Fig. 2.22. When newp is low, the connection between shift registers is cut off, so the data in the shift registers won't be changed by glitches on signal DFF. When newp is high, the shift registers can transfer the learning patterns; thus signal DFF oscillates only when newp is high. Signal pin lets the pattern stored in the shift registers be input into the cells. After the learning period, the ratio weights are generated during the timing "Ratio weight generating". In this timing, the signals Cou_L and Cou_G, which appear in Fig. 2.18 and Fig. 2.19, oscillate four times to change the outputs of Counter_L and Counter_G from "00" to "11" sequentially. Then the paths of Sw_a~Sw_f and S_en1~S_en6 turn on one by one and the ratio weights are generated. After the timing "Ratio weight generating", the signals noi and pin, which appear in Fig. 2.23, become high to input the noisy pattern into the cells. Then the circuit starts the recognition period to recover the noisy pattern. Table 4.1 shows the function and usage of all the controlling signals.
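The "Ratio weight generating" sequence amounts to stepping a 2-bit counter through its four codes. The decode onto Sw_a~Sw_f and S_en1~S_en6 lives in Fig. 2.18 and Fig. 2.19, so the path numbering in this sketch is purely illustrative.

```python
# Sketch of the counter sequence: four Cou_L/Cou_G pulses step the 2-bit
# outputs of Counter_L / Counter_G from "00" to "11", enabling one switch
# path per code (the path numbering here is illustrative).
codes = [format(n, "02b") for n in range(4)]   # ["00", "01", "10", "11"]
for i, code in enumerate(codes):
    print(f"counter = {code} -> enable path {i}")
```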

Fig. 4.6 The control-timing diagram in the measurement of the 9x9 RMCNN with r = 1.

Table 4.1 The function of every controlling signal

Control signal   Usage
clk1             High: learning period starts / Low: learning period stops
R                High: reset the circuit / Low: don't reset
DFF              Drives the shift registers (negative-edge-triggered D flip-flops) used to store the learning patterns
newp             High: the shift registers can transfer the learning patterns / Low: the shift registers can't transfer the learning patterns
pin              High: the pattern stored in the shift registers is input to the cells / Low: the path between the shift registers and the cells is cut off
Cou_L            Drives the local counter in every cell
Cou_G            Drives the global counter
clk2             High: recognition period starts / Low: recognition period stops
noi              High: the pattern in the shift registers becomes noisy / Low: isolates the noise from the clean pattern in the shift registers

4.2 Experimental Result

The output stage is described in chapter 2 and Fig. 2.20. Only one pad is used to output the states of all cells, so the 81 pixels are read out sequentially.

Before pattern recognition, the learning function is checked first. This verification checks whether the learning patterns are sent into the shift registers exactly and whether the patterns stored in the shift registers are input to every cell correctly. The pattern is read out directly after it is input into the circuit. Fig. 4.7~Fig. 4.9 are the verification results of the learning function. Fig. 4.7 shows the learning pattern "一" in the shift registers, Fig. 4.8 shows the learning pattern "二", and Fig. 4.9 shows the learning pattern "四". In Fig. 4.7~Fig. 4.9, "Ch 2" is the output data of the chip and "Ch 3" is the LSB of the decoder which controls the switches Swc11~Swc99 in Fig. 2.20. "Ch 1" is a trigger signal and is meaningless in this measurement. Each row is read out sequentially: the first row is read out first, and the second row follows. Each row is marked in Fig. 4.7~Fig. 4.9. The output waveforms of "Ch 2" in Fig. 4.7~Fig. 4.9 can be cut and recombined to form patterns that are more easily discerned. Fig. 4.10~Fig. 4.12 show these recombined output waveforms. The left side of each of Fig. 4.10~Fig. 4.12 is the pattern that is supposed to be learned, and the right side is the recombined output waveform. In Fig. 4.7~Fig. 4.12, the output of a black pixel is about 1.5 V, and the output of a white pixel is about 0.2 V.

It is obvious that all of the learning patterns are input exactly into the circuit, and the shift registers indeed work well. But the measurement of the recognition function is not so successful. Fig. 4.13 is the recognition result of the pattern "四" without noise, and Fig. 4.14 shows the recombined output waveform of Fig. 4.13. It is obvious that some pixels in row 4 and row 5 are not pulled up enough. That means these pixels are not recovered to a pure black or pure white color; they are just gray. Though the recognition of the clean pattern "四" is not successful, the recognition results of the patterns "一" and "二" without noise are very successful. Fig. 4.15 and Fig. 4.16 are the measured recognition results of the patterns "一" and "二".

Fig. 4.7 Experimental verification of learning function (“一”)

Fig. 4.8 Experimental verification of learning function (“二”)

Fig. 4.9 Experimental verification of learning function (“四”)

Fig. 4.10 The recombined waveform of the verification of learning function (“一”)

Fig. 4.11 The recombined waveform of the verification of learning function (“二”)

Fig. 4.12 The recombined waveform of the verification of learning function (“四”)

Fig. 4.13 Experimental recognizing result of the clear pattern “四”

Fig. 4.14 The recombined waveform of the experimental recognizing result of the clear pattern “四”

Fig. 4.15 Experimental recognizing result of the clear pattern “一”

Fig. 4.16 Experimental recognizing result of the clear pattern “二”

The recognition results of noisy patterns with noise level 0.5 are shown in Fig. 4.17 and Fig. 4.18. Fig. 4.17 is the recognition result of the pattern "一", and Fig. 4.18 is the recognition result of the pattern "二". Neither noisy pattern is recognized. The noisy patterns with noise level 0.5 are unrecognized in the simulation results too.

Fig. 4.17 Experimental recognizing result of the noisy pattern “一” with noise level 0.5

Fig. 4.18 Experimental recognizing result of the noisy pattern “二” with noise level 0.5

4.3 Cause of the Imperfect Experimental Result

The cause of the unsuccessful recognition is found in this thesis. Table 4.2 shows the absolute weights of cell(4,4), which is recognized unsuccessfully. Three simulation conditions are listed in Table 4.2. The absolute weight ss44M is simulated by Matlab, and it is the ideal weight. The absolute weight ss44TT is simulated by Hspice in the typical-typical corner condition, and the absolute weight ss44FS is simulated by Hspice in the fast-slow corner condition. The absolute weights ss44TT and ss44FS are strange. In the practical circuit the absolute weights are stored on the capacitor Cw in Fig. 2.4, and the Hspice simulation result shows that the charging and discharging currents are unbalanced. It is described in chapter 2 that the ratio weights are generated according to the absolute mean of the absolute weights. Table 4.3 shows the ratio weights generated from the absolute weights in Table 4.2. Because of the wrong absolute weights ss44TT and ss44FS, the absolute means of these two absolute weights are wrong too. Though the mean of ss44TT is wrong, there is still only one weight in ss44TT that is larger than its mean, so the ratio weight of ss44TT is the same as the ratio weight of ss44M; these two ratio weights are correct. But the mean of ss44FS is too wrong to yield correct ratio weights. There are three weights in ss44FS larger than the mean of ss44FS, so the generated ratio weight of ss44FS is completely wrong. The wrong ratio weights result in the wrong recognition. The cause of the wrong absolute weights is shown below.
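The ratio-weight rule described above can be restated as a simple numerical sketch. This is an assumed simplification of the circuit behavior, with made-up weight values: a weight keeps its sign as a +/-1 ratio weight only when its magnitude exceeds the absolute mean, so a corrupted mean changes how many weights survive.

```python
# Assumed simplification of the ratio-weight generation (values made up).
def ratio_weights(abs_weights):
    """Keep the sign of weights whose magnitude exceeds the absolute mean."""
    mean = sum(abs(w) for w in abs_weights) / len(abs_weights)
    return [(1 if w > 0 else -1) if abs(w) > mean else 0 for w in abs_weights]

correct = [0.9, 0.1, 0.1, 0.1]         # one dominant weight survives
print(ratio_weights(correct))           # [1, 0, 0, 0]

corrupted = [0.9, 0.6, 0.6, 0.2, 0.2]  # an unbalanced set drags the mean down
print(ratio_weights(corrupted))         # [1, 1, 1, 0, 0] -- three weights survive
```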

Chapter 2 explained the learning structure and all the detailed sub-circuits. Fig. 4.19 is the learning structure. The block W charges or discharges the capacitor Cw according to the inputs of two neighboring cells, and the charging current direction is controlled by the XOR gate in Fig. 4.20. Fig. 4.20 is a part of Fig. 2.10. The two inputs of the XOR gate are the signs of the two neighboring cells. Fig. 4.21 shows that one of the two inputs of the XOR is connected to the Vin of T2.

When a pattern has been learned, the shift registers need to transfer the new pattern. The pattern transfer takes a little time, and the MOS M26 in Fig. 2.8 is turned on during this time, which makes the current I_charge in Fig. 4.19 very small. However, this small current still influences the absolute weights on Cw; Fig. 4.22 shows the small current during the pattern transfer time. Because M26 in Fig. 2.8 is turned on in the pattern transfer time, one input of the XOR gate is Vref (1.5 V), so the output of the XOR is unpredictable. Thus the influence of the small current in the pattern transfer timing is out of control, and the absolute weights are affected by it.

The modified circuit is shown in Fig. 4.23. A new path connected with a dummy load is inserted. This path turns on while patterns are being transferred, so the small current in the pattern transfer timing no longer influences the absolute weights. Fig. 4.24 is the simulation result of the modified T2D, and it shows that the modified design of T2D does not contribute a small current to Cw. A one-pixel model with the modified T2D is simulated too, and the modified design can indeed recognize the noisy pixel.

Table 4.2 The absolute weights of cell(4,4) in three simulation conditions

Simulation condition   Absolute-weight of cell(4,4)
Matlab (ideal)         ss44M
Hspice (TT corner)     ss44TT
Hspice (FS corner)     ss44FS

Table 4.3 The absolute means and generated ratio weights of cell(4,4) in three simulation conditions

Simulation condition   Absolute-weight of cell(4,4)   Mean   Ratio weights of cell(4,4)
Matlab (ideal)
Hspice (TT corner)
Hspice (FS corner)

Fig. 4.19 The absolute-weights learning structure

Fig. 4.20 The structure that controls flowing direction of I_charge

Fig. 4.21 The connection between T2 and input of XOR gate

Fig. 4.22 The integral of the T2D output current over time

Fig. 4.23 The modified circuit

Fig. 4.24 The integral of the T2D output current over time: 1) the modified design 2) the original design

Fig. 4.25 Simulation result of one cell model 1) the original design 2) the modified design

CHAPTER 5

CONCLUSION AND FUTURE WORK

5.1 Conclusion

A new circuit, the RMCNN w/o EO, is implemented. The new circuit has the same recognition rate as the RMCNN with elapsed operation, but the operation of the RMCNN w/o EO is simpler.

The new RMCNN w/o EO doesn't need an elapsed period to get the feature-enhanced ratio weights. The RMCNN w/o EO can generate the feature-enhanced ratio weights directly after pattern learning, and it has a good recognition rate, the same as the RMCNN with elapsed operation. Though the operation of the RMCNN w/o EO is simpler, its circuit isn't more complicated. The RMCNN w/o EO doesn't need the multi-divider (M/D) [18] of the RMCNN with elapsed operation, so the transistor count of the RMCNN w/o EO is lower than that of the RMCNN with elapsed operation. Besides, there is a division behavior in the RMCNN w/o EO, but there isn't any divider in the circuit. Thus the hardware of the RMCNN w/o EO is simple.

The number of learning patterns of the RMCNN w/o EO is 3: the Chinese characters one, two and four (一, 二 and 四). The maximum standard deviation of the normal-distribution noise that can be handled is about 0.3. The number of learning patterns the RMCNN w/o EO can remember is still small. To increase it, the learning algorithm or the recognition algorithm should be further modified in the future.
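For context, a noisy test pattern of the kind discussed here can be generated by adding zero-mean Gaussian noise to the ideal +/-1 pixel states and clipping to the valid range. This is an assumed reconstruction of the test procedure, not code from the thesis; the function name and seeding are illustrative.

```python
import random

def add_noise(pattern, sigma, seed=0):
    """Add zero-mean Gaussian noise to ideal +/-1 pixel states, clipped to [-1, 1]."""
    rng = random.Random(seed)   # seeded only to make the example reproducible
    return [max(-1.0, min(1.0, p + rng.gauss(0.0, sigma))) for p in pattern]

clean = [1.0, -1.0, -1.0, 1.0]   # ideal black/white pixels
noisy = add_noise(clean, 0.3)    # sigma = 0.3, the tolerated noise level
print(noisy)                     # gray-ish values, all within [-1, 1]
```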

In the experimental result, some vertical lines of the pattern "四" are unrecognized, while the recognition results of the patterns "一" and "二", whose lines are all horizontal, are successful. That doesn't mean the recognition rate of horizontal lines is better than that of vertical lines. A recognition failure is due to the ratio weights around a pixel and the inputs of its neighboring pixels. If the ratio weights around a pixel are wrong, recognition fails even if that pixel is on a horizontal line. Thus recognition failures could also appear in other patterns such as "五" if wrong ratio weights are generated. Fig. 5.1 shows some examples where recognition failure may happen, and not all of the failure examples are vertical lines.

Fig. 5.1 Examples of recognition failure

5.2 Future Work

The RMCNN w/o EO in this thesis can't recognize all of the three patterns. The cause is found and the circuit is redesigned in this thesis. Simulation supports that the modified design can indeed recognize all three patterns, so the RMCNN w/o EO should be taped out again. To reduce the chip area, the routing of the RMCNN w/o EO should be modified too.

Some modification methods for the next chip are proposed in this thesis.
