CHAPTER 2 REVIEWS OF RETINAL PROCESSING
2.2 REVIEW OF FOCAL-PLANE MOTION SENSORS
Because motion provides rich information for understanding the environment, much research effort has been devoted to motion detection. In a conventional image processing system, even the simplest task that the human visual system performs creates a heavy computational load. Such a system therefore has great difficulty meeting the requirements of portable devices, which include low cost, low power, compactness, a wide optical dynamic range, and real-time operation. One solution is to use application-specific integrated circuits (ASICs) to relieve the computational load. In this approach, however, data transmission is still a bottleneck. The focal-plane image processing approach, which integrates image acquisition and processing into a single chip [13], was therefore developed to speed up data transmission between the main function blocks.
The close interaction between the image acquisition and processing units is the most salient feature of focal-plane image processing. Three levels of integration can be realized: processor per pixel, processor per column, and processor per chip. The processor-per-pixel architecture undoubtedly provides fully parallel and the fastest image processing, but it may not be the optimum solution when pixel area, fill factor, and power consumption are also considered.
To detect motion, spatio-temporal processing has to be performed, since motion analysis involves both space and time. For spatial image processing, each pixel has to interact with its neighboring pixels to extract spatial information. For temporal image processing, the image has to be stored in memory elements in a discrete-time circuit, or in delay elements in a continuous-time circuit. The number of memory or delay elements required in each pixel equals the number of past image frames needed in the processing. In most real implementations, it is only practical to use one past image frame, so that the memory per pixel stays small.
The advantages of focal-plane motion processing include:
(1) Speed and parallelism: Focal-plane image processing achieves a higher processing speed than a camera-processor combination. The reason is the parallel structure between the imager and the processing unit, which increases the data transfer bandwidth and reduces the amount of data to be transferred.
(2) Size: The close interaction between the image acquisition and processing units allows a single chip implementation and achieves a compact system. Obviously, the size is smaller than the conventional implementations.
(3) Power dissipation: Focal-plane image processing makes use of analog circuits operating in the subthreshold region; the weak operating currents result in low power dissipation.
(4) System integration: The focal-plane image processing may comprise most
the trend in the future, this is a great advantage over the camera-processor option.
(5) High optical dynamic range: With its capability for spatio-temporal image processing, the focal-plane image processor can extract the average light intensity of the image and adapt the output intensity to it. Therefore, a high optical dynamic range can be achieved.
Fig. 2.7 The bi-directional correlation model
The motion detection algorithms adopted in focal-plane image processors include the gradient- or intensity-based algorithm, the token-based algorithm, the energy-based algorithm, and the correlation-based algorithm. In recent years, studies have shown that correlation-based processes are responsible for motion detection in many animals, including humans. As addressed in Section 1.1, the correlation-based algorithm has two superior features: robustness and compactness. Therefore, this section puts more emphasis on the correlation-based algorithm. Fig. 2.7 shows a correlation-based model that performs bi-directional correlations. Ph is the photo-circuit, D is a time delay element, and C is the correlator. The delayed signal is sent to the correlators in the neighboring pixels along the preferred positive and negative direction paths. The sign of the correlator output depends on the direction of motion. When the object moves from Ph1 to Ph3 (Ph3 to Ph1), the correlator C2 outputs a positive (negative) signal. To make the output robust, the photo-circuits usually incorporate some spatial and/or temporal processing capability. Thus, the correlation is often performed on features, e.g., edges, rather than on the image itself.
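The sign behavior of the bi-directional model can be illustrated with a minimal discrete-time sketch. It assumes binary edge signals from the three photo-circuits, a delay of one sample, and a correlator that multiplies the delayed neighbor signal with the current signal of the middle pixel; the function name `c2_output` and the test sequences are hypothetical and are not taken from the reviewed designs.

```python
def c2_output(ph1, ph2, ph3, delay=1):
    """Accumulated output of the middle correlator C2 over a signal sequence.

    ph1, ph2, ph3 are equally long lists of binary edge signals (assumed).
    The delayed Ph1 signal feeds the positive path, the delayed Ph3 signal
    feeds the negative path, and both are correlated with the current Ph2.
    """
    total = 0
    for t in range(delay, len(ph2)):
        positive = ph1[t - delay] * ph2[t]  # edge arrived from the Ph1 side
        negative = ph3[t - delay] * ph2[t]  # edge arrived from the Ph3 side
        total += positive - negative
    return total


# An edge that advances one pixel per sample, Ph1 -> Ph2 -> Ph3.
left_to_right = ([1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0])
# The same edge travelling Ph3 -> Ph2 -> Ph1.
right_to_left = (left_to_right[2], left_to_right[1], left_to_right[0])

forward = c2_output(*left_to_right)
backward = c2_output(*right_to_left)
print(forward, backward)
```

In this sketch the output is positive for motion from Ph1 to Ph3 and negative for the opposite direction, which matches the sign convention described above.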
In the following, some focal-plane image sensors are reviewed.
(1) A CMOS Focal-Plane Motion Sensor With BJT-Based Retinal Smoothing Network And Modified Correlation-Based Algorithm [1]
Fig. 2.8 Structure of a single pixel. Reprinted from [1]
This design uses retinal processing circuits to sense the image and then detects the preferred global motion with the modified correlation-based algorithm. Together, the two circuit techniques overcome the problems that arise when both spatial and temporal edge detectors are applied in conventional image sensors: inaccurate delay time, a limited tunable range of the delay time, and a large pixel area.
Fig. 2.8 shows the structure of a single pixel of the proposed sensor. Each pixel of the 32 x 32 array includes the BJT-based retinal processing circuit, two registers, five correlators, and five shift registers. After the image under test is sensed, the output of the retinal processing circuit is sent to CF, which stores the current frame; PF samples the data from CF one clock cycle later, under the control of the signal sample in Fig. 2.8. The output of CF is then correlated with the output of PF and with the outputs of PF from the immediate neighbors. The outputs of the five correlators are next sent to the shift registers to be accumulated by the back-end accumulators. A shift register receives the output of its correlator when the control signal is high and receives the data from the left pixel when the signal is low.
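The load-then-shift readout described above can be sketched behaviorally for one row and one correlator type. The sketch assumes that the accumulator samples the rightmost register of the row and that the leftmost register is fed with 0 while shifting; both are assumptions made for illustration, and `read_out_row` is a hypothetical name.

```python
def read_out_row(correlator_bits):
    """Load one row of correlator outputs, shift them out, and count the ones."""
    n = len(correlator_bits)
    # Control signal high: every shift register loads its own correlator output.
    registers = list(correlator_bits)
    count = 0
    for _ in range(n):
        # The back-end accumulator samples the last register of the chain.
        count += registers[-1]
        # Control signal low: each register takes the data of its left
        # neighbor; the leftmost register receives 0 (assumed).
        registers = [0] + registers[:-1]
    return count


row = [1, 0, 1, 1, 0, 0, 1, 0]
row_count = read_out_row(row)
print(row_count)
```

After as many shift cycles as there are pixels in the row, every correlator output has passed the accumulator exactly once, so the count equals the number of active correlators in that row.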
Fig. 2.9 Structure of the BJT-based retinal processing circuit. Reprinted from [1]
The implementation of the BJT-based retinal processing circuit is depicted in Fig. 2.9. It comprises an isolated photo-BJT used as the photoreceptor, a smoothing photo-BJT with adjustable N-channel MOSFET resistors that forms the retinal smoothing network, a current-input CMOS Schmitt trigger, and two inverters. The base of the smoothing photo-BJT is connected to the photo-BJT Q1’s four nearest neighbors via the N-channel MOSFET resistors Msrd and Msrr, whose resistance is controlled by the gate voltage Vsmooth, forming the smoothing network. The transistors Mins, Mips, Mn and Mini, Mipi, Mp are used to virtually bias the emitters of Q1 and Q2 at Vbias. Mip, Min, and Mc are common to all pixels. The current-input CMOS Schmitt trigger comprises the transistors Ms, Msr, Mi, Mir, Mf1, and Mf2. The voltage Vf is used to adjust the threshold level.
The transistors Mp1, Mp2, Mp3, and Mp4 are used to mirror the emitter current of Q2 to the current-input CMOS Schmitt trigger. The inverters A and B amplify the output of the current-input CMOS Schmitt trigger to VDD or VSS so that the signal is converted from analog to binary.
The modified correlation-based algorithm is adopted in the proposed motion sensor. The outputs of all the correlators along the preferred direction are accumulated over the whole array to determine the correlation output, which is used to calculate the displacement in a sampling period. In the proposed motion sensor, the output of PF is shifted to the four nearest neighbors in the +x, -x, +y, and -y directions so that the displacement along these four directions can be determined. The corresponding correlation outputs are C(+x), C(-x), C(+y), and C(-y), respectively. The correlation output between the unshifted output of PF and the output of CF, defined as C(no), is also determined and used to calculate the displacement. The five correlation outputs are further averaged over 16 sampling periods and are denoted Ca(+x), Ca(-x), Ca(+y), Ca(-y), and Ca(no), respectively. The calculated displacement Δx in the +x or -x direction, normalized to the distance between two adjacent pixels, can be expressed by the averaged correlation outputs as
(2.1)
Similarly, the displacement Δy in the +y or -y direction can be expressed as
(2.2)
The direction of motion can be determined by comparing the amplitude of Ca(+x) with that of Ca(-x) as well as Ca(+y) with that of Ca(-y). The directional angle is given by
If the displacement and direction of motion are calculated without averaging the correlation outputs, the pattern-related deviations are significant. Averaging also helps to keep the expressed displacement between 0 and P. Since each pixel is correlated only with its nearest neighbors, the displacement of the image under test per frame must not exceed P.
Notably, during shifting of the previous image frame, the boundary pixels are fixed at zero for simplicity. Therefore, the correlation at the boundary yields errors when calculating Ca(+x), Ca(-x), Ca(+y), and Ca(-y). The boundary condition, however, does not influence the calculation of Ca(no) since no shifting is introduced in the calculation of Ca(no). To calibrate this boundary condition, two factors b1 and b2 are added to (2.1) and (2.2).
(2.4)
(2.5)
The value of b1 is chosen to be half the number of pixels at one side of the boundary, while b2 is half of b1.
Fig. 2.10 illustrates the architecture of the proposed focal-plane motion sensor, which includes the 32 x 32 pixel array and the peripheral circuits, including five sets of 6-b accumulators for each row, and five sets of 11-b accumulators. The data in the five 11-b accumulators are read out as C(+x), C(-x), C(+y), C(-y), and C(no), respectively.
The average, displacement, and direction are calculated off-chip by software.
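The accumulation of the five correlation outputs and their off-chip averaging can be sketched as follows. The sketch assumes binary frames, a correlator modeled as a logical AND, and the row index increasing in the +y direction; none of these details is specified above, and all function names are hypothetical. The displacement formulas (2.1) and (2.2) are not reproduced here. The last two lines only check that the stated accumulator widths are consistent with a 32 x 32 array.

```python
def shift_frame(frame, dx, dy):
    """Shift a binary frame by (dx, dy); vacated boundary pixels are fixed at zero."""
    rows, cols = len(frame), len(frame[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            sx, sy = x - dx, y - dy
            if 0 <= sx < cols and 0 <= sy < rows:
                out[y][x] = frame[sy][sx]
    return out


SHIFTS = {"+x": (1, 0), "-x": (-1, 0), "+y": (0, 1), "-y": (0, -1), "no": (0, 0)}


def correlation_outputs(cf, pf):
    """Return C(+x), C(-x), C(+y), C(-y), and C(no) for one sampling period."""
    result = {}
    for name, (dx, dy) in SHIFTS.items():
        shifted = shift_frame(pf, dx, dy)
        # Row accumulators first (assumed AND correlator), then the global sum.
        row_sums = [sum(a & b for a, b in zip(cf_row, pf_row))
                    for cf_row, pf_row in zip(cf, shifted)]
        result[name] = sum(row_sums)
    return result


def average_outputs(history):
    """Average the correlation outputs over the sampling periods in history."""
    return {name: sum(h[name] for h in history) / len(history) for name in SHIFTS}


# A vertical bar that moves by one pixel in the +x direction between frames.
pf = [[0, 1, 0, 0] for _ in range(4)]
cf = [[0, 0, 1, 0] for _ in range(4)]
outputs = correlation_outputs(cf, pf)
averaged = average_outputs([outputs] * 16)
print(outputs, averaged)

ROW_BITS = (32).bit_length()         # 6 bits hold a row count of 0..32
TOTAL_BITS = (32 * 32).bit_length()  # 11 bits hold an array count of 0..1024
```

For an all-ones frame the same sketch gives a smaller C(+x) than C(no), which illustrates the boundary error introduced by fixing the boundary pixels at zero during shifting.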
Fig. 2.10 The architecture of this proposed focal-plane motion sensor. Reprinted from [1]
This work presents a real-time CMOS focal-plane motion sensor that uses the BJT-based retinal processing circuit and a modified correlation-based algorithm. The correlation-based algorithm is modified and then applied to calculate the velocity and direction. The BJT-based retinal processing circuit is used to acquire images and enhance contrast. The presented motion sensor greatly reduces the deviation of the calculated displacement and direction for different image patterns by averaging the correlation results over 16 frame-sampling periods.
(2) Analysis and Design of a CMOS Angular Velocity- and Direction-Selective Rotation Sensor With a Retinal Processing Circuit [8]
In this design, the CMOS focal-plane rotation sensor uses the same retinal processing circuit as [1] to preprocess the incident image and the correlation-based algorithm to detect the local motion vectors.
Fig. 2.11 The architecture of the proposed rotation sensor, which consists of 104 pixels forming five concentric circles. Reprinted from [8]
In most focal-plane motion sensors, the pixels are arranged in a regular rectangular pattern, and the correlation is performed between two adjacent pixels to obtain the local motion vectors. This structure, however, has difficulty extracting motion vectors more complex than translation, such as Elementary Flow Components (EFCs). To extract rotation information, the pixels of the proposed rotation sensor are placed in a polar structure to detect the global rotation direction and velocity of rotating images. Fig. 2.11 shows the architecture of the proposed rotation sensor, which consists of 104 pixels forming five concentric circles. Every pixel is correlated with the clockwise and counterclockwise pixels that are 45° apart. The clockwise or counterclockwise correlation results of all the pixels in the same circle are sent to the MLE, which is implemented by a NAND gate, to determine the velocity and direction of the global rotation. Each circle has two MLEs, one for clockwise and one for counterclockwise rotation detection.
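The pairing of each pixel with its partners 45° away can be sketched with simple index arithmetic. The sketch assumes that the pixels of a circle are evenly spaced, that indices increase in the clockwise direction, and that the number of pixels in the circle is a multiple of 8; the per-circle pixel counts are not given above, so the value 16 in the example is purely illustrative, and `partners_45deg` is a hypothetical name.

```python
def partners_45deg(index, pixels_in_circle):
    """Return the (clockwise, counterclockwise) partner indices 45 degrees away."""
    if pixels_in_circle % 8 != 0:
        raise ValueError("45 degrees must be a whole number of pixel steps")
    step = pixels_in_circle // 8  # 360 / 45 = 8 steps per revolution
    clockwise = (index + step) % pixels_in_circle
    counterclockwise = (index - step) % pixels_in_circle
    return clockwise, counterclockwise


first = partners_45deg(0, 16)
last = partners_45deg(15, 16)
print(first, last)
```

The modulo operation closes the circle, so unlike a rectangular array there are no boundary pixels without a partner.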
Fig. 2.12 The pixel structure of the rotation sensor. Reprinted from [8]
Fig. 2.12 shows the structure of a single pixel. Each pixel consists of a retinal processing circuit, two registers, two correlators, and two P-channel MOSFETs, which are parts of the NAND gates that form the MLE. This structure is similar to that of [1] described previously, except for the back-end circuit that processes the outputs of the correlators: the rotation sensor uses the MLE, implemented by a NAND gate, instead of accumulators. Fig. 2.13 illustrates the structure of the MLE, whose fan-in equals the total number of pixels in a circle. The P-channel MOSFETs of the NAND gate, i.e., Mmlecc and Mmlec in Fig. 2.13, are located inside each pixel. The outputs of the correlators in each pixel are sent to the gates of Mmlecc and Mmlec, as shown in Fig. 2.13. The drains of Mmlecc (Mmlec) of all the pixels in the same circle are connected together to the diode-connected N-channel MOSFET Mnload, which acts as the load, to generate the counterclockwise (clockwise) MLE output. The output of the MLE is at logic 0 if the outputs of all the correlators are at logic 1, which means that the local motion vector is detected.
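At the logic level, the MLE behavior described above reduces to a wide NAND over the correlator outputs of one circle. The following is a minimal behavioral sketch that ignores the analog load and timing; `mle_output` and `circle_mle` are hypothetical names.

```python
def mle_output(correlator_outputs):
    """Wide NAND: logic 0 only if every correlator output in the circle is 1."""
    return 0 if all(bit == 1 for bit in correlator_outputs) else 1


def circle_mle(clockwise_bits, counterclockwise_bits):
    """Each circle has one MLE for each rotation direction."""
    return mle_output(clockwise_bits), mle_output(counterclockwise_bits)


# All clockwise correlators fire, while one counterclockwise correlator does not.
cw_mle, ccw_mle = circle_mle([1, 1, 1, 1], [1, 0, 1, 1])
print(cw_mle, ccw_mle)
```

Because a single inactive correlator is enough to keep the output at logic 1, the MLE reports a detection only when the whole circle agrees.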
Fig. 2.13 The structure of the MLE, which is implemented by a NAND gate. Reprinted from [8]
To calculate the velocity of rotation, the number of times that the output of the MLE is at logic 0 is counted over 80 clock cycles, and a parameter R is defined as
$$R = \frac{\text{number of logic 0 of MLE output}}{80} \times 100\ (\%) \qquad (2.6)$$
R reaches its maximum if the image rotates at the selected angular velocity and decreases if the angular velocity deviates from the selected one. The relationship between R and the angular velocity ω depends on the number of pixels in a circle as well as on the number and positions of the edges in the image.
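The parameter R of (2.6) can be computed from a recorded MLE output sequence as in the following sketch. The function name `selectivity_ratio` and the example sequence are hypothetical; only the 80-cycle window and the percentage definition come from the text.

```python
def selectivity_ratio(mle_samples, window=80):
    """Percentage of clock cycles in the window with the MLE output at logic 0."""
    if len(mle_samples) != window:
        raise ValueError("expected exactly one MLE sample per clock cycle")
    zeros = sum(1 for sample in mle_samples if sample == 0)
    return 100.0 * zeros / window


# Illustrative record: the MLE output is at logic 0 in 20 of the 80 cycles.
samples = [0] * 20 + [1] * 60
r_percent = selectivity_ratio(samples)
print(r_percent)
```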
The advantageous characteristics of the proposed rotation sensor are a high dynamic range, real-time image processing, and a wide range of detectable angular velocities. The proposed rotation sensor is suitable for applications such as real-time, remote detection of the rotation of automobile engines and wheels, motors, and microscopic rotating images.
(3) A Low-Photocurrent CMOS Retinal Focal-Plane Sensor With a Pseudo-BJT Smoothing Network and an Adaptive Current Schmitt Trigger for Scanner Applications [14]
Fig. 2.14 Pixel structure of the proposed retinal focal-plane sensor circuit. Reprinted from [14]
The BJT-based retinal sensor chip mimics some of the functions of the cells in the outer plexiform layer of the real retina. Like the real retina, the retinal sensor chip offers high noise immunity, edge enhancement, and a high dynamic range. The BJT-based retinal structure has also proven to be very compact and suitable for VLSI implementation. However, the parasitic p+-n-well-p-substrate BJTs used in the BJT-based retinal sensor have a smaller current gain when the N-well CMOS technology is scaled down to 0.25 µm or below, and the chip area of the parasitic BJT is large, as described in Section 2.1. To solve these problems, a new circuit structure called the pseudo-BJT (PBJT) was developed [12]. Incorporating the pseudo-BJT along with an adaptive current Schmitt trigger not only solves the problems of the parasitic BJT but also brings the advantages of low operational photocurrent levels (pA) and robust noise immunity. In addition, the proposed retinal sensor operates in the subthreshold region. Therefore, the circuit consumes little power in the nonlighting mode. The proposed structure enhances noise immunity and eliminates disturbances.
In the proposed pixel structure of the retinal focal-plane sensor, shown in Fig. 2.14, an isolated PNP pseudo-BJT is used as the photoreceptor, and a smoothing NPN pseudo-BJT with adjustable N-channel MOSFET resistors forms the retinal smoothing network; an adaptive current Schmitt trigger and an inverter are also included. The transistors Mp1 and Mp2 and a photodiode form the PNP pseudo-BJT. The transistors Mn1 and Mn2 and a photodiode form the NPN pseudo-BJT, with four adjustable N-channel MOS resistors Ms1-Ms4 as the smoothing network. The smoothing network is connected to the four neighbors, and the resistance of the four MOS resistors is controlled by the gate voltage Vsmooth (VF). The adaptive current Schmitt trigger is composed of Mp1, Mp2, Mn1, and Mn2, together with the hysteresis-level adjustment transistors Mpf1, Mpf2, Mnf1, and Mnf2. The current Iiso is generated by the PNP pseudo-BJT, whereas the current Ismt is generated by the NPN pseudo-BJT. The adaptive current Schmitt trigger operates as follows. If the current Ismt is initially larger than the current Iiso, Vout (Retina_out) goes high and turns on the MOS transistor Mnf2 to draw the current ∆I/2. By the same token, if the current Ismt is initially smaller than the current Iiso, Vout (Retina_out) stays low and turns on the MOS transistor Mpf2 to sink the current ∆I/2. Therefore, there is a current hysteresis of ∆I. If the induced photocurrent is larger, the currents of the transistors Mn1 and Mp1 also become larger. Through the current mirrors Mn1-Mnf1 and Mp1-Mpf1, the currents of Mnf1 and Mpf1 are adjusted by the induced photocurrent. Hence, the proposed circuit adjusts the current adaptively without an external control voltage. The transistors Mp and Mn form an inverter that amplifies the output of the adaptive current Schmitt trigger to VDD or GND, so that the signal is converted from analog to binary.
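The effect of the ∆I hysteresis on the binary output can be sketched with a behavioral model. The sketch assumes that the output switches high when Ismt exceeds Iiso by more than ∆I/2 and switches low when Ismt falls below Iiso by more than ∆I/2, and it treats ∆I as a fixed number, whereas in the circuit ∆I follows the induced photocurrent. The function name `schmitt_trigger`, the switching thresholds, and the current values (in arbitrary units) are assumptions made for illustration only.

```python
def schmitt_trigger(i_smt, i_iso, delta_i, state=0):
    """Return the binary output sequence of a comparator with delta_i hysteresis.

    i_smt and i_iso are equally long sequences of currents (arbitrary units).
    """
    outputs = []
    for smt, iso in zip(i_smt, i_iso):
        diff = smt - iso
        if state == 0 and diff > delta_i / 2:
            state = 1  # Ismt clearly above Iiso: output goes high
        elif state == 1 and diff < -delta_i / 2:
            state = 0  # Ismt clearly below Iiso: output goes low
        outputs.append(state)
    return outputs


i_iso = [100] * 7
i_smt = [100, 104, 106, 102, 98, 96, 94]
binary = schmitt_trigger(i_smt, i_iso, delta_i=10)
print(binary)
```

Small fluctuations of Ismt around Iiso that stay within ±∆I/2 leave the output unchanged, which is how the hysteresis contributes to noise immunity.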
CHAPTER 3
CIRCUIT DESIGN AND SIMULATION RESULTS
3.1 DESIGN CONSIDERATION
As mentioned in Section 1.3.1, some neurons in the medial superior temporal (MST) area of the monkey visual cortex are selective to rotation, expansion/contraction, combinations of these stimuli, and translation in a given direction. It has also been observed that another group of neurons, in the fundus of the superior temporal sulcus (FST) area of the visual cortex, is sensitive to shear. When encountering complex non-local motions, the neurons in both areas integrate the translation-motion estimates from the middle temporal (MT) area to detect motions such as global translation, expansion, rotation, and shear.
In conventional focal-plane motion sensors, however, the pixels are arranged in a regular rectangular pattern, and the correlation is performed with the four immediate neighbors to obtain the local motion vectors. These sensors are designed for translation [1] and are not suitable for EFC detection because of their special motion model. To fit the shear motion model, the pixels of the proposed shear sensor are placed along the shear-motion paths to ensure the detection of shear motion. Every pixel is correlated with its neighboring pixels in the shear and reversed-shear directions along the same path, as shown in the next section. Based on the motion computation method of the modified correlation algorithm [1], shear motion can be detected and other motions rejected. Displacement and velocity are also determined by summing all the correlation outputs.
In the proposed shear sensor, image acquisition and preprocessing are performed by the pseudo-BJT-based retinal processing circuit with an adaptive current-input Schmitt trigger [14]. This circuit has the advantages of the pseudo-BJT as well as robust noise immunity, high dynamic range, and contrast enhancement.
3.2 CORE CIRCUIT REALIZATION
Fig. 3.1 The pixel structure of the shear sensor
Fig. 3.1 shows the structure of a single pixel. Each pixel consists of a retinal processing circuit, two registers, and three correlators. As shown in Fig. 3.1, after preprocessing by the retinal processing circuit, the register CF samples and stores the