hCNN-based Retinex Model - Implementing Existed CNN-based Application on hCNN

3 Hexagonal-type Cellular Neural Network Model for Preliminary Visual Processing

3.5 Cellular Neural Networks for Hexagonal Image Processing

3.5.4 Implementing Existed CNN-based Application on hCNN

$\mathrm{hCNN}_{\mathrm{Gabor},\,\omega_{x_0},\,\omega_{y_0}}\big(u(k)\big)$. (3-55)

3.5.4.4 hCNN-based Retinex Model

Light intensities in an image carry information about the reflectance properties of the surfaces from which the light was reflected. However, the illumination conditions can strongly affect the captured image. Several methods have therefore been proposed to estimate and separate the effect of illumination, such as contrast stretching and histogram equalization. In 1985, an effective method named Retinex theory was proposed to attack this problem [41, 42]. It was inspired by the receptive field structures studied in neurophysiology. Consequently, Retinex theory is applicable not only to gray-scale image processing but also to color images, where it has produced many good results. The first Retinex algorithm for lightness computation was proposed by Land and McCann [43]. It is based on a model of the lightness and color perception of the human vision system [41] and compensates for illumination effects in images. The primary goal is to decompose a given image $S$ into two different images, the reflectance image $R$ and the illumination image $L$, such that, at each point $(x, y)$ in the image domain, $S(x, y) = R(x, y) \cdot L(x, y)$. The benefits of this decomposition include the possibility of removing the illumination effects of back/front lighting and of enhancing shots that include spatially varying illumination, such as images that contain both indoor and outdoor zones [41].
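The multiplicative model becomes additive in the log domain, $\log S = \log R + \log L$, which is what makes a separation tractable. The following is a minimal sketch of that idea and not the method of [41-44]: it assumes that the illumination varies slowly and estimates $\log L$ with a Gaussian blur of $\log S$. The function names, the blur width `sigma`, and the offset `eps` are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with edge replication."""
    k = gaussian_kernel1d(sigma)
    r = len(k) // 2
    padded = np.pad(img, r, mode="edge")
    # Filter along rows, then along columns; 'valid' removes the padding again.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def retinex_decompose(S, sigma=8.0, eps=1e-6):
    """Split S into (R, L) with R * L == S + eps (assumption: L is smooth)."""
    log_S = np.log(S + eps)
    log_L = blur(log_S, sigma)   # slowly varying part -> illumination estimate
    log_R = log_S - log_L        # remainder -> reflectance estimate
    return np.exp(log_R), np.exp(log_L)
```

By construction the product of the two returned images reproduces the input (up to `eps`); how well they match the true reflectance and illumination depends entirely on the smoothness assumption.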

Zarándy proposed a method to implement Horn’s Retinex model based on CNN [44] using the

Although Horn suggested a hexagonal structure for the 2D Retinex model, this algorithm implemented Horn’s model on a rectangular-type arrangement. Because hexagonal image processing offers several advantages, notably a more compact structure, it is desirable to implement the 2D Retinex model on hCNN. The proposed hCNN-based 2D Retinex model is designed as follows. As Zarándy mentioned, the CNN-based Retinex can be separated into two steps (see Figure 3-12). The first step processes the input image using the hCNN with the following templates:

Then a threshold is applied using the hCNN with the following templates:

1 1

According to Horn, the most suitable sampling scheme for the Retinex model is the hexagonal one. Hence the proposed hCNN can implement Horn’s Retinex model quite well, as demonstrated in Figure 3-12.
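The structure in Figure 3-12 (Laplace operator, threshold, inverse Laplace operation) can be sketched on a hexagonal lattice as follows. This is an illustration under stated assumptions, not the templates of this section: it assumes a plain six-neighbour Laplacian stored in a 3x3 window with the upper-left and lower-right entries disconnected (as in Figure 3-5), a zero boundary, a hypothetical threshold `theta`, and a Jacobi iteration standing in for the feedback loop.

```python
import numpy as np

# Six-neighbour weights in a 3x3 window; the upper-left and
# lower-right entries are disconnected, as in Figure 3-5.
HEX_NEIGHBOURS = np.array([[0, 1, 1],
                           [1, 0, 1],
                           [1, 1, 0]], dtype=float)

def neighbour_sum(u):
    """Sum of the six hexagonal neighbours, with zeros outside the array."""
    p = np.pad(u, 1)
    h, w = u.shape
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            out += HEX_NEIGHBOURS[di, dj] * p[di:di + h, dj:dj + w]
    return out

def hex_laplacian(u):
    """Six-neighbour discrete Laplacian (unscaled)."""
    return neighbour_sum(u) - 6.0 * u

def inverse_hex_laplacian(d, iterations=500):
    """Jacobi iteration solving hex_laplacian(u) = d with a zero boundary."""
    u = np.zeros_like(d, dtype=float)
    for _ in range(iterations):
        u = (neighbour_sum(u) - d) / 6.0
    return u

def horn_retinex(image, theta=0.05, eps=1e-6):
    """Log-reflectance estimate: Laplacian, threshold, inverse Laplacian."""
    log_s = np.log(image + eps)
    d = hex_laplacian(log_s)
    d[np.abs(d) < theta] = 0.0        # drop small responses (slow illumination changes)
    return inverse_hex_laplacian(d)
```

With `theta = 0` the last two steps cancel and the log-image is recovered, which is a convenient sanity check. The zero boundary produces strong responses at the image border, so a practical implementation would need a more careful boundary treatment.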

3.6 Experiments

In this section, we use some examples to show the advantages of the hexagonal-type methods and the proposed hexagonal-type CNN. First, we resample an input image into both a rectangular-type image and a hexagonal-type image, and then compare the Peak Signal-to-Noise Ratio (PSNR) between the input and output images. A 1024x1024 gray-level Lenna image (Figure 3-14) was chosen as the input image. The PSNR is computed from the squared differences between corresponding pixels of the images. The PSNR of the hexagonal-type image is 24.7096, and the PSNR of the rectangular-type image is 24.1634. The hexagonal-type sampling method therefore yields a PSNR about 0.55 higher, that is, it preserves the input image slightly better.
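A minimal sketch of the PSNR computation, assuming the standard definition $\mathrm{PSNR} = 10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$ with a peak value of 255 for 8-bit gray-level images. The function name `psnr` and the `peak` parameter are illustrative.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((reference - test) ** 2)   # mean squared error
    if mse == 0:
        return float("inf")                  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Both resampled images must first be brought back to the pixel grid of the reference image before the comparison; that resampling step is not shown here.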

The comparison between the frequency response of the rectangular-type image and hexagonal-type image has also been analyzed. The testing target images are shown in Figure 3-14.

Figure 3-15 shows the results of the comparison, where Figure 3-15(a) is the error of the frequency response in a specific direction between the original image and the rectangular-type image, and Figure 3-15(b) is the error between the original and hexagonal-type images. The error for the hexagonal-type image is clearly smaller, especially at higher frequencies.

In the experiments, we implemented hCNN in MATLAB so that the results can be compared with those of MatCNN, the well-known MATLAB toolbox for CNN. We implemented a few image processing functions on both CNN and hCNN for comparison. Figure 3-16 shows the results, where Figure 3-16(a) and (b) are the input images for the rectangular-type CNN and the hexagonal-type CNN, respectively, and Figure 3-16(c) and (d) are the corresponding output images of the CONTOUR_DETECTION function. The templates for the rectangular-type CNN are as follows

0 0 0 1 1 1

and the templates for the hexagonal-type CNN are as follows

0 0 1 1

Figure 3-16(e), (f) are the output images processed by the rectangular-type and hexagonal-type CNNs, respectively, which performed the “SMOOTHING with BINARY OUTPUT” function. The templates for the rectangular-type CNN are as follows

0 1 0 0 0 0

and the templates for the hexagonal-type CNN are as follows

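$$A = \begin{pmatrix} & 1 & & 1 & \\ 1 & & 2 & & 1 \\ & 1 & & 1 & \end{pmatrix}, \quad B = \begin{pmatrix} & 0 & & 0 & \\ 0 & & 0 & & 0 \\ & 0 & & 0 & \end{pmatrix}, \quad I = 0, \qquad (3\text{-}63)$$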

and Figure 3-16(g), (h) present the BINARY_THRESHOLDING function. The templates for the rectangular-type CNN are as follows

and the templates for the hexagonal-type CNN are as follows

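$$A = \begin{pmatrix} & 0 & & 0 & \\ 0 & & 2 & & 0 \\ & 0 & & 0 & \end{pmatrix}, \quad B = \begin{pmatrix} & 0 & & 0 & \\ 0 & & 0 & & 0 \\ & 0 & & 0 & \end{pmatrix}, \quad I = 0. \qquad (3\text{-}65)$$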
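Why the hexagonal template (3-65) acts as a threshold can be seen from a short worked derivation. It assumes the standard CNN cell equation $\dot{x} = -x + \sum A\,y + \sum B\,u + I$ with the piecewise-linear output $y = f(x)$, which is not restated in this section, and it assumes that the image is loaded as the initial state $x(0)$.

```latex
% Only the centre entry of A is non-zero (a_00 = 2), B = 0 and I = 0,
% so every cell evolves independently of its six neighbours.
\begin{align*}
f(x) &= \tfrac{1}{2}\big(|x+1| - |x-1|\big), \\
\dot{x} &= -x + 2 f(x), \\
|x| < 1: \quad \dot{x} &= x \;\Rightarrow\; x(t) = x(0)\,e^{t}, \\
x \ge 1: \quad \dot{x} &= 2 - x \;\Rightarrow\; x \to 2,\quad y \to +1, \\
x \le -1: \quad \dot{x} &= -2 - x \;\Rightarrow\; x \to -2,\quad y \to -1.
\end{align*}
```

Under these assumptions any non-zero initial state is pushed away from zero until the output saturates, so the steady-state output is $\operatorname{sgn}(x(0))$, a threshold at zero.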

We also implemented a well-known CNN function, “GLOBAL CONNECTIVITY DETECTION”.

The templates for the rectangular-type CNN are as follows

0 0.5 0 0 0.5 0

and the templates for the hexagonal-type CNN are as follows

0.5 0.5 0.5 0.5

Figure 3-17 shows the examples and results of this function, where Figure 3-17(a) is the input image of general CNN, Figure 3-17(b) is the input image of hCNN, Figure 3-17(c) is the initial state of general CNN, Figure 3-17(d) is the initial state of hCNN, Figure 3-17(e) is the output image of general CNN, and Figure 3-17(f) is the output image of hCNN.
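The hexagonal templates of this section, for example the smoothing template (3-63), can be simulated with a few lines of code. The sketch below is not the MATLAB implementation used in the experiments. It assumes the standard CNN state equation $\dot{x} = -x + A \ast y + B \ast u + I$ with a piecewise-linear output, embeds each 2-3-2 hexagonal template in a 3x3 window whose upper-left and lower-right entries are disconnected (Figure 3-5), and integrates with forward Euler. The step size, the number of steps, and the zero boundary are illustrative choices.

```python
import numpy as np

def embed_hex(top, middle, bottom):
    """Place a 2-3-2 hexagonal template into a 3x3 array.
    The upper-left and lower-right corners stay disconnected (zero)."""
    return np.array([[0.0, top[0], top[1]],
                     [middle[0], middle[1], middle[2]],
                     [bottom[0], bottom[1], 0.0]])

def apply_template(t, img):
    """Weighted neighbourhood sum with a zero boundary."""
    p = np.pad(img, 1)
    h, w = img.shape
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            out += t[di, dj] * p[di:di + h, dj:dj + w]
    return out

def output(x):
    """Piecewise-linear CNN output, saturating at -1 and +1."""
    return 0.5 * (np.abs(x + 1) - np.abs(x - 1))

def run_hcnn(u, x0, A, B, I, dt=0.05, steps=400):
    """Forward-Euler integration of the cell states; returns the final output."""
    x = x0.astype(float).copy()
    bu = apply_template(B, u) + I      # constant feedforward term
    for _ in range(steps):
        x += dt * (-x + apply_template(A, output(x)) + bu)
    return output(x)

# Templates of Eq. (3-63): smoothing with binary output.
A = embed_hex((1, 1), (1, 2, 1), (1, 1))
B = embed_hex((0, 0), (0, 0, 0), (0, 0))
```

For instance, calling `run_hcnn(u, u, A, B, 0.0)` on an image `u` that is +1 everywhere except for a single -1 pixel drives that isolated pixel to the value of its six neighbours, which is the expected smoothing behaviour.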

3.7 Discussions

In this section, we proposed the hexagonal-type CNN (hCNN) for Hexagonal Image Processing (HIP). HIP offers many advantages; however, some disadvantages remain. For example, analyzing and computing the processing procedure in HIP is still difficult. We introduced a CNN-based approach to overcome those disadvantages, and gave several examples to demonstrate how the resulting hCNN works in Hexagonal Image Processing. In hCNN, each cell connects to only six neighboring cells rather than eight. Consequently, the integrated-circuit design of CNN-based image processors can be simplified. From another point of view, one of the most important features of CNN is nonlinear template design. We hope that hCNN will prove useful for designing nonlinear processing with Cellular Neural Networks in Hexagonal Image Processing.

Figure 3-1. Distribution of cones on the mammalian retina.

Figure 3-2. Comparison of different CNN layouts, where (a) is the 8-connected rectangular-type CNN, (b) is the 4-connected rectangular-type CNN, and (c) is the 6-connected hexagonal-type CNN.

Figure 3-3. The conflict of connections in the diagonal direction of the 8-connected CNN.

Figure 3-4. Hexagonal-type image sampling and indexing scheme.

Figure 3-5. An example of a hexagonal-type CNN implementation based on a rectangular-type CNN. For each cell, the synapses in the upper-left and lower-right directions have been disconnected.

Figure 3-6. The sampling method of the proposed model.

Figure 3-7. The geometrical relationship among pixels, where the error of distance between different directions is about 11.8%.

Figure 3-8. The sampling pixels used in this section, where (a) is the rectangular-type pixel, and (b) is the hexagonal-type pixel.

Figure 3-9. Comparison of an artificial image constructed from square-shaped pixels with one constructed from hexagonal-shaped pixels.

Figure 3-10. Examples of the different sampling methods, where (a) is the rectangular-type image, (b) is the hexagonal-type image, (c) and (e) show details of (a), and (d) and (f) show details of (b).

Figure 3-11. The processing result of hCNN-based Gabor-type filtering, where (a) is the input hexagonal-type image and (b) is the output. Note that the resolution of the images has been reduced so that the hexagonal pixels are visible.

Figure 3-12. The schematic of Horn’s model using summing and threshold elements. The feedforward structure calculates the Laplace operator, while the feedback loop structure calculates the inverse Laplace operation. Note that, for the sake of clarity, not all the feedback and feedforward interconnections are indicated. This figure has been obtained from [45].

Figure 3-13. The processing result of the hCNN-based Retinex model, where (a) is the input hexagonal-type image, and (b) is the output.

Figure 3-14. Examples of the different sampling methods, where (a) is the original image, (b) is the image sampled by the rectangular-type method, and (c) is the image sampled by the hexagonal-type method.

Figure 3-15. The results of the comparison, where (a) is the error between the original image and the rectangular-type image, and (b) is the error between the original and hexagonal-type images. The error for the hexagonal-type image is clearly smaller.

Figure 3-16. Experimental results, where (a) and (b) are the input images for the rectangular-type CNN and the hexagonal-type CNN, respectively; (c) and (d) are the corresponding output images of the CONTOUR_DETECTION function; (e) and (f) are the corresponding output images of the SMOOTHING_with_BINARY_OUTPUT function; and (g) and (h) are the corresponding output images of the BINARY_THRESHOLDING function.

Figure 3-17. The results of GLOBAL_CONNECTIVITY_DETECTION, where (a) is the input image of the general CNN, (b) is the input image of hCNN, (c) is the initial state of the general CNN, (d) is the initial state of hCNN, (e) is the output image of the general CNN, and (f) is the output image of hCNN.

4 CNN-BASED COMPUTER FOVEA MODEL

To begin, the biological structure of the retina must be understood. Five major types of neurons exist in the five layers of the retina. The outer nuclear layer contains the photoreceptors; the inner nuclear layer contains the horizontal cells, amacrine cells and bipolar cells; and the ganglion layer contains the ganglions. Moreover, the outer plexiform layer contains the synaptic connections among the photoreceptors, the horizontal cells and the bipolar cells. Finally, the inner plexiform layer contains the synaptic connections among the bipolar cells, the amacrine cells and the ganglions [1, 46].

The human vision system (HVS) contains four kinds of photoreceptors: L-cone, M-cone, S-cone and rod cells. These different types of photoreceptors react to different wavelengths of light (Figure 4-1). Rod cells sense light intensity, while L-cone, M-cone and S-cone cells detect color information. Sometimes the cooperation of these cone cells can also detect the light intensity. The ganglion is usually activated by a set of photoreceptors via bipolar, horizontal, and other types of cells [2, 47]. This set contains several varieties of photoreceptors, and the differences among those types of photoreceptors result in the variation among the ganglions. The two main types of ganglions are center-on/surround-off and center-off/surround-on. Figure 4-2 illustrates examples, where Figure 4-2(a) shows a center-surround ganglion: the center of a group of photoreceptors reacts to stimulation differently from the peripheral part of the group. The ganglions can generally be classified as red-green (RG), blue-yellow (BY) and black-white (BW) ganglions (see Figure 4-2(b) and (c)), among others [1].

The ganglion is important in the human vision system and processes most visual information, which includes light intensity and color information. The mechanism that processes this visual information in the human vision system is the so-called early vision system, also known as the pre-attentive vision system. The early vision system represents the set of first-stage information processing mechanisms of visual processing. Those mechanisms operate in parallel across the visual field, and are believed to be used for detecting certain fundamental visual features [1].

According to Jain and Farrokhnia, the human vision system possesses two fundamental features. First, in some respects the retina acts like a low-pass filter. Generally, the result of the low-pass filtering represents the average light intensity of a specific local area; this result is termed “the first order feature.” Second, a difference exists between the intensity of the external light and that projected onto the retina. According to some studies, the boundary detection operation in the human vision system is based on this kind of feature and is related to the zero crossings of a Laplacian of Gaussian (LoG). The result of this operation is sometimes called “the second order feature” [48].
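The two features can be illustrated on a one-dimensional step edge. This is a minimal sketch and not the model of [48]: it assumes a Gaussian low-pass filter for the first order feature and a discrete second difference of the smoothed signal as a stand-in for the LoG response. All function names and the value of `sigma` are illustrative.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at 4 sigma."""
    r = int(4 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def first_order(signal, sigma=2.0):
    """Local average intensity (low-pass filtering)."""
    k = gaussian_kernel(sigma)
    padded = np.pad(signal, len(k) // 2, mode="edge")
    return np.convolve(padded, k, mode="valid")

def second_order(signal, sigma=2.0):
    """Second difference of the smoothed signal (a 1-D LoG-like response)."""
    s = first_order(signal, sigma)
    lap = np.zeros_like(s)
    lap[1:-1] = s[2:] - 2.0 * s[1:-1] + s[:-2]
    return lap

def zero_crossings(v, tol=1e-9):
    """Indices i where v changes sign between i and i + 1 (ignoring noise below tol)."""
    a, b = v[:-1], v[1:]
    significant = np.minimum(np.abs(a), np.abs(b)) > tol
    return np.where((a * b < 0) & significant)[0]
```

For a signal that jumps from 0 to 1, the second order response is positive just before the jump and negative just after it, so its single zero crossing marks the boundary.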

The mechanisms of information processing in the retina are rather complex and remain unclear. This study thus mainly aims to design and approximate the Receptive Fields (RFs) of the cells in the retinal fovea. From the signal processing perspective, an RF can be regarded as the finite impulse response of a spatial filter [4, 47, 49]. In this investigation, the well-known Cellular Neural Network (CNN) is used to realize the spatial filter.

The CNN is a parallel computing architecture in which, as in a biological system, each cell communicates only with its immediate neighborhood. Consequently, using the CNN to implement the proposed model makes sense. According to previous studies, hexagonal image processing is more suitable for image processing, particularly for bio-inspired models (refer to Figure 4-1) [12, 15, 20, 27, 45]. Thus, this study suggests a special type of CNN, the hexagonal-type Cellular Neural Network (hCNN). Furthermore, hCNN provides a means of reducing implementation problems without increasing the complexity [12].

Based on the biological investigations and the abilities of the CNN, this work proposes an hCNN-based computer fovea model. The proposed model simulates certain biological mechanisms of the retina and the fovea, including the photoreceptors, bipolar cells, horizontal cells, ganglions, and the cooperation of those cells. Consequently, some properties of the human vision system can be simulated. Some of those properties can even be used to enhance visual information, and this study also presents how they provide visual enhancement.

This study first briefly introduces the hCNN, then discusses the stable central linear system for the hCNN and develops its implementations. Those implementations, which are required for the proposed model, include CNN-based Laplace-like operators, CNN-based Gaussian-like operators and their inverse operators. Subsequently, the CNN-based computer fovea model is introduced. Building on the above, several experiments are presented, including vision enhancing algorithms based on this model. Finally, conclusions are drawn.

4.1 Modeling the Biological Structures of the Cells in the Fovea and the Retina

Since the retina is a highly structured network of neurons, it has become a valuable research topic in the field of Human Computer Interaction (HCI). Many studies have examined the structure of the retina and the biological evidence regarding the functions of the human vision system [50, 51]. Notably, some researchers have even analyzed visual information processing in the retina [4, 47]. Building on these previous works, this study constructs a new computer fovea model.

The retinal fovea is located at the center of the retina and is the region with the highest visual acuity; it is considered the most important area of the retina. The fovea is a rod-free area 0.2 to 0.4 mm in diameter with very thin, densely packed photoreceptors. The photoreceptors in the fovea are arranged in a roughly hexagonal pattern (see Figure 4-1(a)) [3], and the average cone spacing (csp) has been estimated at around 2.5 to 2.8 μm. The retinal fovea is directed towards the desired object of study and contains almost exclusively high-density cones.
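The geometry described above can be made concrete with a small sketch. It assumes an ideal hexagonal lattice with a nearest-neighbour spacing of exactly one csp, whereas the real arrangement is only roughly hexagonal; the function names are illustrative.

```python
import numpy as np

def hex_lattice(rows, cols, csp=2.5):
    """Centres of an ideal hexagonal lattice; csp is the spacing in micrometres."""
    pts = []
    for r in range(rows):
        for c in range(cols):
            x = csp * (c + 0.5 * (r % 2))        # odd rows are shifted by half a spacing
            y = csp * (np.sqrt(3) / 2.0) * r     # row pitch of a hexagonal lattice
            pts.append((x, y))
    return np.array(pts)

def neighbour_count(pts, index, csp=2.5, tol=1e-6):
    """Number of lattice points at distance csp from pts[index]."""
    d = np.linalg.norm(pts - pts[index], axis=1)
    return int(np.sum(np.abs(d - csp) < tol))

def cone_density_per_mm2(csp_um):
    """Density of an ideal hexagonal lattice: each point occupies (sqrt(3)/2) * csp^2."""
    return 2.0 / (np.sqrt(3) * csp_um**2) * 1e6
```

In such a lattice every interior point has exactly six equidistant neighbours, which is the connectivity that the hCNN cells reproduce.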

Figure 4-5 shows the proposed hCNN-based computer fovea model, where Figure 4-5(a) illustrates the top view of the computer fovea. The computer fovea is constructed from a set of hexagonally arranged photoreceptors (see Figure 4-5(b)). Figure 4-5(c) shows the signal processing system of the cells in the fovea model. Thiem noted that, in the fovea, there are direct synaptic connections between the bipolar cell and the ganglion, with only minor influence from the amacrine cell; hence, he suggested neglecting the bipolar cell and the amacrine cell [5]. However, the horizontal cells are known to exist and to connect directly to the bipolar cells [47]. Thus, this study suggests keeping the bipolar cell and the ganglion in the system, and considering the ganglion as a direct synaptic connection.

A simplified version of the proposed model is required to obtain the parameters of the proposed architecture. The simplified version ignores the differences between the L-cone, the M-cone, and the S-cone cells. Restated, the system considered for obtaining the parameters is assumed to be monochromatic. Thus, $x_{R'}$ equals $x'_{R}$. The following sections discuss behavioral variation among the cone cells.

4.1.1 Ganglion

The ganglions of the mammalian retina have been classified into X-, Y-, and W-types based on the spatiotemporal distribution of excitation and inhibition. The cells are classified according to their latency in response to optic chiasm stimulation [47]. The X-type ganglions behave linearly in both the space and time domains and provide the best spatial resolution, but most ganglions are considered direct synaptic connections. Consequently, in this study, the ganglion is treated as an amplifier with a bias $g$.

Based on the above, $h_G(k)$ can take the form

Some physiological experiments indicated that the RF of the ganglion exhibits a center/surround characteristic. Furthermore, Thiem stated that the RF of ganglions can be modeled as follows [5]

$$h_G(k) = \Delta\big(g_{\sigma_G}(k)\big), \qquad (4\text{-}2)$$

where $\Delta(\cdot)$ represents the Laplace operator, and $g_{\sigma_G}(\cdot)$ is a Gaussian function. According to Hubel, under the optimum lighting condition, the central part of the RF is about 10 μm (4 csp) [1]. Thus, Thiem also recommends a standard deviation of $\sigma_G = 5\,\mu\mathrm{m} = 2$ (csp) [5].
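The center/surround shape of Eq.(4-2) follows directly from applying the Laplace operator to a Gaussian. The worked step below assumes an isotropic, normalized two-dimensional Gaussian, a normalization that is not spelled out above.

```latex
\begin{align*}
g_{\sigma}(k) &= \frac{1}{2\pi\sigma^{2}}
                 \exp\!\left(-\frac{\lVert k\rVert^{2}}{2\sigma^{2}}\right), \\
\Delta g_{\sigma}(k) &= \frac{\lVert k\rVert^{2} - 2\sigma^{2}}{\sigma^{4}}\; g_{\sigma}(k), \\
\Delta g_{\sigma}(k) = 0 &\iff \lVert k\rVert = \sqrt{2}\,\sigma .
\end{align*}
```

Under this assumption the response has one sign inside the circle of radius $\sqrt{2}\,\sigma$ and the opposite sign outside it. For $\sigma_G = 2$ (csp) the sign change occurs at a radius of about 2.83 csp.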

A combination of the CNN-based Laplace-like operator and the CNN-based Gaussian-like operator can be employed to derive $h_G(k)$. That is

$$h_G(k) = \mathrm{CNN}_{\mathrm{Laplace},\,\varepsilon}\big(\mathrm{CNN}_{\mathrm{Gaussian},\,\lambda_G}(k)\big), \qquad (4\text{-}3)$$

where $\varepsilon$ can be any very small positive value but not zero, and $\lambda_G$ can be obtained by performing a Genetic Algorithm (GA). Based on the experiments performed here, it was concluded that if $\sigma_G$ equals 2 and $g = 1$, then $\lambda_G$ is approximately 0.567700 (see Table 4-1).

4.1.2 Photoreceptor

The impulse response of a photoreceptor can be represented as a Difference of two Gaussians (DoG). In the retinal fovea, the DoG can in most cases be described as a single Gaussian function, as shown below [4, 5, 47]

As mentioned in [47], the parameter $\sigma_R$ represents the standard deviation, with a range from 1.5 to 12 (csp).

In the proposed approach, a CNN-based Gaussian-like operator is used to simulate the Gaussian function. That is

$$\mathrm{CNN}_{\mathrm{Gaussian},\,\lambda_R}\big(\delta(k)\big) \approx g_{\sigma_R}(k), \qquad (4\text{-}5)$$

where $\lambda_R$ indicates the diffusion level of the Gaussian-like function. The next question is how to obtain the parameter $\lambda_R$ so that the final state of the CNN approximates that shown in Eq.(4-4).

To obtain the corresponding value of $\lambda_R$, the Genetic Algorithm (GA) is used again. Based on the results of the experiments, it was concluded that if $\sigma_R$ equals 1.5, then $\lambda_R$ is approximately 0.536040, and if $\sigma_R$ equals 12, then $\lambda_R$ is approximately 0.071233 (see Table 4-1).
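The parameter search can be sketched as a small real-coded GA. The sketch is illustrative only: the CNN-based Gaussian-like operator is not reproduced here, so a hypothetical one-parameter kernel `kernel(lam, r)` stands in for it, and the population size, mutation width, and bounds are arbitrary choices. The values found by this sketch therefore do not correspond to those in Table 4-1.

```python
import numpy as np

def kernel(lam, r):
    """Hypothetical one-parameter stand-in for the Gaussian-like operator."""
    k = np.exp(-lam * r**2)
    return k / k.sum()

def target(sigma, r):
    """Normalized Gaussian with standard deviation sigma."""
    g = np.exp(-r**2 / (2.0 * sigma**2))
    return g / g.sum()

def fit_lambda(sigma, bounds=(0.01, 2.0), pop_size=30, generations=60, seed=0):
    """Search for the lam whose kernel best matches the Gaussian target."""
    rng = np.random.default_rng(seed)
    r = np.arange(-20, 21, dtype=float)
    goal = target(sigma, r)

    def error(lam):
        return float(np.mean((kernel(lam, r) - goal) ** 2))

    pop = rng.uniform(bounds[0], bounds[1], pop_size)
    history = []
    for _ in range(generations):
        pop = np.array(sorted(pop, key=error))            # best individual first
        history.append(error(pop[0]))
        parents = pop[: pop_size // 2]                    # selection
        mates = rng.permutation(parents)
        children = 0.5 * (parents + mates)                # crossover (averaging)
        children += rng.normal(0.0, 0.05, children.size)  # mutation
        children = np.clip(children, bounds[0], bounds[1])
        pop = np.concatenate([parents, children])         # parents survive (elitism)
    best = min(pop, key=error)
    return float(best), error(best), history
```

Because the parents always survive, the best error never increases from one generation to the next. For the stand-in kernel the exact optimum is $\lambda = 1/(2\sigma^2)$, which gives a simple way to check the search.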

Clearly, the RF of the photoreceptor is used to determine the average intensity of a specific area of a visual signal. Generally, it acts like a low-pass filter. The output of the filter, described in Eq.(4-5), is known as the first order feature.

4.1.3 Horizontal Cell

Eq.(4-1) implies that a horizontal cell can be implemented by the following equation

( )

( ( ) ( ) )

Above, the system is assumed to be monochromatic. This means that the inputs of the bipolar and horizontal cells derive from the same set of photoreceptors. If the inputs of the two types of cells do not derive from the same set of photoreceptors, then the RF of the horizontal cell needs to be known. Assume the input of a horizontal cell derives from the photoreceptor system $h_{R'}(k)$ illustrated in Figure 4-5(c); then, based on Eq.(4-6), $h_{R'H}(k)$ can be

The horizontal cell determines the difference between the output signals of the photoreceptors in the center and in the surroundings of an RF. This kind of feature is termed the second order feature.

Meanwhile, the final feature involves the determination of the parameters b and g . From a biological perspective, value b represents the related weights of the input signals of the horizontal
