Chapter 1 Introduction
1.2 Outline
This thesis is structured as following: An overview of related works about random noise reduction would be given in Chapter 2. Chapter 3 would introduce proposed method represented in system block diagram and algorithms. Based on the proposed method, we would like to discuss experimental results in Chapter 4. At the end of this work, we draw a conclusion in Chapter 5.
Chapter 2
Review of Related Works
Image acquisition devices made up of many kinds of different ways. In this chapter, some specific prior works related to this proposed denoising method will be reviewed. In Section 2.1, we would like to explain why Color Filter Array (CFA) is necessary and what kind of CFA we have taken. In Section 2.2, two of HVS models which have been used in the proposed denoising method would be reviewed. Then, Two kinds of image denoising methods in spatial domain such as bilinear, texture detection based denoising algorithm would be brought up in Section 2.3 and 2.4 respectively since we used them to compare the performance. Finally, what we would like to review are the methods of image quality assessment. They are PSNR and Structural Similarity (SSIM) stated in Section 2.5 and 2.6 respectively.
2.1 Color Filter Array
In general, there are two types of images sensing structure for commercial digital camera.
One is to use 3 or 4 image sensors to capture image of a scene as shown in Fig. 1. In order to make this structure work, optical paths, optical filters and image sensors for each color channel have to be separated to sense the image of a scene. Once image sensors got R, G and B color information, what we need to reproduce color is just picking up R, G and B color information for each corresponding coordinate without doing color interpolation. It works well! However, it’s a very expensive approach, so that only professional digital cameras use this structure.
Fig. 1. Capture an image by three image sensors structure.
The other one, a single image sensor is used to capture image of a scene as shown in Fig.
2. The structure is much simple so as to reduce camera cost. In order to make this structure work, a color filter with mosaic pattern in front of image sensor to separate color information is necessary. So definitely, the resolution is reduced. To reproduce original color and resolution, interpolation is needed accordingly. At the right hand side of Fig. 2, each small square is representing a pixel. Basically, the pixels beneath color filters are light sensitive cells to respond the intensity of light falling on them. Color filter is aligned to sub-sample color information of a scene. There are many kinds of CFA. In digital camera, Bayer pattern CFA [4] with 3 primary colors R, G and B (as known as Bayer pattern) is widely used. As mention before, this thesis is going to discuss noise reduction method mainly based on this Bayer pattern domain.
R CCD
G CCD
B CCD Lens Filter
Scene
Scene Lens Filter Image Sensor
2.2 Bilinear Method
Bilinear is widely used in smoothing edge, typically in the application of re-scaling image size. Also, it can be used in denoising application. The method is described as following equation,
and illustrated as Fig. 3.
Fig. 3. Reproduced an unknown value C(u+du,v+dv) from its neighboring pixels by using bilinear method.
2.3 Human Visual System
Human Visual System (HVS) is complex and not fully understood yet. It’s difficult to use a mathematical function to represent it. It would be more realistic to get some useful information by experiments. There are two important observations on monochrome image which can be used for noise reduction. The first one is Just Noticeable Difference (JND) which is the minimum amount of stimulus intensity must be changed to cause a noticeable difference in sensory experience.
Ernst Weber (1795~1878), an experimental psychologist in 19th century, observed that the size of the JND threshold is related to initial stimulus magnitude. This relationship, had been simplified as Weber's Law by Gustav Theodor Fechner(1801~1887), can be expressed as C(u,v) C(u+1,v)
Easier to detect noise
Noise is less detectable along this edge
Noise is almost undetectable here It’s difficult to detect noise along this edge
ΔI/I = k. where ΔI represents JND threshold, I represents the initial stimulus intensity and k
stands for the proportional constant. In other words, Weber's Law states that the size of the JND (i.e., ΔI) is a constant which is proportional to the original stimulus value.
The second one is spatial masking. Natural images contain large changes in luminance, and these changes suppress the ability of the eyes to detect distortions spatially adjacent to them, this is so-called spatial masking. As a result of masking, noises in images are less detectable along strong edges and in highly textured areas, than in smooth areas of the image as illustrated in Fig. 4.
Fig. 4. Spatial Masking effect.
2.4 Bosco-and-Mancuso Filter for Image Denoising
Bosco and Mancuso [1]-[3] invented an adaptive image filter which is used to reduce the amount of noise in images captured by sensors in Bayer pattern format. The concept of this filter acts mainly on smoothing the high spatial frequency components which are hardly
In Bosco and Mancuso’s paper [1] and patent [2],[3], they made use of the Weber’s Law to determine the JND, ΔI, which could be differentiated between intensity I and I+ΔI. Bosco and Mancuso’s HVS model assume that the uniform areas are the ones with details amplitude under JND. Having these considerations in mind they designed an algorithm that can distinguish if the current processed area is uniform area or not. Once the current processed area is not uniform, the algorithm will go to detect how textural the area is and adapts its filtering strength for the noisy pixel.
The system block diagram they designed is illustrated in Fig. 5. As mentioned above, the local feature detector is to compute texture degree of the area. The estimation is based on the information of distances, noise level, JND and exposure condition from distance and noise level estimator, HVS evaluator and exposure controller respectively. Once texture degree of this current processed area is decided, it would be able to determine the strength of the filtering strength.
Fig. 5. System block diagram of Bosco-and-Mancuso filter.
The algorithm they proposed is to use two different filter masks, depending on which color is current processed pixel. One mask is for green pixels exclusively, the other one is for red/blue pixels, but not operated simultaneously. Fig. 6 is to illustrate those two kinds of operating windows established from Bayer pattern through 2 masks.
Operating Window Establisher
Distance and Noise
Level Estimator
HVS Evaluator
Local Feature Detector
Filtering Strength Calculator Exposure
Controller
Filter
G R G1 R G Fig. 6. Two kinds of operating windows established from Bayer pattern.
Definitely, green operating window is established when current pixel is green. Red and blue operating window will also be established once current pixel is red or blue respectively.
The green operating window is different from red and blue, since the green channel has double information comparing with either red or blue channel in Bayer pattern format.
In each case of operating window for red, green and blue, let’s define the current pixel C0
and eight neighboring pixels C1 to C8 respectively. C will also represent for Ci 1 to C8 to describe easier in many cases. So now we have symbol C0 to C8 or C0 and C to stand for i elements of operating window such as C represent for G when current color channel is green,
C0 is the current pixel with noise needed to be filtered. As described in Bosco and Mancuso’s paper and patent, C0 will be filtered and replaced with a weighted average of C1 to C8. However, how much is the weight of C1 to C8? That will depend on how similar of them to C0. In the design, the higher similarity degree of neighboring pixels, the bigger weight of them to C0. The similarity degree is calculated based on the brightness of current processed pixel and takes into account of the predicted noise level NL of current area as following:
( )
*[
1]
*( )
1, (2)0
0 t =K Dmax + −K NL t−
NLc n n c
where, Dmax is the maximum distance derived from calculating each distance D of i neighboring pixel C to Ci 0 and Kn is a parameter to determine the strength of filtering. In the definition, Kn=1 stands for almost flat area since this area could be filtered strongly, and Kn=0 stands for highest texture area as this area could not be filtered too much.
Recall the assumption of uniform area by using Weber’s Law and refers to the Kn
definition as above, we can assume the curve of Kn versus Dmax can be drawn as Fig. 7. That’s to say, filter current pixel strongly if Dmax not greater than HVS threshold ThHVS and filter current pixel lighter depending on how noisy the current area is. If Dmax is greater than ThHVS
+ NL, then the area has to look as a highly texture area without strongly filter needed.
However, they used Figs. 8 and 9 instead.
Fig. 7. A candidate of Kn curve.
Kn
Dmax
ThHVS ThHVS + NL 1
Fig. 8. Kn curve for G channel in Bosco and Mancuso’s patents.
Fig. 9. Kn curve for R/B channel in Bosco and Mancuso’s patents.
Kn is the overall filtering strength of current processed area. Although the filtering strength is given, we still need to determine the filtering strength of each neighboring pixel C to target pixel Ci 0. Therefore, in order to evaluate similarity of each neighboring pixel C i to the target pixel C0, two boundary thresholds refer to Th1 and Th2 are used to stand for most similar and least similar according to following equations:
(
1)
* , (3)* max min
1 K D K D
Th = n + − n
( )
, (4)* 2 1
* max max min
2 ⎟
⎠
⎜ ⎞
⎝
⎛ +
− +
= D D
K D
K
Th n n
where, Dmin is the minimum distance between C0 to C . The value of similarity i Ki can be determined by
Dmax
Kn
ThHVS + NL 1
Kn
NL ThHVS + NL Dmax
1
)
and Fig. 10 is the illustration for it. At the end, the result of pixel out is expressed as
( )
2.5 Peak Signal to Noise Ratio
Generally, Peak Signal to Noise Ratio (PSNR) is the most regular way used in the metric of distortion level. It is defined as a ratio between possible maximum power of image intensity and the power of distorted difference in terms of logarithmic decibel scale. In our application, given an original image OM*N and processed image PM*N with dimension of M*N, the PSNR is defined as
)
where, Imax represents for the possible maximum power of image intensity and MSE is defined as
2.6 Structural Similarity
In many cases, the objective metric of PSNR and MSE could not correlate well with subjective perception. Therefore, Wang [5] developed Structural Similarity (SSIM) which is based on an assumption that HVS is highly sensitive to structural information of a scene, and is defined as
where α, β and γ are the parameters to define the relative importance of the three components.
The equation l(O,P), c(O,P) and s(O,P) are defined as
)
Chapter 3
Proposed Image Noise Reduction System
The proposed denoising method would be introduced in this chapter. At the beginning, we would like to have brief description of the image processing on digital camera and the noise category we encountered in Section 3.1. Section 3.2 brings up system block diagram of proposed denoising method. At the end, Section 3.3 explains the detail algorithms.
3.1 Overview of Image Processing on Digital Camera
In general, the image processing flow of digital camera is shown as Fig. 11. Once user pressed the shutter button to take an image, the scene in front of camera would be captured by an image sensor, and converted to digital information by analog front end. Normally, black level compensation must be done in front of any processing to calibrate optical black. At the second, lens shading compensation sometimes could be applied. Meanwhile, statistic information for AE, AF and AWB is calculated and stored. And then, save the raw image to memory. What we are going to handle is this raw image. The format of input image is Bayer pattern, and output format of filtered image is Bayer pattern again without any formatting change.
Succeeding to the step of image denoising, color interpolation is normally applied to reproduce the missing color components for each pixel. Since every image sensor has a unique electrical response to light, color matching block is to compensate such kind of deviation. Then, gamma correction is to compensate the non-linearity of display device. YCC conversion block is to transform RGB color domain to YCbCr color domain. Usually, the Y channel will be used to enhance the edge of image. The final step is to perform JPEG compression to output JPEG image.
Fig. 11. Typical block diagram of image processing in digital camera.
As stated in Chapter 1 that we are interested in random noise. Fig. 12 is an example showing the noisy image captured by a digital camera. Gaussian noise distribution is shown in the histogram as Figs. 12 (b), (d), and (f).
Basically, not only random noise but also fix pattern noise in digital camera need to be handled. However fix pattern noise is much smaller than random noise if the camera made used of CCD sensor to capture image. Therefore, for the noise problem of digital camera, the most headache thing is dealing with Gaussian random noise.
Image Sensor
Black Level Compensation
Lens Shading Compensation
Statistic for AE, AF and AWB
White Balance Memory
DRAM
Image
De-noising Color
Interpolation Color
Matching Gamma
Correction
YCC
Conversion
Edge
Enhancement JPEG
compression Y
CC
σ=3.31
(a)
(b)
Fig. 12. Noise distribution illustration from real camera. (a), (b) Image “OECF” be taken under ISO 100 condition, and its noise distribution; (c), (d) Image “OECF” be taken under ISO 400 condition, and its noise distribution; (e), (f) Image “OECF” be taken under ISO 1600 condition, and its noise distribution.
σ=4.01
σ=6.18 (c)
(d)
(e)
3.2 Block Diagram of Proposed Denoising Method
Fig. 13. Block diagram of proposed denoising method.
In the same way, the first step is to establish operating window. Since the 5x5 operating window as shown in Fig. 6 is very common, so that we just use it without change. Once the operating window is determined, a distance calculator is to calculate distance information based on operating window. By feeding the distance information to uniform image distinguisher, the current processed area can be judged as a uniform area, highly texture area or pattern classification needed area based on pre-loaded noise information. Once the processed area has been judged as a uniform area, the succeeding nine pixel mixer will be in charge to output final result and whereas the succeeding bypass bridge will be in charge to output final result while the processed area has been judged as a highly texture area without filtering needed. In the case of processed area is neither a uniform area nor a highly texture area, the succeeding nearest pattern classifier with pattern average filter will be used to output final result. Following Section 3.3 is going to explain the detail about how do they work.
Operating Window
Distance Calculator
Noise Pre-loader
Uniform Image
Distinguisher
Nearest Pattern Classifier
Pattern
Average Filter
Pixel Out Nine
Pixels Mixer
Bypass Bridge
M U X
uniform highly texture classification needed
C1 C2 C3 C4 C0 C5
C6 C7 C8
C1 C2 C3
C4 C0 C5
C6 C7 C8
3.3 Algorithm of Proposed Denoising Method
The same as Chapter 2, the elements C0 ~ C8 of operating window are generically represented for G0 ~ G8, R0 ~ R, and B0 ~ B8 respectively depending on which color is the current pixel to be processed. Fig. 14 is to illustrate the two operating windows which are referred to C0 ~ C8.
(a) (b)
Fig. 14. Two kinds of operating windows. (a) Operating window when current processed color channel is green; (b) Operating window when current processed color channel is red or blue.
Once the operating window is established, the next step is to calculate distance. Distance calculator is going to calculate the distances D of each neighboring pixel to current pixel Ci 0
defined as
) 13 ( 8
2 1
0, i , , , .
C C
Di = i − = L
Also, Dmax and Dmin respectively represent for maximum and minimum distances which can be found once D derived. i
There are twelve pre-set patterns as shown in Fig. 15. Eqs. (14)-(25) are the equations to calculate the distance of those twelve pre-set patterns to C0 when current processed pixel is green.
(a) (b) (c)
(d) (e) (f)
(g) (h) (i)
(j) (k) (l)
Fig. 15. Pre-set twelve patterns when current processed pixel is green. (a) Horizontal line; (b) Vertical line; (c) Rising line; (d) Falling line; (e) Left-Bottom corner; (f) Bottom-Left corner;
(g) Right-Bottom corner; (h) Bottom-Right corner; (i) Right-Top corner; (j) Top-Right corner;
(k) Left-Top corner; (l) Top-Left corner.
)
When current pixel is red or blue, the pre-set patterns are shown in Fig. 16 and Eqs.
(26)-(37) are the equations to calculate distance of those twelve pre-set patterns to C0. The reason of having 2 sets of pattern distance equation is because the index of element is different between green channel and either red or blue channel.
)
(a) (b) (c)
(d) (e) (f)
(g) (h) (i)
(j) (k) (l)
Fig. 16. Pre-set twelve patterns when current processed pixel is red or blue. (a) Horizontal line; (b) Vertical line; (c) Rising line; (d) Falling line; (e) Left-Bottom corner; (f) Bottom-Left corner; (g) Right-Bottom corner; (h) Bottom-Right corner; (i) Right-Top corner; (j) Top-Right corner; (k) Left-Top corner; (l) Top-Left corner.
In each case of pattern sets, once the distances of those patterns to C0 are derived, DmaxPattern and DminPattern which represent for maximum and minimum distances of those patterns to C0 can be found at the same time.
In addition to the distance information, we still need camera noise characteristic which can be obtained by doing noise scanning as a curve shown in Fig. 17.
Fig. 17. Curve fitting for noise characteristic in terms of standard deviation.
Noise pre-loader is doing the job of loading pre-scanned noise level in terms of standard deviation σ which could be obtained by calculating on several flat areas from dark to bright as the small circles shown in Fig. 17. However we would like to use a curve fitted model which has the similar value to the reality noise instead of loading pre-scanned value. Curve fitting
σ
Intensity
Curve Fitting for standard deviation
σmax
_σmax
yield
I
)
I is intensity of the flat area which has the maximum standard deviation there as shown in Fig. 17. In the sense, σmax will be different under different ISO speed. Therefore, σmax has to be found for each ISO speed from pre-scanned noise information.
Having both distance information and camera noise characteristic information, we now can distinguish if the current area is uniform or not by using multiple of σ to be the threshold as
( )
C0 , (39)m Thuniform = ∗σ
where, m is a different constant depend on color channel, ISO speed, and also how much captured images like to be filtered. Generally, if DmaxPattern is less than Thuniform, we would consider current area as a uniform image so that the result of output pixel is the average of C0
to C8. On the other hand, if both DmaxPattern and DminPattern are greater than Thuniform, the current area must be a highly texture area without noise reduction needed since the spatial masking of HVS will work fine here. While the Thuniform is between DmaxPattern and DminPattern, the value of DminPattern will help us to find out which pattern is the nearest pattern. By averaging the element in nearest pattern, the end result of output pixel can be derived. Eq. (40) can help us understand criteria easier.
)
Chapter 4 Experiments
In this chapter, we would like to discuss the experiment results. The proposed method is implemented in matlab language. There are 148 pieces of test images downloaded from USC, and have been classified to miscellaneous, texture and aerial images. Some of them are monochrome, and some are colored. Those color images have been separated into R, G and B
In this chapter, we would like to discuss the experiment results. The proposed method is implemented in matlab language. There are 148 pieces of test images downloaded from USC, and have been classified to miscellaneous, texture and aerial images. Some of them are monochrome, and some are colored. Those color images have been separated into R, G and B