Non-linear filtering - More neighborhood operators


3.3 More neighborhood operators

3.3.1 Non-linear filtering

Haffner et al. 1998). In principle, summed area tables could also be used to compute the sums in the sum of squared differences (SSD) stereo and motion algorithms (Section 11.4). In practice, separable moving average filters are usually preferred (Kanade, Yoshida, Oda et al. 1996), unless many different window shapes and sizes are being considered (Veksler 2003).

Recursive filtering

The incremental formula (3.31) for the summed area is an example of a recursive filter, i.e., one whose values depend on previous filter outputs. In the signal processing literature, such filters are known as infinite impulse response (IIR), since the output of the filter to an impulse (single non-zero value) goes on forever. For example, for a summed area table, an impulse generates an infinite rectangle of 1s below and to the right of the impulse. The filters we have previously studied in this chapter, which involve the convolution of the image with a finite extent kernel, are known as finite impulse response (FIR).
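To make the infinite impulse response concrete, the following minimal Python sketch builds a summed area table with the raster-scan recurrence s(i, j) = s(i−1, j) + s(i, j−1) − s(i−1, j−1) + f(i, j), which is assumed here to be the form of the incremental formula (3.31), and feeds it a single impulse. The function name `summed_area_table` and the small 4 × 5 test image are illustrative choices, not taken from the text.

```python
def summed_area_table(f):
    """Recursive (raster-scan) summed area table of a 2D list f.

    Assumed recurrence: s(i,j) = s(i-1,j) + s(i,j-1) - s(i-1,j-1) + f(i,j),
    with entries of s outside the image treated as zero.
    """
    rows, cols = len(f), len(f[0])
    s = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            up = s[i - 1][j] if i > 0 else 0
            left = s[i][j - 1] if j > 0 else 0
            diag = s[i - 1][j - 1] if i > 0 and j > 0 else 0
            # Each output depends on three previously computed outputs.
            s[i][j] = up + left - diag + f[i][j]
    return s


# A single impulse at row 1, column 2 of an otherwise empty 4 x 5 image.
impulse = [[0] * 5 for _ in range(4)]
impulse[1][2] = 1
response = summed_area_table(impulse)
for row in response:
    print(row)
```

On this finite image the rectangle of 1s is cut off by the image boundary; on an unbounded grid it would continue indefinitely, which is what makes the filter IIR.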

Two-dimensional IIR filters and recursive formulas are sometimes used to compute quantities that involve large area interactions, such as two-dimensional distance functions (Section 3.3.3) and connected components (Section 3.3.4).

More commonly, however, IIR filters are used inside one-dimensional separable filtering stages to compute large-extent smoothing kernels, such as efficient approximations to Gaussians and edge filters (Deriche 1990; Nielsen, Florack, and Deriche 1997). Pyramid-based algorithms (Section 3.5) can also be used to perform such large-area smoothing computations.
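As a rough illustration of why recursive filters are attractive for large-extent smoothing, the sketch below runs a first-order recursive filter forward and then backward along a sequence, and applies it separably to rows and columns. This is plain exponential smoothing, chosen for brevity; it is not the Deriche (1990) filter, and the names `recursive_smooth_1d`, `recursive_smooth_2d`, and the parameter `a` are assumptions made for this example.

```python
def recursive_smooth_1d(x, a):
    """Forward-backward first-order recursive (IIR) smoothing of a sequence.

    The cost per sample is constant no matter how wide the effective kernel
    is; a smaller value of a (0 < a <= 1) gives heavier smoothing.
    """
    y = list(x)
    for n in range(1, len(y)):             # causal (left-to-right) pass
        y[n] = a * y[n] + (1 - a) * y[n - 1]
    for n in range(len(y) - 2, -1, -1):    # anti-causal (right-to-left) pass
        y[n] = a * y[n] + (1 - a) * y[n + 1]
    return y


def recursive_smooth_2d(image, a):
    """Apply the 1D recursive filter to every row, then to every column."""
    rows = [recursive_smooth_1d(row, a) for row in image]
    cols = [recursive_smooth_1d(col, a) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# Impulse response: a two-sided, decaying kernel centered on the impulse.
impulse = [0.0] * 10 + [1.0] + [0.0] * 10
response = recursive_smooth_1d(impulse, a=0.5)
print([round(v, 4) for v in response[7:14]])
```

The two passes together give an approximately symmetric response, so the result is not shifted relative to the input the way a single causal pass would be.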


Figure 3.18 Median and bilateral filtering: (a) original image with Gaussian noise; (b) Gaussian filtered; (c) median filtered; (d) bilaterally filtered; (e) original image with shot noise; (f) Gaussian filtered; (g) median filtered; (h) bilaterally filtered. Note that the bilateral filter fails to remove the shot noise because the noisy pixels are too different from their neighbors.

(a) median = 4 and (b) α-mean = 4.6, both computed over the same 5 × 5 neighborhood:

    1 2 1 2 4
    2 1 3 5 8
    1 3 7 6 9
    3 4 8 6 7
    4 5 7 8 9

(c) domain filter (rows and columns lie at pixel distances 2, 1, 0, 1, 2 from the center):

    0.1 0.3 0.4 0.3 0.1
    0.3 0.6 0.8 0.6 0.3
    0.4 0.8 1.0 0.8 0.4
    0.3 0.6 0.8 0.6 0.3
    0.1 0.3 0.4 0.3 0.1

(d) range filter:

    0.0 0.0 0.0 0.0 0.2
    0.0 0.0 0.0 0.4 0.8
    0.0 0.0 1.0 0.8 0.4
    0.0 0.2 0.8 0.8 1.0
    0.2 0.4 1.0 0.8 0.4

Figure 3.19 Median and bilateral filtering: (a) median pixel (green); (b) selected α-trimmed mean pixels; (c) domain filter (numbers along edge are pixel distances); (d) range filter.

noise, rather than being Gaussian, is shot noise, i.e., it occasionally has very large values. In this case, regular blurring with a Gaussian filter fails to remove the noisy pixels and instead turns them into softer (but still visible) spots (Figure 3.18f).

Median filtering

A better filter to use in this case is the median filter, which selects the median value from each pixel’s neighborhood (Figure 3.19a). Median values can be computed in expected linear time using a randomized select algorithm (Cormen 2001) and incremental variants have also been developed by Tomasi and Manduchi (1998) and Bovik (2000, Section 3.2). Since the shot noise value usually lies well outside the true values in the neighborhood, the median filter is able to filter away such bad pixels (Figure 3.18c).
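The behavior described above can be sketched in a few lines of Python. This version is an illustration under stated assumptions: it uses the sorting-based `statistics.median` rather than a randomized select or an incremental algorithm, and it replicates border pixels when the window extends past the image. The function name `median_filter`, the `radius` parameter, and the flat test image are all choices made for this example; the 5 × 5 patch is the neighborhood shown in Figure 3.19a.

```python
from statistics import median


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighborhood.

    Assumption: pixels outside the image are filled by replicating the
    nearest border pixel (index clamping).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                image[min(max(i + k, 0), rows - 1)][min(max(j + l, 0), cols - 1)]
                for k in range(-radius, radius + 1)
                for l in range(-radius, radius + 1)
            ]
            out[i][j] = median(window)
    return out


# The 5 x 5 neighborhood from Figure 3.19a.
patch = [
    [1, 2, 1, 2, 4],
    [2, 1, 3, 5, 8],
    [1, 3, 7, 6, 9],
    [3, 4, 8, 6, 7],
    [4, 5, 7, 8, 9],
]
patch_median = median(v for row in patch for v in row)
print(patch_median)  # 4

# A flat image with a single shot-noise pixel in the middle.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter(noisy)
print(cleaned[2][2])  # 10
```

Because the outlier occupies only one of the nine samples in every 3 × 3 window, it never becomes the middle value and is removed entirely rather than smeared into its neighbors.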

One downside of the median filter, in addition to its moderate computational cost, is that since it selects only one input pixel value to replace each output pixel, it is not as efficient at averaging away regular Gaussian noise (Huber 1981; Hampel, Ronchetti, Rousseeuw et al. 1986; Stewart 1999). A better choice may be the α-trimmed mean (Lee and Redner 1990; Crane 1997, p. 109), which averages together all of the pixels except for the α fraction that are the smallest and the largest (Figure 3.19b).
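A minimal sketch of the α-trimmed mean follows. It assumes that α is the total fraction of samples discarded, split evenly between the smallest and the largest values; references differ on this convention, so treat it as one possible reading rather than the definition used by the cited authors. The function name `alpha_trimmed_mean` is hypothetical.

```python
def alpha_trimmed_mean(values, alpha):
    """Average the samples left after trimming both tails.

    Assumption: alpha (0 <= alpha < 1) is the total fraction discarded,
    half from the low end and half from the high end.
    """
    if not 0 <= alpha < 1:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    ordered = sorted(values)
    drop = int(alpha * len(ordered) / 2)   # samples removed from each tail
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)


samples = [1, 2, 3, 4, 100]                 # one gross outlier
plain = alpha_trimmed_mean(samples, 0.0)    # ordinary mean, pulled toward 100
trimmed = alpha_trimmed_mean(samples, 0.4)  # drops 1 and 100, averages 2, 3, 4
print(plain, trimmed)
```

With α = 0 the result is the ordinary mean, and as α approaches 1 only the middle sample or samples survive, so the estimator moves from the mean toward the median.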

Another possibility is to compute a weighted median, in which each pixel is used a number of times depending on its distance from the center. This turns out to be equivalent to minimizing the weighted objective function

\[
\sum_{k,l} w(k,l)\, |f(i+k, j+l) - g(i,j)|^p, \tag{3.33}
\]

where g(i, j) is the desired output value and p = 1 for the weighted median. The value p = 2 gives the usual weighted mean, which is equivalent to correlation (3.12) after normalizing by the sum of the weights (Bovik 2000, Section 3.2; Haralick and Shapiro 1992, Section 7.2.6). The weighted mean also has deep connections to other methods in robust statistics (see Appendix B.3), such as influence functions (Huber 1981; Hampel, Ronchetti, Rousseeuw et al. 1986).
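The link between the objective (3.33) and the two estimators can be checked numerically. The sketch below works on a flat list of neighborhood values and weights instead of image indices; `weighted_median`, `weighted_mean`, and `objective` are hypothetical helper names, and the weighted median is taken to be the smallest value at which the cumulative weight reaches half of the total, which is one of the minimizers when several exist.

```python
def weighted_median(values, weights):
    """Return a value g that minimizes sum_k w_k * |v_k - g| (p = 1)."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= half:      # cumulative weight has reached one half
            return value


def weighted_mean(values, weights):
    """Return the value g that minimizes sum_k w_k * (v_k - g)^2 (p = 2)."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def objective(g, values, weights, p):
    """Evaluate the objective (3.33) for a candidate output value g."""
    return sum(w * abs(v - g) ** p for v, w in zip(values, weights))


values = [1, 2, 3, 4, 100]
weights = [1, 1, 3, 1, 1]        # the center sample counts three times
g_median = weighted_median(values, weights)
g_mean = weighted_mean(values, weights)
print(g_median, round(g_mean, 3))
print(objective(g_median, values, weights, 1), objective(g_mean, values, weights, 1))
```

In this example the weighted median stays at 3 despite the outlier, while the weighted mean is pulled far toward 100, which is the usual robustness argument for p = 1.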

Non-linear smoothing has another, perhaps even more important property, especially since shot noise is rare in today’s cameras. Such filtering is more edge preserving, i.e., it has less tendency to soften edges while filtering away high-frequency noise.

Consider the noisy image in Figure 3.18a. In order to remove most of the noise, the Gaussian filter is forced to smooth away high-frequency detail, which is most noticeable near strong edges. Median filtering does better but, as mentioned before, does not do as good a job at smoothing away from discontinuities. See (Tomasi and Manduchi 1998) for some additional references to edge-preserving smoothing techniques.

While we could try to use the α-trimmed mean or weighted median, these techniques still have a tendency to round sharp corners, since the majority of pixels in the smoothing area come from the background distribution.

Bilateral filtering

What if we were to combine the idea of a weighted filter kernel with a better version of outlier rejection? What if instead of rejecting a fixed percentage α, we simply reject (in a soft way) pixels whose values differ too much from the central pixel value? This is the essential idea in bilateral filtering, which was first popularized in the computer vision community by Tomasi and Manduchi (1998). Chen, Paris, and Durand (2007) and Paris, Kornprobst, Tumblin et al. (2008) cite similar earlier work (Aurich and Weule 1995; Smith and Brady 1997) as well as the wealth of subsequent applications in computer vision and computational photography.

In the bilateral filter, the output pixel value depends on a weighted combination of neighboring pixel values

\[
g(i,j) = \frac{\sum_{k,l} f(k,l)\, w(i,j,k,l)}{\sum_{k,l} w(i,j,k,l)}. \tag{3.34}
\]

The weighting coefficient w(i, j, k, l) depends on the product of a domain kernel (Figure 3.19c),

\[
d(i,j,k,l) = \exp\left( -\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} \right), \tag{3.35}
\]

and a data-dependent range kernel (Figure 3.19d),

\[
r(i,j,k,l) = \exp\left( -\frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2} \right). \tag{3.36}
\]

When multiplied together, these yield the data-dependent bilateral weight function

\[
w(i,j,k,l) = \exp\left( -\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} - \frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2} \right). \tag{3.37}
\]

Figure 3.20 shows an example of the bilateral filtering of a noisy step edge. Note how the domain kernel is the usual Gaussian, the range kernel measures appearance (intensity) similarity to the center pixel, and the bilateral filter kernel is a product of these two.
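Equations (3.34) to (3.37) translate almost directly into code. The following brute-force sketch handles a grayscale image stored as a list of lists; it is meant only to mirror the formulas, not to be fast. The window is truncated at a fixed `radius` and clipped at the image border, and both of those choices, like the function name `bilateral_filter` and the synthetic noisy step edge, are assumptions made for this example.

```python
import math


def bilateral_filter(f, radius, sigma_d, sigma_r):
    """Brute-force bilateral filter of a grayscale image (list of lists).

    Assumptions: the sums in (3.34) are truncated to a square window of the
    given radius, and the window is clipped where it leaves the image.
    """
    rows, cols = len(f), len(f[0])
    g = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            num = den = 0.0
            for k in range(max(i - radius, 0), min(i + radius, rows - 1) + 1):
                for l in range(max(j - radius, 0), min(j + radius, cols - 1) + 1):
                    # Domain kernel (3.35): spatial closeness.
                    d = math.exp(-((i - k) ** 2 + (j - l) ** 2) / (2 * sigma_d ** 2))
                    # Range kernel (3.36): intensity similarity to the center.
                    r = math.exp(-((f[i][j] - f[k][l]) ** 2) / (2 * sigma_r ** 2))
                    num += f[k][l] * d * r   # weight w = d * r, as in (3.37)
                    den += d * r
            g[i][j] = num / den              # normalized sum (3.34)
    return g


# Step edge (0 on the left, 100 on the right) with +/-2 checkerboard noise.
noisy_step = [
    [(0.0 if j < 4 else 100.0) + (2.0 if (i + j) % 2 else -2.0) for j in range(8)]
    for i in range(6)
]
smoothed = bilateral_filter(noisy_step, radius=2, sigma_d=2.0, sigma_r=10.0)
print([round(v, 2) for v in smoothed[2]])
```

With σ_r much smaller than the step height, pixels across the edge receive a range weight that is effectively zero, so the noise on each side is averaged away while the step itself stays sharp.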

Notice that the range filter (3.36) uses the vector distance between the center and the neighboring pixel. This is important in color images, since an edge in any one of the color bands signals a change in material and hence the need to downweight a pixel’s influence.⁵

⁵ Tomasi and Manduchi (1998) show that using the vector distance (as opposed to filtering each color band separately) reduces color fringing effects. They also recommend taking the color difference in the more perceptually uniform CIELAB color space (see Section 2.3.2).

Figure 3.20 Bilateral filtering (Durand and Dorsey 2002) © 2002 ACM: (a) noisy step edge input; (b) domain filter (Gaussian); (c) range filter (similarity to center pixel value); (d) bilateral filter; (e) filtered step edge output; (f) 3D distance between pixels.

Since bilateral filtering is quite slow compared to regular separable filtering, a number of acceleration techniques have been developed (Durand and Dorsey 2002; Paris and Durand 2006; Chen, Paris, and Durand 2007; Paris, Kornprobst, Tumblin et al. 2008). Unfortunately, these techniques tend to use more memory than regular filtering and are hence not directly applicable to filtering full-color images.

Iterated adaptive smoothing and anisotropic diffusion

Bilateral (and other) filters can also be applied in an iterative fashion, especially if an appearance more like a “cartoon” is desired (Tomasi and Manduchi 1998). When iterated filtering is applied, a much smaller neighborhood can often be used.

Consider, for example, using only the four nearest neighbors, i.e., restricting |k − i| + |l − j| ≤ 1 in (3.34). Observe that

\begin{align}
d(i,j,k,l) &= \exp\left( -\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} \right) \tag{3.38} \\
&= \begin{cases} 1, & |k-i| + |l-j| = 0, \\ \lambda = e^{-1/(2\sigma_d^2)}, & |k-i| + |l-j| = 1. \end{cases} \tag{3.39}
\end{align}

We can thus re-write (3.34) as

\begin{align}
f^{(t+1)}(i,j) &= \frac{f^{(t)}(i,j) + \eta \sum_{k,l} f^{(t)}(k,l)\, r(i,j,k,l)}{1 + \eta \sum_{k,l} r(i,j,k,l)} \tag{3.40} \\
&= f^{(t)}(i,j) + \frac{\eta}{1 + \eta R} \sum_{k,l} r(i,j,k,l) \left[ f^{(t)}(k,l) - f^{(t)}(i,j) \right],
\end{align}

where $R = \sum_{(k,l)} r(i,j,k,l)$, (k, l) are the $\mathcal{N}_4$ neighbors of (i, j), η plays the role of the neighbor weight λ in (3.39), and we have made the iterative nature of the filtering explicit.
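The two forms of (3.40) can be compared numerically with a short sketch. It assumes a scalar image, the Gaussian range kernel of (3.36), η set equal to the neighbor weight λ of (3.39), and a border rule that simply skips neighbors falling outside the image. The helper names (`range_kernel`, `neighbors4`, `step_ratio`, `step_increment`) and the test image are hypothetical.

```python
import math


def range_kernel(a, b, sigma_r):
    """Gaussian range kernel r from (3.36) for scalar intensities."""
    return math.exp(-((a - b) ** 2) / (2 * sigma_r ** 2))


def neighbors4(i, j, rows, cols):
    """In-bounds N4 neighbors of pixel (i, j)."""
    candidates = ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
    return [(k, l) for k, l in candidates if 0 <= k < rows and 0 <= l < cols]


def step_ratio(f, eta, sigma_r):
    """One iteration written as the normalized ratio (first line of (3.40))."""
    rows, cols = len(f), len(f[0])
    out = [row[:] for row in f]
    for i in range(rows):
        for j in range(cols):
            num, den = f[i][j], 1.0
            for k, l in neighbors4(i, j, rows, cols):
                r = range_kernel(f[i][j], f[k][l], sigma_r)
                num += eta * r * f[k][l]
                den += eta * r
            out[i][j] = num / den
    return out


def step_increment(f, eta, sigma_r):
    """The same iteration written as an update (second line of (3.40))."""
    rows, cols = len(f), len(f[0])
    out = [row[:] for row in f]
    for i in range(rows):
        for j in range(cols):
            R = flow = 0.0
            for k, l in neighbors4(i, j, rows, cols):
                r = range_kernel(f[i][j], f[k][l], sigma_r)
                R += r
                flow += r * (f[k][l] - f[i][j])
            out[i][j] = f[i][j] + eta / (1 + eta * R) * flow
    return out


sigma_d, sigma_r = 1.0, 10.0
eta = math.exp(-1.0 / (2.0 * sigma_d ** 2))   # assumed equal to lambda in (3.39)
image = [
    [(0.0 if j < 4 else 100.0) + (2.0 if (i + j) % 2 else -2.0) for j in range(8)]
    for i in range(6)
]
a = step_ratio(image, eta, sigma_r)
b = step_increment(image, eta, sigma_r)
max_gap = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

result = image
for _ in range(10):                           # a few cheap 4-neighbor iterations
    result = step_increment(result, eta, sigma_r)
print(max_gap, [round(v, 2) for v in result[2]])
```

The two functions agree up to floating-point rounding, and ten iterations of the 4-neighbor update flatten the noise on each side of the step without blurring across it, which is the behavior a much larger single-pass bilateral kernel would otherwise be needed for.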

As Barash (2002) notes, (3.40) is the same as the discrete anisotropic diffusion equation first proposed by Perona and Malik (1990b).⁶ Since its original introduction, anisotropic diffusion has been extended and applied to a wide range of problems (Nielsen, Florack, and Deriche 1997; Black, Sapiro, Marimont et al. 1998; Weickert, ter Haar Romeny, and Viergever 1998; Weickert 1998). It has also been shown to be closely related to other adaptive smoothing techniques (Saint-Marc, Chen, and Medioni 1991; Barash 2002; Barash and Comaniciu 2004) as well as Bayesian regularization with a non-linear smoothness term that can be derived from image statistics (Scharr, Black, and Haussecker 2003).

In its general form, the range kernel $r(i,j,k,l) = r(\|f(i,j) - f(k,l)\|)$, which is usually called the gain or edge-stopping function, or diffusion coefficient, can be any monotonically increasing function with $r'(x) \to 0$ as $x \to \infty$. Black, Sapiro, Marimont et al. (1998) show how anisotropic diffusion is equivalent to minimizing a robust penalty function on the image gradients, which we discuss in Sections 3.7.1 and 3.7.2. Scharr, Black, and Haussecker (2003) show how the edge-stopping function can be derived in a principled manner from local image statistics. They also extend the diffusion neighborhood from $\mathcal{N}_4$ to $\mathcal{N}_8$, which allows them to create a diffusion operator that is both rotationally invariant and incorporates information about the eigenvalues of the local structure tensor.

Note that, without a bias term towards the original image, anisotropic diffusion and iterative adaptive smoothing converge to a constant image. Unless a small number of iterations is used (e.g., for speed), it is usually preferable to formulate the smoothing problem as a joint minimization of a smoothness term and a data fidelity term, as discussed in Sections 3.7.1 and 3.7.2 and by Scharr, Black, and Haussecker (2003), which introduce such a bias in a principled manner.

From the document: Computer Vision (pages 144-149)