
Wavelets

In document: Computer Vision (pages 176-182)


3.5 Pyramids and wavelets

3.5.4 Wavelets

[Figure 3.35: two rows, labeled space and frequency, each showing low-pass − lower-pass = band-pass.]

Figure 3.35 The difference of two low-pass filters results in a band-pass filter. The dashed blue lines show the close fit to a half-octave Laplacian of Gaussian.
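The claim in Figure 3.35 can be checked numerically. The following is a minimal sketch, not the construction used for the figure: the two Gaussian widths (1 and √2 samples, i.e., half an octave apart), the kernel radius, and the function name gaussian_kernel are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Sampled 1-D Gaussian on the integer grid [-radius, radius], normalized to unit sum."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

radius = 16                                     # assumed kernel half-width
low = gaussian_kernel(1.0, radius)              # low-pass (assumed sigma)
lower = gaussian_kernel(np.sqrt(2.0), radius)   # "lower-pass": half an octave wider
band = low - lower                              # difference of the two low-pass kernels

# Magnitude response from DC (bin 0) up to the Nyquist frequency (bin 256).
response = np.abs(np.fft.rfft(band, 512))
dc_gain = response[0]                # both kernels sum to one, so DC cancels
peak_bin = int(np.argmax(response))  # strongest response lies strictly between DC and Nyquist
```

Because the difference kernel has no response at DC and only a small response near the Nyquist frequency, its largest gain falls at an intermediate frequency, which is what makes it a band-pass filter.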

cent levels, the authors claim that coarse-to-fine algorithms perform better. In the image-processing community, half-octave pyramids combined with checkerboard sampling grids are known as quincunx sampling (Feilner, Van De Ville, and Unser 2005). In detecting multi-scale features (Section 4.1.1), it is common to use half-octave or even quarter-octave pyramids (Lowe 2004; Triggs 2004). However, in this case, the subsampling only occurs at every octave level, i.e., the image is repeatedly blurred with wider Gaussians until a full octave of resolution change has been achieved (Figure 4.11).


[Figure 3.36: (a) levels l = 0 to l = 4, running from fine through medium to coarse; (b) levels l = 0 to l = 2, running from fine through medium to coarse.]

Figure 3.36 Multiresolution pyramids: (a) pyramid with half-octave (quincunx) sampling (odd levels are colored gray for clarity); (b) wavelet pyramid: each wavelet level stores 3/4 of the original pixels (usually the horizontal, vertical, and mixed gradients), so that the total number of wavelet coefficients and original pixels is the same.

Wainwright et al. 2003).

Since both image pyramids and wavelets decompose an image into multi-resolution descriptions that are localized in both space and frequency, how do they differ? The usual answer is that traditional pyramids are overcomplete, i.e., they use more pixels than the original image to represent the decomposition, whereas wavelets provide a tight frame, i.e., they keep the size of the decomposition the same as the image (Figure 3.36b). However, some wavelet families are, in fact, overcomplete in order to provide better shiftability or steering in orientation (Simoncelli, Freeman, Adelson et al. 1992). A better distinction, therefore, might be that wavelets are more orientation selective than regular band-pass pyramids.
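The size argument can be made concrete with a short count. This is a sketch under stated assumptions: N (the number of pixels in the original image) and L (the number of levels) are symbols introduced here, and the band-pass pyramid is assumed to be subsampled by two in each dimension at every level.

```latex
% N = pixels in the original image, L = number of levels (symbols introduced here).
\begin{align*}
\text{wavelet pyramid:}\quad
  & \underbrace{\sum_{l=0}^{L-1} \tfrac{3}{4}\left(\tfrac{1}{4}\right)^{l} N}_{\text{wavelet levels}}
    + \underbrace{\left(\tfrac{1}{4}\right)^{L} N}_{\text{coarsest low-pass image}}
    = \left(1 - \left(\tfrac{1}{4}\right)^{L}\right) N + \left(\tfrac{1}{4}\right)^{L} N = N, \\
\text{band-pass pyramid:}\quad
  & \sum_{l=0}^{L} \left(\tfrac{1}{4}\right)^{l} N
    = \tfrac{4}{3}\left(1 - \left(\tfrac{1}{4}\right)^{L+1}\right) N \;<\; \tfrac{4}{3}\, N .
\end{align*}
```

Under these assumptions the wavelet pyramid stores exactly N coefficients, while the octave-spaced pyramid stores more than N (for L ≥ 1) but never more than 4/3 of N.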

How are two-dimensional wavelets constructed? Figure 3.37a shows a high-level diagram of one stage of the (recursive) coarse-to-fine construction (analysis) pipeline alongside the complementary re-construction (synthesis) stage. In this diagram, the high-pass filter followed by decimation keeps 3/4 of the original pixels, while 1/4 of the low-frequency coefficients are passed on to the next stage for further analysis. In practice, the filtering is usually broken down into two separable sub-stages, as shown in Figure 3.37b. The resulting three wavelet images are sometimes called the high–high (HH), high–low (HL), and low–high (LH) images. The high–low and low–high images accentuate the horizontal and vertical edges and gradients, while the high–high image contains the less frequently occurring mixed derivatives.
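One separable analysis stage and its synthesis counterpart can be sketched in a few lines. The code below is an illustration only: it swaps in the two-tap Haar filter pair (the text does not prescribe particular filters), it assumes even image dimensions, and the function names and the sub-band naming (hl meaning high-pass horizontally and low-pass vertically) are conventions chosen here.

```python
import numpy as np

def split(x, axis):
    """Haar filter pair along `axis`, followed by decimation by two."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)    # low-pass: scaled pair sum
    high = (even - odd) / np.sqrt(2.0)   # high-pass: scaled pair difference
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def merge(low, high, axis):
    """Inverse of `split`: undo the filter pair and interleave the samples."""
    low, high = np.moveaxis(low, axis, 0), np.moveaxis(high, axis, 0)
    out = np.empty((2 * low.shape[0],) + low.shape[1:])
    out[0::2] = (low + high) / np.sqrt(2.0)
    out[1::2] = (low - high) / np.sqrt(2.0)
    return np.moveaxis(out, 0, axis)

def analyze_2d(image):
    """One stage: horizontal split first, then a vertical split of each half."""
    low_h, high_h = split(image, axis=1)
    ll, lh = split(low_h, axis=0)    # ll is passed on to the next stage
    hl, hh = split(high_h, axis=0)
    return ll, hl, lh, hh

def synthesize_2d(ll, hl, lh, hh):
    """Reverse the two sub-stages in the opposite order."""
    low_h = merge(ll, lh, axis=0)
    high_h = merge(hl, hh, axis=0)
    return merge(low_h, high_h, axis=1)

# A vertical step edge: constant down each column, changing along each row.
image = np.zeros((8, 8))
image[:, 3:] = 1.0
ll, hl, lh, hh = analyze_2d(image)
restored = synthesize_2d(ll, hl, lh, hh)
```

Each of the four outputs has a quarter of the input pixels, so the three wavelet images together hold 3/4 of them. For this test image only hl is non-zero among the wavelet images, since the image does not vary vertically.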

How are the high-pass H and low-pass L filters shown in Figure 3.37b chosen and how can the corresponding reconstruction filters I and F be computed? Can filters be designed

[Figure 3.37 diagram: (a) analysis boxes H and L, each followed by ↓2, producing L1 and the sub-bands HL0, LH0, and HH0, then a Q box and synthesis boxes ↑2 with F and I; (b) the same pipeline split into horizontal (Hh, Lh, ↓2h) and vertical (Hv, Lv, ↓2v) analysis stages, Q boxes, and synthesis stages Fv, Iv with ↑2v followed by Fh, Ih with ↑2h.]

Figure 3.37 Two-dimensional wavelet decomposition: (a) high-level diagram showing the low-pass and high-pass transforms as single boxes; (b) separable implementation, which involves first performing the wavelet transform horizontally and then vertically. The I and F boxes are the interpolation and filtering boxes required to re-synthesize the image from its wavelet components.

that all have finite impulse responses? This topic has been the main subject of study in the wavelet community for over two decades. The answer depends largely on the intended application, e.g., whether the wavelets are being used for compression, image analysis (feature finding), or denoising. Simoncelli and Adelson (1990b) show (in Table 4.1) some good odd-length quadrature mirror filter (QMF) coefficients that seem to work well in practice.

Since the design of wavelet filters is such a tricky art, is there perhaps a better way? Indeed, a simpler procedure is to split the signal into its even and odd components and then perform trivially reversible filtering operations on each sequence to produce what are called lifted wavelets (Figures 3.38 and 3.39). Sweldens (1996) gives a wonderfully understandable introduction to the lifting scheme for second-generation wavelets, followed by a comprehensive review (Sweldens 1997).

As Figure 3.38 demonstrates, rather than first filtering the whole input sequence (image)


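[Figure 3.38 diagram: (a) analysis boxes H and L followed by ↓2o and ↓2e, producing H0 and L1, then a Q box and synthesis boxes ↑2o and ↑2e with I and F; (b) lifted version with ↓2o and ↓2e applied first, followed by the L and C stages producing H0 and L1, a Q box, and the mirrored synthesis with L, C, ↑2o, and ↑2e.]

Figure 3.38 One-dimensional wavelet transform: (a) usual high-pass + low-pass filters followed by odd (↓2o) and even (↓2e) downsampling; (b) lifted version, which first selects the odd and even subsequences and then applies a low-pass prediction stage L and a high-pass correction stage C in an easily reversible manner.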

with high-pass and low-pass filters and then keeping the odd and even sub-sequences, the lifting scheme first splits the sequence into its even and odd sub-components. Filtering the even sequence with a low-pass filter L and subtracting the result from the odd sequence is trivially reversible: simply perform the same filtering and then add the result back in. Furthermore, this operation can be performed in place, resulting in significant space savings. The same applies to filtering the resulting odd (high-pass) sequence with the correction filter C and adding the result to the even sequence, which is used to ensure that the even sequence is low-pass. A series of such lifting steps can be used to create more complex filter responses with low computational cost and guaranteed reversibility.
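The reversibility argument does not depend on which filters are used, as the following minimal sketch suggests. It assumes an even-length signal; the callables predict and update are hypothetical stand-ins for the L and C stages, and the one-tap pair used in the demonstration is chosen here for simplicity rather than taken from the text.

```python
import numpy as np

def lift_forward(signal, predict, update):
    """One predict step and one update step, both performed in place."""
    buf = np.array(signal, dtype=float)   # working copy that every step overwrites
    even, odd = buf[0::2], buf[1::2]      # views into buf, not copies
    odd -= predict(even)                  # odd slots now hold the high-pass part
    even += update(odd)                   # even slots now hold the low-pass part
    return buf

def lift_inverse(coeffs, predict, update):
    """Undo the steps in reverse order with the opposite signs."""
    buf = np.array(coeffs, dtype=float)
    even, odd = buf[0::2], buf[1::2]
    even -= update(odd)                   # same filtering, subtracted instead of added
    odd += predict(even)                  # same filtering, added instead of subtracted
    return buf

# Illustrative one-tap filters (assumed): predict each odd sample from the
# even sample to its left, then add back half of the resulting difference.
predict = lambda even: even
update = lambda high: 0.5 * high

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
coeffs = lift_forward(x, predict, update)
restored = lift_inverse(coeffs, predict, update)
```

With this particular pair the even slots end up holding the pair averages and the odd slots the pair differences. The inverse only needs to recompute the same predict and update values, so the filters themselves never have to be inverted.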

This process can perhaps be more easily understood by considering the signal processing diagram in Figure 3.39. During analysis, the average of the even values is subtracted from the odd value to obtain a high-pass wavelet coefficient. However, the even samples still contain an aliased sample of the low-frequency signal. To compensate for this, a small amount of the high-pass wavelet is added back to the even sequence so that it is properly low-pass filtered.

(It is easy to show that the effective low-pass filter is [−1/8, 1/4, 3/4, 1/4, −1/8], which is indeed a low-pass filter.) During synthesis, the same operations are reversed with a judicious change in sign.

[Figure 3.39 diagram: (a) analysis, with weights −½ and ¼; (b) synthesis, with weights ½ and −¼; both panels label the samples L0, H0, L1, H1, L2.]

Figure 3.39 Lifted transform shown as a signal processing diagram: (a) The analysis stage first predicts the odd value from its even neighbors, stores the difference wavelet, and then compensates the coarser even value by adding in a fraction of the wavelet. (b) The synthesis stage simply reverses the flow of computation and the signs of some of the filters and operations. The light blue lines show what happens if we use four taps for the prediction and correction instead of just two.
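The effective low-pass filter can be recovered numerically from the two-tap weights in Figure 3.39 (−½ for prediction, ¼ for correction). The sketch below is a hedged illustration: the function names are made up here, the signal length is assumed to be even, and the periodic boundary handling via np.roll is an assumption, since the text does not say how borders are treated.

```python
import numpy as np

def analyze(x):
    """Two-tap lifted analysis with the weights shown in Figure 3.39."""
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    # Subtract the average of the two even neighbors from each odd sample.
    high = odd - 0.5 * (even + np.roll(even, -1))
    # Add a quarter of the two neighboring wavelets back to each even sample.
    low = even + 0.25 * (high + np.roll(high, 1))
    return low, high

def synthesize(low, high):
    """Same operations in reverse order with the signs flipped."""
    even = low - 0.25 * (high + np.roll(high, 1))
    odd = high + 0.5 * (even + np.roll(even, -1))
    x = np.empty(2 * len(low))
    x[0::2], x[1::2] = even, odd
    return x

# Weight of every input sample in one low-pass output, found by feeding
# unit impulses through the analysis stage.
n, k = 16, 4
weights = np.array([analyze(np.eye(n)[p])[0][k] for p in range(n)])
effective_lowpass = weights[2 * k - 2 : 2 * k + 3]   # the five taps centered on sample 2k

signal = np.cos(np.arange(n) * 0.7) + 0.1 * np.arange(n)
restored = synthesize(*analyze(signal))
```

The five non-zero weights reproduce the filter quoted in the text, and the round trip through analyze and synthesize returns the input up to floating-point rounding.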

Of course, we need not restrict ourselves to two-tap filters. Figure 3.39 shows as light blue arrows additional filter coefficients that could optionally be added to the lifting scheme without affecting its reversibility. In fact, the low-pass and high-pass filtering operations can be interchanged, e.g., we could use a five-tap cubic low-pass filter on the odd sequence (plus center value) first, followed by a four-tap cubic low-pass predictor to estimate the wavelet, although I have not seen this scheme written down.

Lifted wavelets are called second-generation wavelets because they can easily adapt to non-regular sampling topologies, e.g., those that arise in computer graphics applications such as multi-resolution surface manipulation (Schröder and Sweldens 1995). It also turns out that lifted weighted wavelets, i.e., wavelets whose coefficients adapt to the underlying problem being solved (Fattal 2009), can be extremely effective for low-level image manipulation tasks and also for preconditioning the kinds of sparse linear systems that arise in the optimization-based approaches to vision algorithms that we discuss in Section 3.7 (Szeliski 2006b).

An alternative to the widely used “separable” approach to wavelet construction, which decomposes each level into horizontal, vertical, and “cross” sub-bands, is to use a representation that is more rotationally symmetric and orientationally selective and also avoids the aliasing inherent in sampling signals below their Nyquist frequency.¹⁷ Simoncelli, Freeman, Adelson et al. (1992) introduce such a representation, which they call a pyramidal radial frequency

¹⁷ Such aliasing can often be seen as the signal content moving between bands as the original signal is slowly shifted.


Figure 3.40 Steerable shiftable multiscale transforms (Simoncelli, Freeman, Adelson et al. 1992) © 1992 IEEE: (a) radial multi-scale frequency domain decomposition; (b) original image; (c) a set of four steerable filters; (d) the radial multi-scale wavelet decomposition.

implementation of shiftable multi-scale transforms or, more succinctly, steerable pyramids.

Their representation is not only overcomplete (which eliminates the aliasing problem) but is also orientationally selective and has identical analysis and synthesis basis functions, i.e., it is self-inverting, just like “regular” wavelets. This makes steerable pyramids a much more useful basis for the structural analysis and matching tasks commonly used in computer vision.
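The aliasing problem, and the way an overcomplete representation avoids it, can be seen in a toy one-dimensional example. This is not a steerable pyramid: the sketch swaps in a Haar high-pass filter, with and without decimation, and assumes a circular boundary; the function names are made up here.

```python
import numpy as np

def haar_high(x):
    """Critically sampled Haar high-pass band: one coefficient per sample pair."""
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

def haar_high_undecimated(x):
    """The same two-tap filter without decimation (circular boundary assumed)."""
    return (x - np.roll(x, -1)) / np.sqrt(2.0)

def energy(band):
    return float(np.sum(band**2))

step = np.zeros(16)
step[8:] = 1.0                 # the edge falls between two sample pairs
shifted = np.roll(step, 1)     # the same signal moved by a single sample

decimated = [energy(haar_high(step)), energy(haar_high(shifted))]
undecimated = [energy(haar_high_undecimated(step)),
               energy(haar_high_undecimated(shifted))]
```

In the critically sampled band the energy jumps from zero to one when the signal moves by a single sample, i.e., content moves between bands. In the undecimated (overcomplete) band the energy is the same for both positions.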

Figure 3.40a shows how such a decomposition looks in frequency space. Instead of recursively dividing the frequency domain into 2 × 2 squares, which results in checkerboard high frequencies, radial arcs are used. Figure 3.40d illustrates the resulting pyramid sub-bands. Even though the representation is overcomplete, i.e., there are more wavelet coefficients than input pixels, the additional frequency and orientation selectivity makes this representation preferable for tasks such as texture analysis and synthesis (Portilla and Simoncelli 2000) and image denoising (Portilla, Strela, Wainwright et al. 2003; Lyu and Simoncelli 2009).

Figure 3.41 Laplacian pyramid blending (Burt and Adelson 1983b) © 1983 ACM: (a) original image of apple, (b) original image of orange, (c) regular splice, (d) pyramid blend.
