



2.5 Exercises

multi-view geometry in a thorough way, I encourage you to read and do the exercises provided by Hartley and Zisserman (2004). Similarly, if you want some exercises related to the image formation process, Glassner's (1995) book is full of challenging problems.

Ex 2.1: Least squares intersection point and line fitting—advanced Equation (2.4) shows how the intersection of two 2D lines can be expressed as their cross product, assuming the lines are expressed as homogeneous coordinates.

1. If you are given more than two lines and want to find a point \tilde{x} that minimizes the sum of squared distances to each line,

D = \sum_i (\tilde{x} \cdot \tilde{l}_i)^2,    (2.120)

how can you compute this quantity? (Hint: Write the dot product as \tilde{x}^T \tilde{l}_i and turn the squared quantity into a quadratic form, \tilde{x}^T A \tilde{x}; see the sketch after this exercise.)

2. To fit a line to a bunch of points, you can compute the centroid (mean) of the points as well as the covariance matrix of the points around this mean. Show that the line passing through the centroid along the major axis of the covariance ellipsoid (largest eigenvector) minimizes the sum of squared distances to the points.

3. These two approaches are fundamentally different, even though projective duality tells us that points and lines are interchangeable. Why are these two algorithms so apparently different? Are they actually minimizing different objectives?
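For reference, here is a minimal NumPy sketch (not the book's code) of both parts: the intersection point posed as a small linear least-squares problem, and total-least-squares line fitting via the covariance eigenvectors. The function names and the normalization convention are my own.

```python
import numpy as np

def intersect_lines(lines):
    """Point minimizing the sum of squared distances to a set of 2D lines.

    Each line is a homogeneous 3-vector (a, b, c), normalized so that (a, b)
    is a unit normal; the signed distance from a point (x, y) is then
    a*x + b*y + c.  Stacking all lines gives a linear least-squares problem.
    """
    L = np.asarray(lines, dtype=float)
    L = L / np.linalg.norm(L[:, :2], axis=1, keepdims=True)  # unit normals
    A, b = L[:, :2], -L[:, 2]
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x  # (x, y) of the least-squares intersection point

def fit_line(points):
    """Total-least-squares line through an N x 2 array of 2D points.

    The line passes through the centroid along the largest eigenvector of the
    covariance matrix; its normal is the smallest eigenvector.
    """
    mean = points.mean(axis=0)
    cov = np.cov((points - mean).T)
    _, V = np.linalg.eigh(cov)        # eigenvalues in ascending order
    n = V[:, 0]                       # normal = minor axis of the covariance ellipse
    return np.array([n[0], n[1], -n @ mean])   # homogeneous line (a, b, c)
```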

Ex 2.2: 2D transform editor Write a program that lets you interactively create a set of rectangles and then modify their “pose” (2D transform). You should implement the following steps:

1. Open an empty window (“canvas”).

2. Shift drag (rubber-band) to create a new rectangle.

3. Select the deformation mode (motion model): translation, rigid, similarity, affine, or perspective.

4. Drag any corner of the outline to change its transformation.

This exercise should be built on a set of pixel coordinate and transformation classes, either implemented by yourself or from a software library. Persistence of the created representation (save and load) should also be supported (for each rectangle, save its transformation).
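One possible shape for those classes is sketched below, assuming NumPy for the matrix math and a JSON file for persistence; the class and method names are hypothetical, and the interactive (rubber-band and dragging) GUI code is omitted.

```python
import json
import numpy as np

# Every motion model is stored as a 3x3 homogeneous matrix; the "mode" only
# restricts which parameters the editor lets you drag.
class TransformedRect:
    def __init__(self, corners, mode="translation"):
        self.corners = np.asarray(corners, dtype=float)  # 4x2 rectangle corners
        self.mode = mode   # translation / rigid / similarity / affine / perspective
        self.H = np.eye(3)                               # current 3x3 transform

    def transformed_corners(self):
        pts = np.hstack([self.corners, np.ones((4, 1))])  # to homogeneous coords
        q = (self.H @ pts.T).T
        return q[:, :2] / q[:, 2:3]                       # perspective divide

    def to_dict(self):
        return {"corners": self.corners.tolist(), "mode": self.mode,
                "H": self.H.tolist()}

    @staticmethod
    def from_dict(d):
        r = TransformedRect(d["corners"], d["mode"])
        r.H = np.asarray(d["H"])
        return r

def save_scene(rects, path):
    """Persist each rectangle together with its current transformation."""
    with open(path, "w") as f:
        json.dump([r.to_dict() for r in rects], f, indent=2)
```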

Ex 2.3: 3D viewer Write a simple viewer for 3D points, lines, and polygons. Import a set of point and line commands (primitives) as well as a viewing transform. Interactively modify the object or camera transform. This viewer can be an extension of the one you created in Exercise 2.2. Simply replace the viewing transformations with their 3D equivalents.

(Optional) Add a z-buffer to do hidden surface removal for polygons.

(Optional) Use a 3D drawing package and just write the viewer control.
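The core of such a viewer is the 3D analogue of the 2D transform: a minimal sketch of projecting world points through a 4x4 viewing transform and a simple perspective camera is given below (focal length and principal point are arbitrary example values; clipping and z-buffering are omitted).

```python
import numpy as np

def project_points(points3d, view, f=500.0, center=(320, 240)):
    """Project an N x 3 array of world points to N x 2 pixel coordinates.

    `view` is a 4x4 world-to-camera transform; the camera is a simple
    perspective model with focal length f (in pixels) and principal point
    `center`.  Sketch only: no visibility handling.
    """
    P = np.hstack([points3d, np.ones((len(points3d), 1))])   # homogeneous
    cam = (view @ P.T).T                                      # world -> camera
    x = f * cam[:, 0] / cam[:, 2] + center[0]
    y = f * cam[:, 1] / cam[:, 2] + center[1]
    return np.stack([x, y], axis=1)
```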

Ex 2.4: Focus distance and depth of field Figure out how the focus distance and depth of field indicators on a lens are determined.

1. Compute and plot the focus distance z_o as a function of the distance traveled from the focal length, ∆z_i = f − z_i, for a lens of focal length f (say, 100 mm). Does this explain the hyperbolic progression of focus distances you see on a typical lens (Figure 2.20)?

2. Compute the depth of field (minimum and maximum focus distances) for a given focus setting z_o as a function of the circle of confusion diameter c (make it a fraction of the sensor width), the focal length f, and the f-stop number N (which relates to the aperture diameter d). Does this explain the usual depth of field markings on a lens that bracket the in-focus marker, as in Figure 2.20a?

3. Now consider a zoom lens with a varying focal length f. Assume that as you zoom, the lens stays in focus, i.e., the distance from the rear nodal point to the sensor plane z_i adjusts itself automatically for a fixed focus distance z_o. How do the depth of field indicators vary as a function of focal length? Can you reproduce a two-dimensional plot that mimics the curved depth of field lines seen on the lens in Figure 2.20b?
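A small NumPy sketch of parts 1 and 2 is shown below. The focal length, f-stop, and circle-of-confusion values are example choices, and the depth of field limits use the standard hyperfocal-distance approximations rather than any formulas from the book.

```python
import numpy as np

f = 100.0   # focal length in mm (example value)
N = 8.0     # f-stop number (example value)
c = 0.03    # circle of confusion diameter in mm (assumed, sensor-dependent)

# Part 1: the thin-lens equation 1/z_o + 1/z_i = 1/f gives the focus distance
# z_o as a function of the sensor displacement; plotting z_o against the travel
# shows the hyperbolic progression of the distance markings.
delta = np.linspace(0.1, 10.0, 200)     # sensor travel from the focal plane, mm
z_i = f + delta
z_o = z_i * f / (z_i - f)

# Part 2: one common approximation uses the hyperfocal distance H = f^2 / (N c);
# the near/far limits bracket the focus setting z_o.
H = f**2 / (N * c)
near = H * z_o / (H + z_o)
far = np.where(z_o < H, H * z_o / (H - z_o), np.inf)   # beyond H, far limit is infinite
```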

Ex 2.5: F-numbers and shutter speeds List the common f-numbers and shutter speeds that your camera provides. On older model SLRs, they are visible on the lens and shutter speed dials. On newer cameras, you have to look at the electronic viewfinder (or LCD screen/indicator) as you manually adjust exposures.

1. Do these form geometric progressions; if so, what are the ratios? How do these relate to exposure values (EVs)?

2. If your camera has shutter speeds of 1/60 and 1/125, do you think that these two speeds are exactly a factor of two apart or a factor of 125/60 = 2.083 apart?

3. How accurate do you think these numbers are? Can you devise some way to measure exactly how the aperture affects how much light reaches the sensor and what the exact exposure times actually are?
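For part 1, the standard ISO-100 exposure value, EV = log2(N^2 / t) with t the exposure time in seconds, ties the two progressions together; the short sketch below uses the nominal (rounded) full-stop sequences as example data.

```python
import math

def exposure_value(f_number, shutter_time):
    """Exposure value at ISO 100: EV = log2(N^2 / t), t in seconds."""
    return math.log2(f_number**2 / shutter_time)

# Nominal full-stop sequences; the marked values are rounded, the underlying
# progressions are powers of sqrt(2) for apertures and powers of 2 for times.
f_numbers = [1.4, 2, 2.8, 4, 5.6, 8, 11, 16, 22]
shutter_times = [1/30, 1/60, 1/125, 1/250, 1/500]

print(exposure_value(8, 1/125))   # approximately 13 EV
```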

Ex 2.6: Noise level calibration Estimate the amount of noise in your camera by taking repeated shots of a scene with the camera mounted on a tripod. (Purchasing a remote shutter release is a good investment if you own a DSLR.) Alternatively, take a scene with constant color regions (such as a color checker chart) and estimate the variance by fitting a smooth function to each color region and then taking differences from the predicted function.

1. Plot your estimated variance as a function of level for each of your color channels separately.

2. Change the ISO setting on your camera; if you cannot do that, reduce the overall light in your scene (turn off lights, draw the curtains, wait until dusk). Does the amount of noise vary a lot with ISO/gain?

3. Compare your camera to another one at a different price point or year of make. Is there evidence to suggest that “you get what you pay for”? Does the quality of digital cameras seem to be improving over time?
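A minimal sketch of the repeated-shots variant (part 1) is given below; it assumes the shots are already aligned (tripod) and loaded as same-sized RGB arrays, and the binning scheme is an arbitrary choice.

```python
import numpy as np

def noise_vs_level(images, n_bins=32):
    """Estimate sensor noise from repeated shots of a static scene.

    `images` is a list of identically sized RGB float arrays.  The per-pixel
    mean and variance are computed across the shots, then the variances are
    binned by mean level so you can plot variance vs. intensity per channel.
    """
    stack = np.stack(images).astype(np.float64)     # (num_shots, H, W, 3)
    mean = stack.mean(axis=0)
    var = stack.var(axis=0, ddof=1)
    curves = []
    for ch in range(3):
        m, v = mean[..., ch].ravel(), var[..., ch].ravel()
        bins = np.linspace(m.min(), m.max(), n_bins + 1)
        idx = np.digitize(m, bins) - 1
        level = [m[idx == b].mean() for b in range(n_bins) if np.any(idx == b)]
        noise = [v[idx == b].mean() for b in range(n_bins) if np.any(idx == b)]
        curves.append((np.array(level), np.array(noise)))
    return curves   # one (level, variance) curve per color channel
```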

Ex 2.7: Gamma correction in image stitching Here's a relatively simple puzzle. Assume you are given two images that are part of a panorama that you want to stitch (see Chapter 9).

The two images were taken with different exposures, so you want to adjust the RGB values so that they match along the seam line. Is it necessary to undo the gamma in the color values in order to achieve this?
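To make the question concrete, here is a hedged sketch of what the "undo the gamma" pipeline might look like (an assumed gamma of 2.2, a simple gain estimated along the seam); whether the linearization step is actually necessary is exactly what the exercise asks you to decide.

```python
import numpy as np

def match_exposure(img_a, img_b, seam_mask, gamma=2.2):
    """Sketch of one way to equalize two overlapping images before stitching.

    Both images (float, 0..1) are mapped back to roughly linear intensities by
    undoing an assumed gamma, a gain is estimated from the pixels along the
    seam, and the adjusted image is re-encoded with the same gamma.
    """
    lin_a = img_a ** gamma
    lin_b = img_b ** gamma
    gain = lin_a[seam_mask].mean() / lin_b[seam_mask].mean()
    return np.clip(lin_b * gain, 0, 1) ** (1 / gamma)
```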

Ex 2.8: Skin color detection Devise a simple skin color detector (Forsyth and Fleck 1999; Jones and Rehg 2001; Vezhnevets, Sazonov, and Andreeva 2003; Kakumanu, Makrogiannis, and Bourbakis 2007) based on chromaticity or other color properties.

1. Take a variety of photographs of people and calculate the xy chromaticity values for each pixel.

2. Crop the photos or otherwise indicate with a painting tool which pixels are likely to be skin (e.g. face and arms).

3. Calculate a color (chromaticity) distribution for these pixels. You can use something as simple as a mean and covariance measure or as complicated as a mean-shift segmentation algorithm (see Section 5.3.2). You can optionally use non-skin pixels to model the background distribution.

4. Use your computed distribution to find the skin regions in an image. One easy way to visualize this is to paint all non-skin pixels a given color, such as white or black.

5. How sensitive is your algorithm to color balance (scene lighting)?

6. Does a simpler chromaticity measurement, such as a color ratio (2.116), work just as well?
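A minimal sketch of the mean-and-covariance version (parts 3 and 4) follows. For brevity it uses rg chromaticity, (r, g) = (R, G)/(R + G + B), as a stand-in for the xy values the exercise mentions; converting RGB to XYZ first would be more faithful, and the threshold is an arbitrary example.

```python
import numpy as np

def fit_skin_model(rgb_pixels):
    """Fit a single Gaussian to the chromaticities of labeled skin pixels.

    `rgb_pixels` is an N x 3 float array of pixels marked as skin.  The model
    is a deliberately simple mean and covariance in (r, g) chromaticity space.
    """
    s = rgb_pixels.sum(axis=1, keepdims=True) + 1e-8
    chroma = rgb_pixels[:, :2] / s
    return chroma.mean(axis=0), np.cov(chroma.T)

def skin_mask(image, mean, cov, thresh=6.0):
    """Mark pixels whose chromaticity lies within a Mahalanobis threshold."""
    h, w, _ = image.shape
    pix = image.reshape(-1, 3).astype(np.float64)
    s = pix.sum(axis=1, keepdims=True) + 1e-8
    d = pix[:, :2] / s - mean
    m2 = np.einsum('ni,ij,nj->n', d, np.linalg.inv(cov), d)
    return (m2 < thresh).reshape(h, w)
```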

Ex 2.9: White point balancing—tricky A common (in-camera or post-processing) technique for performing white point adjustment is to take a picture of a white piece of paper and to adjust the RGB values of an image to make this a neutral color.

1. Describe how you would adjust the RGB values in an image given a sample “white color” of (R_w, G_w, B_w) to make this color neutral (without changing the exposure too much).

2. Does your transformation involve a simple (per-channel) scaling of the RGB values or do you need a full 3 × 3 color twist matrix (or something else)?

3. Convert your RGB values to XYZ. Does the appropriate correction now only depend on the XY (or xy) values? If so, when you convert back to RGB space, do you need a full 3 × 3 color twist matrix to achieve the same effect?

4. If you used pure diagonal scaling in the direct RGB mode but end up with a twist if you work in XYZ space, how do you explain this apparent dichotomy? Which approach is correct? (Or is it possible that neither approach is actually correct?)
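As a starting point for part 1, one common answer is a diagonal (per-channel) scaling; the sketch below normalizes to the mean level of the white sample so the overall exposure stays roughly unchanged. Whether this diagonal form is actually sufficient, in RGB or in XYZ, is what the remaining parts ask you to analyze.

```python
import numpy as np

def white_balance_diagonal(image, white_rgb):
    """Per-channel (diagonal) white balance from a sampled "white" color.

    Each channel is scaled so the sampled color maps to a neutral gray at the
    sample's mean level, keeping overall brightness roughly constant.
    """
    white = np.asarray(white_rgb, dtype=np.float64)
    gains = white.mean() / np.maximum(white, 1e-8)
    return image.astype(np.float64) * gains
```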

If you want to find out what your camera actually does, continue on to the next exercise.

Ex 2.10: In-camera color processing—challenging If your camera supports a RAW pixel mode, take a pair of RAW and JPEG images, and see if you can infer what the camera is doing when it converts the RAW pixel values to the final color-corrected and gamma-compressed eight-bit JPEG pixel values.

1. Deduce the pattern in your color filter array from the correspondence between co-located RAW and color-mapped pixel values. Use a color checker chart at this stage if it makes your life easier. You may find it helpful to split the RAW image into four separate images (subsampling even and odd columns and rows) and to treat each of these new images as a “virtual” sensor.

2. Evaluate the quality of the demosaicing algorithm by taking pictures of challenging scenes which contain strong color edges (such as those shown in Section 10.3.1).

3. If you can take the same exact picture after changing the color balance values in your camera, compare how these settings affect this processing.

4. Compare your results against those presented by Chakrabarti, Scharstein, and Zickler (2009) or use the data available in their database of color images.[26]

[26] http://vision.middlebury.edu/color/
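For part 1, splitting the RAW mosaic into four "virtual" sensor images is only a few lines of NumPy; the function name and the 2x2 (e.g. RGGB) tile assumption below are illustrative only.

```python
import numpy as np

def split_bayer(raw):
    """Split a RAW mosaic into four sub-images, one per color filter site.

    Subsampling even/odd rows and columns separates the four positions of a
    2x2 color filter array tile, so each sub-image can be compared against the
    co-located JPEG pixels to deduce the CFA pattern and color processing.
    """
    return {
        "r00": raw[0::2, 0::2],
        "r01": raw[0::2, 1::2],
        "r10": raw[1::2, 0::2],
        "r11": raw[1::2, 1::2],
    }
```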

Chapter 3

Image processing

3.1 Point operators . . . 101
    3.1.1 Pixel transforms . . . 103
    3.1.2 Color transforms . . . 104
    3.1.3 Compositing and matting . . . 105
    3.1.4 Histogram equalization . . . 107
    3.1.5 Application: Tonal adjustment . . . 111
3.2 Linear filtering . . . 111
    3.2.1 Separable filtering . . . 115
    3.2.2 Examples of linear filtering . . . 117
    3.2.3 Band-pass and steerable filters . . . 118
3.3 More neighborhood operators . . . 122
    3.3.1 Non-linear filtering . . . 122
    3.3.2 Morphology . . . 127
    3.3.3 Distance transforms . . . 129
    3.3.4 Connected components . . . 131
3.4 Fourier transforms . . . 132
    3.4.1 Fourier transform pairs . . . 136
    3.4.2 Two-dimensional Fourier transforms . . . 140
    3.4.3 Wiener filtering . . . 140
    3.4.4 Application: Sharpening, blur, and noise removal . . . 144
3.5 Pyramids and wavelets . . . 144
    3.5.1 Interpolation . . . 145
    3.5.2 Decimation . . . 148
    3.5.3 Multi-resolution representations . . . 150
    3.5.4 Wavelets . . . 154
    3.5.5 Application: Image blending . . . 160
3.6 Geometric transformations . . . 162
    3.6.1 Parametric transformations . . . 163
    3.6.2 Mesh-based warping . . . 170
    3.6.3 Application: Feature-based morphing . . . 173
3.7 Global optimization . . . 174
    3.7.1 Regularization . . . 174
    3.7.2 Markov random fields . . . 180
    3.7.3 Application: Image restoration . . . 192
3.8 Additional reading . . . 192
3.9 Exercises . . . 194


Figure 3.1 Some common image processing operations: (a) original image; (b) increased contrast; (c) change in hue; (d) “posterized” (quantized colors); (e) blurred; (f) rotated.

