
Master Thesis

Coherent Texture Synthesis for Photograph Relighting and Texture Replacement

  

 

Student:

Yuan-Chung Shen

Advisors:

Yung-Yu Chuang, PhD.

Ming Ouhyoung, PhD.

Department of Computer Science and Information Engineering

National Taiwan University, Taipei, Taiwan


Abstract

In this thesis, we present a texture synthesis method for realistic rendering that does not require acquiring the actual reflectance properties. The method is based on the observation that most objects have texture repeated, explicitly or implicitly, at different locations on their surface. These samples form a very sparse set of BTFs. Our goal is to extract the underlying texture from photographs of the object and to transfer it to other objects under different illumination. The method is very simple and provides pseudo-BTF reflectance using texture synthesis approaches.

The input data is acquired directly from a few photographs. These photographs capture the appearance of the object under different lighting directions from a fixed viewpoint. We use Photometric Stereo to estimate the normal map and the unshaded image in preprocessing.

Then, different texture points (texels) are recognized by their pixel values with neighborhood matching, using segmentation and labeling. With these clustered texels, we can transfer the texture to another object and render it under different illumination conditions.


Contents

1 Introduction 1
1.1 Problem statement 2
1.2 System overview and thesis organization 3

2 Related work 5
2.1 BRDFs and BTFs 5
2.2 Photometric stereo 7
2.3 Texture synthesis over surfaces 7

3 Acquisition 9
3.1 Setup and acquisition 9
3.1.1 Light calibration 9
3.2 Photometric stereo 10

4 Photograph relighting 13
4.1 Texel clustering 13
4.2 Shading maps 14
4.3 Relighting 16

5 Texture replacement 19
5.1 The approach 19
5.1.1 Image analogies 19
5.1.2 Feature vectors 21
5.2 Rendering 21

6 Results 23
6.1 Synthetic objects 23
6.2 Real objects 23
6.3 Environment mapping 26

7 Conclusion and future work 29
7.1 Conclusion 29
7.2 Future work 29

List of Figures

1.1 System Overview 3
3.1 Light Calibration 10
3.2 Photometric Stereo. The left image is one of the captured "melon" photographs, the middle one is the recovered normal map, and the right one is the unshaded image. 11
4.1 Texel Clustering 14
4.2 Shading Maps. 15
4.3 Pseudo BTF reconstruction. Applying shading maps on the original object. The radiance increases linearly for each image. 15
4.4 Relighting. These images are the results of relighting under different illumination models and different light directions, using the constructed shading maps. The left columns are the Lambertian model, while the right ones are the Phong model. 17
5.1 Image Analogies. A and A' are the normal maps of the source and target objects, while B and B' are the unshaded images of the source and target objects. 20
5.2 Results. These images are the results of relighting the synthesized target object under the Lambertian model (left column) and the Phong model (right column) with different light directions. 22
6.1 Texture transfer and relighting for synthetic objects. The images in the left column are rendered in OpenGL as our source objects. The images in the middle and right columns are generated by our method and rendered in the Lambertian model and the Phong model respectively. 24
6.2 Texture transfer and relighting for real objects. The images in the left column are the source objects. The images in the middle and right columns are generated by our method and rendered in the Lambertian model and the Phong model respectively. 25
6.3 Texture transfer and relighting for real objects. The images in the left column are the source objects. The images in the middle and right columns are generated by our method and rendered in the Lambertian model and the Phong model respectively. 27
6.4 Rendering with environment mapping. 27

Chapter 1

Introduction

Rendering objects realistically has been one of the major goals of research and commercial work for years. Computer graphics applications increasingly require textures to represent delicate surfaces with rich reflectance properties without modeling geometric details. However, constructing a surface with a complicated appearance manually is an extremely tough and tedious task. Besides, with the development of graphics hardware, programmers in the game and film industries often use shading languages to develop shaders that model the appearance of various materials. Nevertheless, it is still very common that a generalized surface shader takes dozens of arguments, and tuning these parameters exhausts developers.

A number of techniques have been discussed to address this problem, and many solutions have been proposed to approximate the goal. Algorithms exist for synthesizing a wide variety of textures over surfaces, but they may require detailed geometric information. On the other hand, richer rendering effects with complex reflectance properties are desired in practice. Typically, generating realistic results requires knowledge of accurate light-transport models, such as Bidirectional Reflectance Distribution Functions (BRDFs) or Bidirectional Texture Functions (BTFs). Unfortunately, full coverage of all viewing and lighting directions is difficult to acquire or estimate. Recently, image-based rendering has come to notice since appropriate synthesis of real photographs provides appealing results.

However, these methods are often not easy to carry out. Thus, our goal is to create realistic results without much difficulty.

Generally speaking, photographs are usually taken under a complex illumination environment (e.g. outdoors), and the lighting condition is not acquired at the same time. Also, we have no idea about the geometry of the objects in the photographs. Thus, analyzing the shading of objects to recover their reflectance seems infeasible, since the illumination and geometry remain unknown. In our work, we do not use techniques such as shape from shading to analyze the shading of objects in photographs. Instead, we directly use the distribution of the shading as a pseudo-reflectance for rendering. Although this analysis is not physically correct, our pseudo rendering still looks plausible to the human visual system, since our eyes are tolerant of such slight defects.

The idea of this work is based on the observation of [HE03]. They use a photograph of a sphere or ellipsoid of the source material as the exemplar and generate a realistic rendering of an input 3D surface by sampling from it non-uniformly. Their method is very simple but produces realistic results. However, they can only render the target model under the same illumination as the input photograph. If the input illumination changes, the target rendering requires re-synthesis, which causes inconsistency in the textured appearance. Our work extends their approach and tries to keep the texture consistent under different illumination.

1.1 Problem statement

Our work aims at photo-realistic rendering from the perspective of texture synthesis techniques.

For non-emitting objects, the image formation can be computed according to the rendering equation [Kaj86], expressed as

I(x) = ∫Ω ρ(x, ωi, ωo) L(ωi) (ωi · N) dωi,

where I is the pixel value as a function of position x, L is the incident radiance arriving from direction ωi over the hemisphere Ω around the surface normal N, and ρ is the reflectance function which governs the amount of light reflected toward the viewpoint in direction ωo from all incident directions ωi. With the purpose of manipulating textured objects, we are interested in two phases:

• fill ρ from a very sparse set of samples given by the input photographs under different illumination L(ωi)

• re-arrange ρ for all x by matching new N in order to replace the texture on other surfaces, using texture synthesis methods

However, in the above equation there are two unknowns: the reflectance function ρ and the surface normal N. All we know are the pixel values I and the light L. Because our source object is illuminated by a single point light source, we can eliminate the integration over the hemisphere. What we do is to fill in the function ρ and use it for another surface.
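For a single distant point light, the integral above reduces to I(x) = ρ(x, ωi, ωo) L (ωi · N(x)). The following is a minimal sketch of that per-pixel shading under a Lambertian ρ; the function name and array shapes are illustrative, not part of the thesis.

```python
import numpy as np

def shade_lambertian(albedo, normals, light_dir, light_radiance=1.0):
    """Evaluate I(x) = rho(x) * L * max(N(x) . omega_i, 0) per pixel.

    albedo:    (H, W, 3) per-pixel reflectance (the unshaded image)
    normals:   (H, W, 3) unit surface normals
    light_dir: (3,) unit vector toward the single distant light
    """
    n_dot_l = np.clip(normals @ light_dir, 0.0, None)    # (H, W)
    return albedo * light_radiance * n_dot_l[..., None]  # (H, W, 3)
```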

Figure 1.1: System Overview

1.2 System overview and thesis organization

Figure 1.1 shows the overview of this thesis. The input data is acquired by a simple acquisition process: we fix the viewpoint and change the lighting direction to capture several photographs. Then, we use Photometric Stereo to estimate the normal map and the unshaded image of the source object. From the unshaded image, we can guess the distribution of the underlying texture points with our clustering method and generate an index map for the object. Such an index map lets us build shading maps from the input images for real-time rendering. Besides, we can also use this index map to texture other objects, jointly considering texture coherence and surface geometry, using Image Analogies.

The thesis is organized as follows. Related work on illumination models and texture synthesis is summarized briefly in Chapter 2. Data acquisition, calibration, and preprocessing are described in Chapter 3. The key component, texel clustering, is introduced in Chapter 4. After recognizing texels, texture transfer with Image Analogies is described in Chapter 5. Experimental results are given in Chapter 6. We give a brief conclusion and discuss future work in Chapter 7.


Chapter 2

Related work

One of the major challenges in computer graphics is photo-realistic rendering of complex materials. For accurate rendering of real-world objects, image-based rendering plays an important part in this area. An important problem is the acquisition of the geometry and reflectance properties of real objects. In recent years, many researchers have made efforts to use multiple photographs to recover the reflection model of a given material for photo-realistic rendering. However, there are still many unknown variables in real situations, which require more sophisticated and elaborate models of surface properties. Our work treats several photographs of a source material as a set of BTF samples and samples from them to synthesize a rendering. For this purpose, several pieces of related research in different areas are discussed in this chapter.

2.1 BRDFs and BTFs

In order to describe light transport in the real world, many illumination models have been proposed for realistic rendering. A simplified local reflection model, the Bidirectional Reflectance Distribution Function (BRDF), characterizes light transport at an idealized surface point. A BRDF is a function of four variables defined on the cross product of two hemispheres, and it describes the ratio of reflected radiance to incident flux per unit area:

ρ(θi, φi, θr, φr) = dLr(θr, φr) / dEi(θi, φi),

where Lr is the reflected radiance, Ei is the incident irradiance, (θi, φi) are the incident angles, and (θr, φr) are the reflected angles.


Matusik et al. [MPBM03] introduced a data-driven reflectance model for isotropic BRDFs.

Each acquired image of a sample sphere represents a dense set of BRDF samples, and each pixel of the sphere can be treated as a separate BRDF measurement. Nevertheless, traditional methods for capturing full coverage of BRDFs take a lot of time and storage. Recent research estimates the reflectance properties of a material by performing sparse sampling with interpolation and extrapolation. Kautz et al. [KSS+04] proposed a technique for the easy acquisition of realistic materials without acquiring the actual BRDFs. They captured several images for different light directions at a fixed orthogonal viewing direction.

This set of acquired data forms shading maps, in which the average radiance increases linearly from image to image. Therefore, real-time rendering can be performed by a simple table lookup into these shading maps according to the chosen BRDF. Using extremely simple equipment, Paterson et al. [PCF05] presented a new method for capturing BRDF parameters and surface geometry with a standard digital camera and an attached flash as a portable capture device. They can capture materials with spatially varying BRDFs and recover geometry and BRDF parameters with few samples on a roughly planar object.

Although BRDFs describe the reflectance properties of a material, there are still some effects they do not account for. In the real world, many materials exhibit rich reflection properties that arise from small-scale geometric details.

Dana et al. [DvGNK99] proposed a new texture representation, the Bidirectional Texture Function (BTF), to model the visual appearance of both reflectance and meso-structure. The BTF is a six-dimensional function, written BTF(x, y, θi, φi, θr, φr), where (x, y) are the texture coordinates, (θi, φi) is the light direction, and (θr, φr) is the viewing direction. It extends conventional 2D texture with varying lighting and viewing directions, capturing many visual effects of real-world materials such as translucency, inter-reflection, shadowing, and occlusion. Unfortunately, BTF acquisition needs dense samples to recover the reflection properties under different viewing and lighting directions. Liu et al. [LYS01] introduced a novel approach for synthesizing large-scale BTFs. They recover an approximate surface geometry of the measured sample using a shape-from-shading method and regard the height field as a gray-scale image to synthesize a new surface with traditional 2D texture synthesis algorithms. Using a reference BTF image, they can synthesize a BTF on the new surface geometry by applying local appearance-preserving texture synthesis.

Without much difficulty, Haro et al. [HE03] presented a computationally inexpensive method for rendering a 3D surface using a single photograph. The concept of Haro et al. is like the inverse of the work by Hertzmann and Seitz [HS03], which also uses a reference sphere but reconstructs the reflection properties from the image. Haro et al. use a photograph of a sphere or ellipsoid of the source material as the exemplar, sample from it non-uniformly, and coherently produce a realistic rendering of an input 3D model from a fixed viewpoint and light direction. They use Image Analogies [HJO+01] as their matching algorithm, which considers normals and texture simultaneously as feature vectors to synthesize a 3D surface.

The key idea is that the shape, i.e. a sphere or ellipsoid, provides enough coverage of the complete set of normals, and these can be used as a good estimate for rendering a complex surface made of the same material. Their method is very simple, but it fills the niche between pure 2D texture synthesis methods and full BTF synthesis.

2.2 Photometric stereo

For image-based rendering, we may need to know the underlying surface geometry and lighting condition. Photometric Stereo was first introduced by Woodham [Woo80]. It shows that we can estimate local surface orientation from differently shaded images. Under different illumination, the intensity of the same facet depends on both the local surface normal and the lighting direction. Therefore, we can reconstruct the underlying geometry and material from these images.

Goldman et al. [GCHS04] extend the work of Hertzmann and Seitz [HS03] and propose a method to extract per-pixel BRDFs along with 3D shape. Their photographs are captured from the same viewpoint under different illumination. They observed that materials can be described as a combination of a few fundamental materials. Therefore, they can reconstruct the shape and the spatially-varying BRDFs by optimizing the BRDF parameters of an objective function. Based on the shape and the material reconstructed by their method, they can also perform illumination editing and material transfer between models.

2.3 Texture synthesis over surfaces

There are many methods to wrap a rectangular texture image onto a surface. Different nearest-neighbor search methods have been shown to produce results of good quality. Their goal is to synthesize textures on the surface directly, avoiding noticeable seams between synthesized patches and minimizing distortion of the texture pattern. Wei and Levoy [WL01] demonstrated the ability to synthesize general textures over arbitrary manifold surfaces with dense polygon meshes. With a similar idea, Turk [Tur01] used a user-specified vector field to indicate the orientation of the synthesized texture. Considering a general type of texture, Ying et al. [YHBZ01] synthesize textures on surfaces from examples. Although all of this work produces good synthesis results, these algorithms require detailed information about the polygon meshes.

Texture synthesis has been widely researched for years. By copying small patches repeatedly, texture synthesis can generate a larger image for practical rendering. Different from traditional synthesis methods, Lefebvre and Hoppe [LH05] introduced a novel texture synthesis scheme based on neighborhood matching. Their algorithm can be adapted to parallel synthesis, creating unbounded texture without the limited variety of tiles. For each exemplar, they synthesize coordinates instead of pixel values into a coarse-to-fine pyramid. We borrow this idea from their work for our synthesis. Besides, for texture variability, they can add jitter and more user control to perturb the coordinates, which determines the resulting appearance.

Recently, several photograph editing techniques have been proposed to easily manipulate appearance in images. With a single photograph, Fang and Hart [FH04] developed a sophisticated tool that combines shape-from-shading with distorted graph-cut textures to conveniently and robustly synthesize textures on objects in photographs. Their method uses a simple Lambertian reflectance model to reconstruct the surface normals of small patches and segments the patches by the reconstructed normals. Textures can then be applied to other objects by warping each patch segment according to its orientation and using a graph-cut technique to generate seamless textures. However, this method is limited to simple lighting conditions with smooth diffuse surfaces, because the recovered normals are not accurate if the shading is more complex. Like the work of Fang and Hart [FH04], Khan et al. [KRFB06] proposed an automatic texture and material editing method. They assume that human vision is very tolerant of physical inaccuracies, such as shadow, translucency, and transparency. Based on this assumption, they estimate surface geometry and the environment lighting condition from a single photograph with simple approaches. They can then apply a new texture or other reflectance properties, such as a BRDF or translucency, to the recovered surface. The results are very appealing, though they are not physically correct.


Chapter 3

Acquisition

In this chapter, we briefly describe the acquisition process of our system. The acquisition and the preprocessing are very simple. We only need to capture several (at least three) photographs from the same viewpoint with different lighting directions. After capturing these photographs, we recover the normal map and the unshaded image in preprocessing.

3.1 Setup and acquisition

Our acquisition procedure is as follows. The acquired object is placed in a black box and illuminated by a distant point light source from different incident directions. A mirror ball is put beside the acquired object for light calibration. The camera is treated as orthographic and the light source as a distant light. We assume that the target is a Lambertian object and that the black box diminishes the effect of ambient light. The distance between the object and the camera is about 2 meters, and the distance between the object and the light is about 1 meter. During the acquisition, we capture one image per light direction, for a total of about 9 photographs.

3.1.1 Light calibration

After acquisition, we need to estimate the light direction for each captured photograph.

For light calibration, we locate the brightest pixel on the mirror ball and trace the lighting direction from that pixel. We assume that the mirror ball is a perfectly reflective sphere. The location of the brightest pixel p corresponds to the point where mirror reflection toward the viewer occurs at the surface tangent plane. The light direction L is then on the opposite side of the normal N from the viewing direction V, i.e. (0, 0, 1), as shown in Figure 3.1. We can then compute L from N and V:

L = N cos θ + (N cos θ − V)
  = 2N cos θ − V
  = 2N (N · V) − V

Figure 3.1: Light Calibration

3.2 Photometric stereo

The goal of preprocessing is to recover the normal map and the unshaded image of the target object. Because we assume that the target is a diffuse object, for normal recovery and unshading we follow the Photometric Stereo approach [Woo80]:

I = ρ N · L  ⟹  I · L⁻¹ = ρ N, so that ρ = |I · L⁻¹| and N = (1/ρ) I · L⁻¹,

where I is the intensity, L = [Lx Ly Lz] is the light direction, N = [Nx Ny Nz] is the normal, and ρ is the albedo of each pixel. N gives the desired normal map, and we take ρ as our unshaded image, since we assume that the source is a diffuse object. We need only three images to recover N, and the inverse L⁻¹ exists if and only if the three light vectors Li do not lie in the same plane. In our case, for robustness, we prefer to capture more than three photographs and solve an over-determined linear system:

[ I1  I2  ⋯  In ] ·
⎡ L1x  L1y  L1z ⎤⁻¹
⎢ L2x  L2y  L2z ⎥
⎢  ⋮    ⋮    ⋮  ⎥
⎣ Lnx  Lny  Lnz ⎦
 = ρ [ Nx  Ny  Nz ]

Figure 3.2: Photometric Stereo. The left image is one of the captured "melon" photographs, the middle one is the recovered normal map, and the right one is the unshaded image.
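A compact sketch of this recovery using a least-squares solve (NumPy's lstsq plays the role of the pseudo-inverse for the over-determined case); array shapes and names are assumptions, not the thesis' implementation:

```python
import numpy as np

def photometric_stereo(images, light_dirs):
    """Recover per-pixel albedo and normal from n >= 3 images.

    images:     (n, H, W) grayscale intensities from a fixed viewpoint
    light_dirs: (n, 3) unit light directions, not all coplanar
    Solves I = rho * (N . L) per pixel in the least-squares sense.
    """
    n, H, W = images.shape
    I = images.reshape(n, -1)                                 # (n, H*W)
    G, _, _, _ = np.linalg.lstsq(light_dirs, I, rcond=None)   # (3, H*W), G = rho * N
    rho = np.linalg.norm(G, axis=0)                           # albedo (unshaded image)
    N = G / np.maximum(rho, 1e-8)                             # unit normals
    return rho.reshape(H, W), N.T.reshape(H, W, 3)
```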

Chapter 4

Photograph relighting

After acquisition, we have several samples of a static object from a fixed viewpoint under different lighting directions. In this chapter, we present a new idea for pseudo-BTF reconstruction. We first cluster all pixels into different texture points (texels). Each cluster then forms shading maps from the sampled photographs for later rendering.

We can then illuminate the object under other lighting conditions, or apply any BRDF to it to alter its appearance with different reflective behavior, by a simple table lookup into these shading maps.

4.1 Texel clustering

Our observation comes from the texture mapping technique in traditional computer graphics applications. If we would like to display an object with a complex appearance, the easiest way is to glue on a texture image with such a pattern. This texture image is duplicated and pasted onto the surface at different locations according to its texture coordinates. However, for real objects, we cannot figure out the texture coordinate of each pixel. Our approach is based on clustering and labeling.

The input of our texel clustering method is the unshaded image of the source object, i.e. the albedos. We suppose that a certain texture pattern is laid on its surface. If we can reconstruct the underlying texture by analyzing its texels, we can retrieve the texture for further rendering or texture replacement. We assume that different texture points (texels) have different albedos and are coherent with their neighbors in appearance. The goal of texel clustering is to recognize different texels and to group them, as shown in Figure 4.1.

We take a two-step approach for our clustering method:


Figure 4.1: Texel Clustering

• group all pixels by their albedos

• cluster each pixel of the same albedo by neighborhood matching

We first group pixels only by their albedos. For acceleration, we use the k-means algorithm to minimize the total intra-cluster variance:

V = Σ_{i=1}^{k} Σ_{j∈Ci} | xj − ui |²,

where there are k clusters Ci, i = 1, 2, …, k, ui is the centroid of cluster Ci, and xj = (Rj, Gj, Bj) ∈ Ci.

After picking out the pixels with the same albedo, we further cluster them by neighborhood matching, which makes the texels more consistent. We again use the k-means algorithm as above, but here xj is built from the colors (Rt, Gt, Bt) of the pixels t in the neighborhood of j.

With this two-step clustering method, we can cluster all pixels into several groups, and each group represents a texel of the underlying texture of the object. We give each group a unique label, i.e. an index. Therefore, we can relight the object or synthesize the texture on another object through these indices into the texels.
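A rough sketch of the two-step clustering, using scikit-learn's k-means for both steps; the cluster counts and window size are illustrative choices, not values prescribed by the thesis:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_texels(albedo, k_albedo=8, k_neighbor=4, window=3):
    """Two-step texel clustering on the unshaded image (albedo).

    Step 1: group pixels by albedo color with k-means.
    Step 2: within each albedo group, re-cluster by the colors of the
            window x window neighborhood, giving the final texel indices.
    """
    H, W, _ = albedo.shape
    colors = albedo.reshape(-1, 3)

    # Step 1: cluster by albedo
    coarse = KMeans(n_clusters=k_albedo, n_init=5).fit_predict(colors)

    # Neighborhood feature vectors: colors of the surrounding window
    pad = window // 2
    padded = np.pad(albedo, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    patches = np.stack([padded[dy:dy + H, dx:dx + W]
                        for dy in range(window) for dx in range(window)], axis=-2)
    feats = patches.reshape(-1, window * window * 3)

    # Step 2: refine each coarse group by neighborhood matching
    index_map = np.zeros(H * W, dtype=np.int32)
    next_label = 0
    for c in range(k_albedo):
        sel = np.where(coarse == c)[0]
        if len(sel) == 0:
            continue
        kk = min(k_neighbor, len(sel))
        sub = KMeans(n_clusters=kk, n_init=3).fit_predict(feats[sel])
        index_map[sel] = sub + next_label
        next_label += kk
    return index_map.reshape(H, W)
```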

4.2 Shading maps

Before rendering the object, we first generate shading maps S(k, r) = (R, G, B), where k is the index of a certain texel and r corresponds to the rendering coefficient, e.g. N · L.

Shading maps are like a stack of images of the underlying texture under different lighting conditions. Once we want to render an object with such a texture at a specific rendering coefficient r, rendering can easily be achieved by a simple table lookup into these shading maps, as shown in Figure 4.2.

Figure 4.2: Shading Maps.

Figure 4.3: Pseudo BTF reconstruction. Applying shading maps to the original object. The radiance increases linearly for each image.

After clustering the pixels into separate texels, for each texel we collect all its pixels, which have different normals and lighting directions, from the input images. We then build a list from this collection sorted by N · L. Re-sampling this list uniformly generates a series of variations of the texel from dark to bright. This is done by querying different but uniformly distributed ri, with linear interpolation, extrapolation, or even scaling from the sorted list:

ri = i / N,  i = 0, 1, …, N,

where N is the number of acquired levels of the shading maps. Repeating this for all texels forms the shading maps, which can also be seen as a sparse set of the texture under different rendering coefficients, as seen in Figure 4.3.
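The construction can be sketched as follows; here the uniform resampling queries r in [0, 1] and np.interp clamps outside the observed range of N · L, which is only one possible reading of the interpolation/extrapolation step, not the thesis' exact implementation:

```python
import numpy as np

def build_shading_maps(images, normals, light_dirs, index_map, levels=10):
    """Build shading maps S(k, r): for each texel k, `levels` colors from dark to bright.

    images:     (n, H, W, 3) captured photographs
    normals:    (H, W, 3) recovered normal map
    light_dirs: (n, 3) calibrated light directions
    index_map:  (H, W) texel index per pixel from the clustering step
    Returns a dict: texel index -> (levels, 3) table indexed by r_i = i / (levels - 1).
    """
    shading_maps = {}
    for k in np.unique(index_map):
        mask = index_map == k
        samples = []
        for img, L in zip(images, light_dirs):
            n_dot_l = normals[mask] @ L                 # rendering coefficient N . L
            samples.append(np.column_stack([n_dot_l, img[mask]]))
        samples = np.concatenate(samples)
        samples = samples[np.argsort(samples[:, 0])]    # sorted list by N . L

        # query uniformly distributed r_i with linear interpolation;
        # np.interp clamps where the thesis also allows extrapolation or scaling
        r = np.linspace(0.0, 1.0, levels)
        table = np.column_stack([np.interp(r, samples[:, 0], samples[:, c + 1])
                                 for c in range(3)])
        shading_maps[k] = table
    return shading_maps
```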

4.3 Relighting

After generating shading maps, rendering with different lighting conditions is fairly simple.

First, we compute the rendering coefficient r for each pixel p. Although we assume diffuse objects during shading-map construction, we can use any other BRDF for r instead, such as the Phong model, as well as the Lambertian model, i.e. N · L, as shown in Figure 5.2. As stated before, after deciding the rendering coefficient r, we do a table lookup into our shading maps S(index(p), r) and copy the (R, G, B) values to the target object. We also perform linear interpolation between different r values for smooth transitions under varying lighting conditions. If the desired value r is beyond the maximum or below the minimum in the shading maps, we can either clamp r to the maximum or minimum or scale the pixel value for the desired r.
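A minimal sketch of this lookup, assuming the rendering coefficient has been normalized to [0, 1] and reusing the shading-map tables built above; clamping stands in for the scaling alternative mentioned in the text:

```python
import numpy as np

def relight(index_map, r_values, shading_maps, levels=10):
    """Relight an object by table lookup into the shading maps.

    index_map:    (H, W) texel index per pixel
    r_values:     (H, W) rendering coefficient per pixel, e.g. N . L for a
                  Lambertian model or a Phong term, assumed normalized to [0, 1]
    shading_maps: dict texel index -> (levels, 3) table from dark to bright
    """
    H, W = index_map.shape
    out = np.zeros((H, W, 3))
    grid = np.linspace(0.0, 1.0, levels)
    for k, table in shading_maps.items():
        mask = index_map == k
        r = np.clip(r_values[mask], 0.0, 1.0)   # clamp outside the acquired range
        # linear interpolation between neighboring shading-map levels
        out[mask] = np.column_stack([np.interp(r, grid, table[:, c]) for c in range(3)])
    return out
```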


Figure 4.4: Relighting. These images are the results of relighting under different illumination models and different light directions, using the constructed shading maps. The left columns are the Lambertian model, while the right ones are the Phong model.


Chapter 5

Texture replacement

In this chapter, we replace the texture of the target object using the index map generated by our texel clustering method. Different from other texture synthesis methods, we synthesize the index map for the target object instead of the pixel values directly. In this way, we can change the illumination after synthesis in the same way as in the previous chapter.

5.1 The approach

After clustering pixels into texels, we can nearly retrieve the texture of the source object. For texture replacement, we use the index map of the source object to generate that of the target object with traditional 2D texture synthesis algorithms. In order to match the surface geometry with the texture, we use normals as well as texture coherence (albedos) for synthesis, since we assume that the source object provides enough coverage of the complete set of normals. Besides, for fluctuating surfaces, we blur the normal map in preprocessing with a low-pass filter to eliminate the influence of high-frequency details and to maintain the coherent structure. Before synthesizing, we first encode the normal maps into color images, i.e. (R, G, B) corresponds to (X, Y, Z). As a result, we can perform matching on 2D images (the encoded images) instead of over the 3D surface.

5.1.1 Image analogies

In order to preserve both small- and large-scale structure on finely detailed surfaces, we use neighborhood matching in multi-resolution pyramids. As in [HE03], we follow Image Analogies [HJO+01] as our matching algorithm, illustrated in Figure 5.1. Our texture transfer algorithm takes a set of four images as input: the normal map A of the source object, the unshaded image B of the source object, the index map I of the source object, and the normal map A′ of the target object; it produces the unshaded image B′ and the index map I′ of the target object as output.

Figure 5.1: Image Analogies. A and A′ are the normal maps of the source and target objects, while B and B′ are the unshaded images of the source and target objects.

The matching procedure is as follows. In the initialization step, three image pyramids (multi-scale representations) are constructed: for the normals A and the albedos B of the source object, and for the normals A′ of the target object. The synthesis algorithm then proceeds from the coarsest level to the finest level, one level at a time, computing the unshaded image B′ of the target object at each level. At each level l, a feature vector F(Bq) is constructed for each pixel q of the source object before synthesis. Each pixel p to be synthesized on the target object B′ is then compared against every q through these feature vectors, in scan-line order. The matching procedure has two phases:

• global search, which attempts to find the closest source pixel, i.e. the one with the minimum distance between the feature vectors F(B′p) and F(Bq)

• coherent search, which tries to preserve coherence with the already synthesized pixels in the neighborhood


For the global search, we use ANN [MA97], a library for approximate nearest neighbor searching, to accelerate the matching procedure. For the coherence search, based on Ashikhmin's approach [Ash01], each already synthesized pixel around the current pixel generates a shifted candidate according to its position. The best-matching pixel is then chosen among the candidates from the global search and the coherence search. This is done for each pixel at each pyramid level, resulting in a synthesized image of the target object. Moreover, in the last step of the synthesis procedure, we also copy the corresponding indices I of the source to the target instead of the albedo itself. Therefore, we get a new index map I′ of the target object referring to the original texture, which enables further rendering effects.
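The per-pixel candidate selection might be sketched as below. SciPy's cKDTree stands in for the ANN library, and the simple "take whichever candidate is closer" rule is a simplification of typical global-versus-coherence weighting; all names and shapes are illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree  # stand-in for the ANN library used in the thesis

def best_match(p, F_target, F_source, synthesized_src_pos, tree, source_shape):
    """Choose a source pixel for target pixel p = (y, x).

    F_target / F_source: (H, W, D) feature vectors (normals + causal albedo neighborhood)
    synthesized_src_pos: dict target pixel -> source pixel already chosen (scan-line order)
    tree: cKDTree built over F_source.reshape(-1, D) for the global search
    """
    fp = F_target[p]

    # global search: approximate nearest neighbor over all source feature vectors
    dist_g, idx_g = tree.query(fp)
    best_dist, best_pos = dist_g, np.unravel_index(idx_g, source_shape)

    # coherence search (Ashikhmin): shift the source positions of already
    # synthesized causal neighbors and keep the best-matching candidate
    y, x = p
    for dy, dx in [(-1, -1), (-1, 0), (-1, 1), (0, -1)]:
        q = (y + dy, x + dx)
        if q not in synthesized_src_pos:
            continue
        sy, sx = synthesized_src_pos[q]
        cand = (sy - dy, sx - dx)                       # shifted candidate
        if not (0 <= cand[0] < source_shape[0] and 0 <= cand[1] < source_shape[1]):
            continue
        d = np.linalg.norm(F_source[cand] - fp)
        if d < best_dist:
            best_dist, best_pos = d, cand
    return best_pos
```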

5.1.2 Feature vectors

For each pixel, a feature vector F is constructed from the normals and albedos in its neighborhood. At the coarsest level, F is composed of the normals within the m-by-m neighborhood and the albedos within the causal (L-shaped) neighborhood around the pixel. At the other levels, F is composed of the normals and albedos within the n-by-n neighborhood at the previous level, plus the normals within the m-by-m neighborhood and the albedos within the causal neighborhood at the current level. Such a feature vector is computed for all pixels during the search and results in a very high-dimensional vector for matching. It not only ensures the consistency of the surface geometry but also preserves the texture coherence of the source object in greater detail.
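A sketch of how such a feature vector could be assembled, assuming (H, W, 3) normal and albedo images; the neighborhood sizes m and n and the coarse-level coordinate mapping (y // 2, x // 2) are illustrative assumptions:

```python
import numpy as np

def feature_vector(normals, albedo, y, x, m=5,
                   prev_normals=None, prev_albedo=None, n=3):
    """Feature vector for pixel (y, x), roughly following Section 5.1.2.

    m x m normals plus the causal (L-shaped) albedo neighborhood at the current
    level; at finer levels, n x n normals and albedos from the previous
    (coarser) level are appended.
    """
    def window(img, cy, cx, size):
        h = size // 2
        pad = np.pad(img, ((h, h), (h, h), (0, 0)), mode="edge")
        return pad[cy:cy + size, cx:cx + size].ravel()

    feats = [window(normals, y, x, m)]

    # causal neighborhood: rows above plus pixels to the left in the current row
    causal = window(albedo, y, x, m).reshape(m, m, -1)
    feats.append(causal[: m // 2].ravel())            # rows above
    feats.append(causal[m // 2, : m // 2].ravel())    # left part of the current row

    if prev_normals is not None:
        feats.append(window(prev_normals, y // 2, x // 2, n))
        feats.append(window(prev_albedo, y // 2, x // 2, n))
    return np.concatenate(feats)
```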

5.2 Rendering

After synthesizing the texture (index map) on the surface of the target object, we can render it under different illumination models as in the previous chapter. First, we generate shading maps from the input images of the source object. We can then estimate the variation from dark to bright for each texel. When rendering the target object, we use the synthesized index map to look up the shading maps. With different lighting conditions or different light reflection models, we obtain different appearances when rendering the target object with the shading maps.


Figure 5.2: Results. These images are the results of relighting the synthesized target object under the Lambertian model (left column) and the Phong model (right column) with different light directions.


Chapter 6

Results

We have implemented our algorithm on an Intel P4 3.2 GHz personal computer, and we present results for four synthetic objects and four real ones. Our method takes about 5 minutes for clustering texels and 3~5 hours for transferring textures, with images of resolution 512x512 pixels and a window size of 5x5 pixels. Rendering runs at close to real-time performance (30 FPS) with 10 levels of shading maps under illumination from one point light source.

6.1 Synthetic objects

Before presenting the final results, we first use some synthetic objects to verify our algorithms.

We expect these objects to give better results because we can obtain nearly perfect normal maps and unshaded images. In Figure 6.1, we show several results on the same object with different kinds of texture. The source teapots are textured with different images in OpenGL, using their texture coordinates. These textures are mostly stochastic, except the last one, "brick". We think that our synthesis method generates better results for stochastic textures than for structured ones. As shown, our method can produce realistic results for the textured target objects.

6.2 Real objects

We show the results for real objects in Figure 6.2. The first three objects are plastic balls wrapped with different pieces of cloth. There are two reasons for choosing these source objects: one is that the sphere shape provides enough coverage of normals for texture transfer, and the other is that such objects are readily available. In the last example, we use a melon as our source object. The melon has an appearance with an irregular net structure which is very different from the others.

Figure 6.1: Texture transfer and relighting for synthetic objects. The images in the left column are rendered in OpenGL as our source objects. The images in the middle and right columns are generated by our method and rendered in the Lambertian model and the Phong model respectively.

Figure 6.2: Texture transfer and relighting for real objects. The images in the left column are the source objects. The images in the middle and right columns are generated by our method and rendered in the Lambertian model and the Phong model respectively.

6.3 Environment mapping

Besides relighting with a single point light source, we can also apply environment mapping to our target objects, as shown in Figure 6.4. This produces more vivid results under complex lighting conditions from the surrounding environment.
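One way to realize this, sketched under the assumption that the environment map has been pre-sampled into a set of directional lights, is to accumulate shading-map lookups over the samples, reusing the relight sketch from Chapter 4; the sampling and normalization scheme is our own choice:

```python
import numpy as np

def relight_environment(index_map, normals, shading_maps, env_dirs, env_colors,
                        levels=10):
    """Approximate environment lighting as a sum of directional samples.

    env_dirs:   (m, 3) unit directions sampled from the environment map
    env_colors: (m, 3) radiance of each sample
    Uses the `relight` lookup sketched earlier for each sampled direction.
    """
    H, W = index_map.shape
    out = np.zeros((H, W, 3))
    for L, c in zip(env_dirs, env_colors):
        r = np.clip(normals @ L, 0.0, 1.0)   # Lambertian coefficient per pixel
        out += c * relight(index_map, r, shading_maps, levels)
    return out / len(env_dirs)               # average over the samples
```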


Figure 6.3: Texture transfer and relighting for real objects. The images in the left column are the source objects. The images in the middle and right columns are generated by our method and rendered in the Lambertian model and the Phong model respectively.

Figure 6.4: Rendering with environment mapping.


Chapter 7

Conclusion and future work

7.1 Conclusion

In this thesis, we present a texture synthesis algorithm for image relighting and texture transfer. The main contribution of our work is that we keep the texture consistent and relightable during synthesis. We perform Photometric Stereo to retrieve the normal map and the unshaded image from the input photographs. Texel clustering then guesses the underlying distribution of texture points. With the knowledge of these texels, we can alter the illumination and achieve real-time rendering through the construction of shading maps.

Besides, we can also replace the texture on other surfaces with these texels. Although we make some assumptions about the input photographs, and the recovered reflectance properties are not physically correct, the resulting defects are hardly noticeable to human eyes.

Our work provides a new point of view for image-based rendering, and the texel clustering method also offers another approach to texture generation. Instead of interpolating a complete BTF reconstruction as in previous work, we take a synthesis approach to analyze the reflectance behavior and are able to perform pseudo but fast rendering under varied illumination.

7.2 Future work

There are still many interesting problems to address to make this work more practical.

We are interested in using this approach to produce more vivid results and to transfer other effects from real objects. Although we can replace the texture of the source object on the target one, we would like to transfer more shading effects, such as bump mapping or displacement mapping, to make the target more realistic. We think this may require more information in constructing the feature vectors for matching, and the synthesis algorithm we use now will probably have to be modified so that it can propagate normal maps accurately. Besides, in preprocessing, we only fit a simple illumination model to our captured photographs. Specular highlights or self-shadowing cannot be handled very well, and they may produce incorrect results. We should use more complex models to analyze the source object so that we can obtain more accurate normal maps and unshaded images for better results.

As shown in the previous results, our method works well for stochastic textures. However, for regular textures, a better synthesis algorithm may be needed to maintain the structured pattern of the texture. On the other hand, like other texture synthesis methods, our work focuses mostly on 2D image stitching. Although the input and output of our system are 2D images, we could try to apply our work to real 3D surfaces.

The idea comes from mesh parameterization, for example geometry images. If we can project the object into texture space, we can apply our work to its whole body, much like texture coordinate generation. We could also use more traditional texture synthesis methods for better results.

Therefore, after re-projecting back to the 3D domain, we could rotate the object and change the viewpoint instead of viewing only its front.


References

[Ash01] Michael Ashikhmin. Synthesizing natural textures. In Symposium on Interactive 3D Graphics, pages 217–226, 2001.

[DvGNK99] Kristin J. Dana, Bram van Ginneken, Shree K. Nayar, and Jan J. Koenderink. Reflectance and texture of real-world surfaces. ACM Transactions on Graphics, 18(1):1–34, 1999.

[FH04] Hui Fang and John C. Hart. Textureshop: Texture synthesis as a photograph editing tool. In Proc. SIGGRAPH 2004, 2004.

[GCHS04] Dan B. Goldman, Brian Curless, Aaron Hertzmann, and Steve Seitz. Shape and spatially-varying brdfs from photometric stereo. Technical report, University of Washington, 2004.

[HE03] A. Haro and I. Essa. Exemplar based surface texture. In Proc. Vision, Modeling, and Visualization 2003, 2003.

[HJO+01] Aaron Hertzmann, Charles E. Jacobs, Nuria Oliver, Brian Curless, and David H. Salesin. Image analogies. In Proc. SIGGRAPH 2001, pages 327–340, 2001.

[HS03] Aaron Hertzmann and Steven M. Seitz. Shape and materials by example: A photometric stereo approach. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 533–540, 2003.

[Kaj86] James T. Kajiya. The rendering equation. In SIGGRAPH ’86: Proceedings of the 13th annual conference on Computer graphics and interactive techniques, pages 143–150, New York, NY, USA, 1986. ACM Press.

[KRFB06] Erum Arif Khan, Erik Reinhard, Roland Fleming, and Heinrich Buelthoff. Image-based material editing. In SIGGRAPH, 2006.


[KSS+04] J. Kautz, M. Sattler, R. Sarlette, R. Klein, and H.-P. Seidel. Decoupling brdfs from surface mesostructures. In Proc. Graphics Interface 2004, pages 177–184, 2004.

[LH05] Sylvain Lefebvre and Hugues Hoppe. Parallel controllable texture synthesis. ACM Trans. Graph., 24(3):777–786, 2005.

[LLH04] Yanxi Liu, Wen-Chieh Lin, and James H. Hays. Near regular texture analysis and manipulation. ACM Transactions on Graphics (SIGGRAPH 2004), 23(3):368–376, August 2004.

[LYS01] Xinguo Liu, Yizhou Yu, and Heung-Yeung Shum. Synthesizing bidirectional texture functions for real-world surfaces. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, pages 97–106, New York, NY, USA, 2001. ACM Press.

[MA97] D. Mount and S. Arya. ANN: A library for approximate nearest neighbor searching, 1997.

[MMS+04] G. Müller, J. Meseth, M. Sattler, R. Sarlette, and R. Klein. Acquisition, synthesis and rendering of bidirectional texture functions. In Christophe Schlick and Werner Purgathofer, editors, Eurographics 2004, State of the Art Reports, pages 69–94. INRIA and Eurographics Association, September 2004.

[MPBM03] Wojciech Matusik, Hanspeter Pfister, Matt Brand, and Leonard McMillan. A data-driven reflectance model. ACM Trans. Graph., 22(3):759–769, 2003.

[PCF05] J. A. Paterson, D. Claus, and A. W. Fitzgibbon. BRDF and geometry capture from extended inhomogeneous samples using flash photography. Computer Graphics Forum (Special Eurographics Issue), 24(3):383–391, September 2005.

[Tur01] Greg Turk. Texture synthesis on surfaces. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, pages 347–354, New York, NY, USA, 2001. ACM Press.

[War92] Gregory J. Ward. Measuring and modeling anisotropic reflection. In SIGGRAPH, pages 265–272, 1992.

[WL01] Li-Yi Wei and Marc Levoy. Texture synthesis over arbitrary manifold surfaces. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, pages 355–360, New York, NY, USA, 2001. ACM Press.

[Woo80] R.J. Woodham. Photometric method for determining surface orientation from multiple images. Optical Engineering, 19(1):139–144, January 1980.

[YHBZ01] Lexing Ying, Aaron Hertzmann, Henning Biermann, and Denis Zorin. Texture and shape synthesis on surfaces. In Proceedings of the 12th Eurographics Workshop on Rendering Techniques, pages 301–312, London, UK, 2001. Springer-Verlag.
