1.1 Motivation

The modification of images in a way that is non-detectable for an observer who does not know the original image is a practice as old as artistic creation itself. Medieval artwork started to be restored as early as the Renaissance. This practice is called retouching or inpainting.

Traditionally, skilled artists performed image inpainting manually. With the spread of digital imaging, automatic inpainting techniques have been developed.

The applications of image inpainting include the restoration of photographs (Fig. 1-1), films and paintings; the removal of objects (Fig. 1-2 to 1-3) and of occlusions such as text, subtitles (Fig. 1-4), stamps and publicity from images; and image coding [5]. In image coding, the objective is to retain only the information that cannot be correctly reconstructed (the “minute but important details”) and to remove as much as possible of the remainder of the image. After the data has been transmitted, an inpainting method reconstructs the image. Fig. 1-5 shows an example in which data transmission is reduced by about 25%. In addition, inpainting can be used to produce special effects (Fig. 1-6) and to attack visible watermarking [17].

Because image inpainting has so many applications, how to inpaint an unknown region effectively and correctly has become an important research topic in recent years. We are therefore interested in the well-known inpainting methods. Their basic idea is to use undamaged neighboring information to inpaint damaged regions. In section 1.2, we introduce several well-known inpainting methods and point out their shortcomings. In section 1.3, we outline a new method that addresses these shortcomings and achieves better performance.

Fig. 1-1 Restoration of photographs

Fig. 1-2 Removal of trees

Fig. 1-3 Removal of a girl

Fig. 1-5 Applications of image coding

Fig. 1-6 Special effects

1.2 Related Works

There are several lines of research on image inpainting. Bertalmio et al. [1] introduced a technique for digital inpainting of still images that produces very impressive results. Based on partial differential equations (PDEs), the algorithm fills in the damaged areas by propagating information from outside the masked region along level lines (isophotes).

However, the algorithm usually requires several minutes on current personal computers to inpaint relatively small areas. Such a delay is unacceptable for interactive sessions and motivated Oliveira, Bowen, McKenna and Chang [2] to design a simpler and faster algorithm capable of producing similar results in just a few seconds. It uses a weighted average kernel that only considers contributions from the neighboring pixels. The fast inpainting algorithm is two to three orders of magnitude faster than methods based on partial differential equations, but it requires the regions to be inpainted to be locally small. If the damaged regions are not small enough, some possibly important information might be discarded and the results might be blurred.

To inpaint large damaged regions well, Efros and Leung [3] proposed a texture synthesis algorithm. In their algorithm, the lost region is filled in pixel by pixel with texture from its neighbors. The algorithm becomes considerably faster with the improvements in [10], [18], [19], [20].

Inspired by the work of Efros and Leung [3], Criminisi, Pérez and Toyama proposed an exemplar-based texture synthesis algorithm [7] for removing large objects from digital images and inpainting the resulting holes. Their approach employs an exemplar-based texture synthesis technique modulated by a unified scheme for determining the fill order of the target region. Pixels maintain a confidence value which, together with the image isophotes, influences their fill priority.

Comparing the techniques above, “Fast Digital Image Inpainting” [2] inpaints small damaged regions quickly, but it does not work well for large damaged regions. “Region Filling and Object Removal by Exemplar-Based Image Inpainting” [7], also called “Priority Texture Synthesis”, inpaints both small and large damaged regions well, but it takes too much time. Therefore, the goal of this thesis is to inpaint images that contain both small and large damaged regions correctly and quickly.

1.3 Thesis Organization

In this thesis, we propose an image inpainting method that combines “Fast Digital Image Inpainting” and “Priority Texture Synthesis”. The remainder of this thesis is organized as follows. In chapter 2, we survey the research on image inpainting and discuss the issues that need attention. We then describe how to evaluate the similarity between two textures and review the concept of morphological operations. In chapter 3, we present our method, which uses the morphological operator “opening” to split the damaged regions of an image into small and large parts according to a structuring element set by the user. After splitting, we modify the “Fast Digital Image Inpainting” algorithm and apply it to the small damaged parts. We then use “Priority Texture Synthesis” to inpaint the large damaged parts and add the FFT block matching algorithm to speed up the search for similar textures. In chapter 4, we experiment with different kinds of damaged images. The proposed method reduces the cost of computation, and the experimental results look “reasonable” to the human eye. We then compare the performance of our method with that of other methods. Chapter 5 states the conclusion and future work.

CHAPTER 2 Previous Research

In this chapter, section 2.1 reviews related research on image inpainting. Section 2.2 describes the block matching problem and introduces a faster algorithm for solving it. Section 2.3 introduces the concept of PSNR. Section 2.4 describes the basic morphological operators.

2.1 Review of Digital Image Inpainting

Image inpainting methods are widely used in fields such as wireless communication, the reversal of deterioration in photographs, image coding (e.g., recovering lost blocks) and special effects (e.g., removal of objects). The basic idea behind the methods proposed in the literature is to fill in the damaged regions with available information from their surroundings. In this section, we describe some existing image inpainting algorithms.

2.1.1 Image Inpainting Based on Partial Differential Equation (PDE)

Bertalmio et al. [1] pioneered a digital image inpainting algorithm based on partial differential equations (PDEs). Given a damaged image, it fills in the areas to be inpainted by propagating information from outside the masked region along level lines (isophotes).

Isophote directions are obtained by computing, at each pixel along the inpainting contour, a discretized gradient vector (which gives the direction of largest spatial change) and rotating the resulting vector by 90 degrees (Fig. 2-1 [1]). The intent is to propagate information while preserving edges. A 2-D Laplacian [21] is used to locally estimate the variation in color smoothness, and this variation is propagated along the isophote direction [1]. After every few steps of the inpainting process, the algorithm runs a few diffusion iterations to smooth the inpainted region.

Fig. 2-1 Isophote directions (figure labels: Ω, isophotes). Propagation direction as the normal to the signed distance to the boundary of the region to be inpainted.

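The algorithm can be written as:

$$I^{n+1}(i,j) = I^{n}(i,j) + \Delta t \, I_t^{n}(i,j), \quad \forall (i,j) \in \Omega \tag{2.1}$$

where the superindex $n$ denotes the inpainting “time”, $(i,j)$ are the pixel coordinates, $\Delta t$ is the rate of improvement and $I_t^{n}(i,j)$ stands for the update of the image $I^{n}(i,j)$. Note that the evolution equation runs only inside $\Omega$, the region to be inpainted.

With this equation, the image $I^{n+1}(i,j)$ is an improved version of $I^{n}(i,j)$, with the improvement given by $I_t^{n}(i,j)$. The update is designed as

$$I_t^{n}(i,j) = \overrightarrow{\delta L^{n}}(i,j) \cdot \vec{N}^{n}(i,j) \tag{2.2}$$

where $\overrightarrow{\delta L^{n}}(i,j)$ is a measure of the change in the information $L^{n}(i,j)$. With this equation, we estimate the information $L^{n}(i,j)$ of our image and compute its change along the $\vec{N}$ direction. Note that at steady state, that is, when equation (2.1) converges, $I^{n+1}(i,j) = I^{n}(i,j)$ and therefore

$$\overrightarrow{\delta L^{n}}(i,j) \cdot \vec{N}^{n}(i,j) = 0$$

meaning exactly that the information $L$ has been propagated in the direction $\vec{N}$. For more details, we now rewrite equation (2.2) as:

$$I_t^{n}(i,j) = \left( \overrightarrow{\delta L^{n}}(i,j) \cdot \frac{\vec{N}^{n}(i,j)}{|\vec{N}^{n}(i,j)|} \right) |\nabla I^{n}(i,j)|$$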

For the implementation, we use discrete schemes borrowed from the numerical analysis literature; see Sections 2.1.1.2, 2.1.1.3 and 2.1.1.4.

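2.1.1.2 How to calculate $\overrightarrow{\delta L^{n}}(i,j)$

Let $L^{n}(i,j)$ be an image smoothness estimator. For this purpose we may use a simple discrete implementation of the Laplacian: $L^{n}(i,j) = I_{xx}^{n}(i,j) + I_{yy}^{n}(i,j)$. Other smoothness estimators might be used, though satisfactory results were already obtained with this very simple selection. The change of $L^{n}$ is then

$$\overrightarrow{\delta L^{n}}(i,j) = \big( L^{n}(i+1,j) - L^{n}(i-1,j),\; L^{n}(i,j+1) - L^{n}(i,j-1) \big)$$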

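2.1.1.3 How to calculate $\vec{N}^{n}(i,j) / |\vec{N}^{n}(i,j)|$

$\vec{N}^{n}(i,j)$ means the isophote direction. We can compute $I_x^{n}$ and $I_y^{n}$ by using some gradient operators (Fig. 2-2):

$$\frac{\vec{N}^{n}(i,j)}{|\vec{N}^{n}(i,j)|} = \frac{\big( -I_y^{n}(i,j),\; I_x^{n}(i,j) \big)}{\sqrt{\big(I_x^{n}(i,j)\big)^2 + \big(I_y^{n}(i,j)\big)^2}}$$

Fig. 2-2 Some gradient operators. (a) Prewitt operator. (b) Sobel operator.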

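2.1.1.4 How to calculate $|\nabla I^{n}(i,j)|$

$$\beta^{n}(i,j) = \overrightarrow{\delta L^{n}}(i,j) \cdot \frac{\vec{N}^{n}(i,j)}{|\vec{N}^{n}(i,j)|}$$

$$|\nabla I^{n}(i,j)| = \begin{cases} \sqrt{(I_{xbm}^{n})^2 + (I_{xfM}^{n})^2 + (I_{ybm}^{n})^2 + (I_{yfM}^{n})^2}, & \beta^{n} > 0 \\ \sqrt{(I_{xbM}^{n})^2 + (I_{xfm}^{n})^2 + (I_{ybM}^{n})^2 + (I_{yfm}^{n})^2}, & \beta^{n} < 0 \end{cases}$$

We first compute $\beta^{n}$, the projection of $\overrightarrow{\delta L^{n}}$ onto the normalized vector $\vec{N}$, that is, we compute the change of $L$ along the direction of $\vec{N}$. Finally, we multiply $\beta^{n}$ by a slope-limited version of the norm of the gradient of the image, $|\nabla I^{n}|$.

A central differences realization would make the scheme unstable, and that is the reason for using slope limiters. The subindexes $b$ and $f$ denote backward and forward differences respectively, while the subindexes $m$ and $M$ denote the minimum or maximum, respectively, between the derivative and zero. See [27] for details.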
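The update described in Sections 2.1.1.2 to 2.1.1.4 can be sketched in NumPy as follows. This is only an illustrative reading of the scheme for a single grey-level channel, not the implementation of [1]: the function name `pde_inpaint_step`, the default `dt`, the small `eps` guard against division by zero, the edge-replicating border and the use of plain central differences for $I_x$, $I_y$ (instead of the operators of Fig. 2-2) are all assumptions made here. The periodic diffusion iterations are omitted.

```python
import numpy as np

def pde_inpaint_step(I, mask, dt=0.1, eps=1e-8):
    """One step I^{n+1} = I^n + dt * I_t^n, applied only where mask is True (Omega)."""
    I = I.astype(float)
    P = np.pad(I, 1, mode="edge")            # assumed border handling
    c = P[1:-1, 1:-1]
    up, down = P[:-2, 1:-1], P[2:, 1:-1]     # neighbours (i-1, j) and (i+1, j)
    left, right = P[1:-1, :-2], P[1:-1, 2:]  # neighbours (i, j-1) and (i, j+1)

    # Smoothness estimator L = I_xx + I_yy (5-point Laplacian).
    L = up + down + left + right - 4.0 * c
    LP = np.pad(L, 1, mode="edge")
    dLx = LP[2:, 1:-1] - LP[:-2, 1:-1]       # L(i+1, j) - L(i-1, j)
    dLy = LP[1:-1, 2:] - LP[1:-1, :-2]       # L(i, j+1) - L(i, j-1)

    # Normalised isophote direction N / |N| = (-I_y, I_x) / sqrt(I_x^2 + I_y^2).
    Ix, Iy = (down - up) / 2.0, (right - left) / 2.0
    norm = np.sqrt(Ix ** 2 + Iy ** 2) + eps
    beta = dLx * (-Iy / norm) + dLy * (Ix / norm)

    # Slope-limited gradient norm: b/f = backward/forward difference,
    # m/M = minimum/maximum of the difference and zero.
    xb, xf, yb, yf = c - up, down - c, c - left, right - c
    g_pos = np.sqrt(np.minimum(xb, 0) ** 2 + np.maximum(xf, 0) ** 2
                    + np.minimum(yb, 0) ** 2 + np.maximum(yf, 0) ** 2)
    g_neg = np.sqrt(np.maximum(xb, 0) ** 2 + np.minimum(xf, 0) ** 2
                    + np.maximum(yb, 0) ** 2 + np.minimum(yf, 0) ** 2)
    grad = np.where(beta > 0, g_pos, g_neg)

    out = I.copy()
    out[mask] = I[mask] + dt * (beta * grad)[mask]
    return out
```

Calling the function repeatedly plays the role of advancing the inpainting “time” $n$. Pixels outside the mask are never modified, and a constant image is a steady state because $\beta^{n}$ vanishes everywhere.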

2.1.2 Fast Digital Image Inpainting

A fast inpainting algorithm was proposed in [2], based on image diffusion kernels. Let Ω be a small area to be inpainted. The simplest version of the algorithm consists of initializing Ω by clearing its color information and repeatedly convolving the region to be inpainted with a diffusion kernel. It uses a weighted average kernel that only considers contributions from the neighboring pixels. Fig. 2-3 [2] shows the pseudocode of the algorithm and two diffusion kernels.

Initialize Ω;
for (iter = 0; iter < num_iteration; iter++)
    convolve masked regions with kernel;

a b a        c c c
b 0 b        c 0 c
a b a        c c c

Fig. 2-3 (top) Pseudocode for the fast inpainting algorithm. (bottom) Two diffusion kernels used with the algorithm, with a = 0.073235, b = 0.176765, c = 0.125.

The fast inpainting algorithm [2] is two to three orders of magnitude faster than methods based on partial differential equations, but it requires the regions to be inpainted to be locally small. If the damaged regions are not small enough, some possibly important information might be discarded and the results might be blurred.
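The pseudocode of Fig. 2-3 can be sketched in NumPy as below. This is a minimal sketch for one grey-level channel, assuming an edge-replicating image border and a fixed iteration count; the names `fast_inpaint`, `KERNEL_1` and `KERNEL_2` are chosen here for illustration and do not come from [2].

```python
import numpy as np

A, B, C = 0.073235, 0.176765, 0.125
KERNEL_1 = np.array([[A, B, A], [B, 0.0, B], [A, B, A]])
KERNEL_2 = np.array([[C, C, C], [C, 0.0, C], [C, C, C]])

def fast_inpaint(image, mask, kernel=KERNEL_1, num_iterations=100):
    """Repeatedly convolve the masked region (Omega) with a diffusion kernel."""
    out = image.astype(float).copy()
    out[mask] = 0.0                       # initialize Omega by clearing it
    h, w = out.shape
    for _ in range(num_iterations):
        padded = np.pad(out, 1, mode="edge")
        smoothed = np.zeros_like(out)
        # Both kernels are symmetric, so correlation equals convolution.
        for di in range(3):
            for dj in range(3):
                smoothed += kernel[di, dj] * padded[di:di + h, dj:dj + w]
        out[mask] = smoothed[mask]        # only pixels inside Omega change
    return out
```

Each kernel sums to one and has a zero center weight, so every masked pixel becomes a weighted average of its eight neighbors. This explains both the speed of the method and the blur it produces when Ω is large.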

We use the fast inpainting algorithm to inpaint the following pictures. Fig. 2-4 and 2-5 are the examples of inpainting locally small damaged regions. The results look good. Fig. 2-6 is the example of inpainting large damaged regions. The result is blurred.

Fig. 2-4 Removal of subtitles

Fig. 2-5 Lena: (left) Picture with locally small damaged regions. (right) Restored image obtained with the fast inpainting algorithm.

Fig. 2-6 Lena: (left) Picture with large damaged regions. (right) Restored image obtained with the fast inpainting algorithm.

2.1.3 Texture Synthesis

To inpaint large damaged regions well, Efros and Leung [3] proposed a texture synthesis algorithm. In this algorithm, the lost region is filled in pixel by pixel with texture from its neighbors. As illustrated in Fig. 2-7 [3], when filling the pixel $p(i,j)$, the algorithm first defines a 3×3 template $I_t$ next to $p(i,j)$, and looks for a template $\hat{I}_t$ in the available neighboring blocks such that $d(I_t, \hat{I}_t)$ is minimized, where $d(I_t, \hat{I}_t)$ is defined as the normalized sum of squared differences (SSD) metric (more details can be found in section 2.2). Once the nearest template is found, we copy the pixel in the corresponding position (the candidate) to the pixel being filled in (the current pixel). Fig. 2-8 shows the algorithm’s results. The algorithm becomes considerably faster with the improvements in [10], [18], [19], [20].

Fig. 2-7 Texture synthesis procedure (figure labels: best match, 8-neighborhood of lost block, candidate, lost block, template, current pixel).

Fig. 2-8 Lena: (left) Picture with large damaged regions. (right) Restored image obtained with the texture synthesis algorithm.

2.1.4 Priority Texture Synthesis

Inspired by the work of Efros and Leung, Criminisi, Pérez and Toyama proposed an exemplar-based texture synthesis algorithm [7] for removing large objects from digital images. Their approach employs an exemplar-based texture synthesis technique modulated by a unified scheme for determining the fill order of the target region. Pixels maintain a confidence value which, together with the image isophotes, influences their fill priority. Fig. 2-9 [6] shows the pseudocode of the algorithm.

The algorithm

1. Manually select region Ω for removal (Fig. 2-10 a)
2. Repeat until region Ω is empty
   1. Compute boundary δΩ and priorities P(p)
   2. Propagate texture and structure information
      1. Find patch Ψp on δΩ with highest priority P(p) (Fig. 2-10 b)
      2. Find patch Ψq in Φ which minimizes SSD(Ψp, Ψq) (Fig. 2-10 c)
      3. Copy Ψq to Ψp (Fig. 2-10 d)
   3. Update confidence values (more details can be found in section 2.1.4.4)

Fig. 2-9 The pseudocode of the exemplar-based texture synthesis algorithm.

Fig. 2-10 Structure propagation by exemplar-based texture synthesis. (a) Original image, with the target region Ω, its contour δΩ, and the source region Φ clearly marked. (b) We want to synthesize the area delimited by the patch Ψp centered on the point p ∈ δΩ. (c) The most likely candidate matches for Ψp lie along the boundary between the two textures in the source region, e.g., Ψq′ and Ψq″. (d) The best-matching patch in the candidate set has been copied into the position occupied by Ψp, thus achieving partial filling of Ω. Notice that both texture and structure (the separating line) have been propagated inside the target region. The target region Ω has now shrunk and its front δΩ has assumed a different shape.

2.1.4.1 How to calculate P(p)?

Fig. 2-11 Notation diagram.

Fig. 2-11 shows the notation diagram [7]. Given a patch $\Psi_p$ centered at the point $p$ for some $p \in \delta\Omega$, $n_p$ is the normal to the contour $\delta\Omega$ of the target region $\Omega$ and $\nabla I_p^{\perp}$ is the isophote (direction and intensity) at point $p$. The entire image is denoted with $I$. We define its priority $P(p)$ as the product of two terms, $P(p) = C(p)\,D(p)$. We call $C(p)$ the confidence term and $D(p)$ the data term, and they are defined as follows:

$$C(p) = \frac{\sum_{q \in \Psi_p \cap (I - \Omega)} C(q)}{|\Psi_p|}, \qquad D(p) = \frac{|\nabla I_p^{\perp} \cdot n_p|}{\alpha}$$

where $|\Psi_p|$ is the area of $\Psi_p$, $\alpha$ is a normalization factor (e.g., $\alpha = 255$ for a typical grey-level image), $n_p$ (more details can be found in section 2.1.4.2) is a unit vector orthogonal to the front $\delta\Omega$ in the point $p$, and $\perp$ (more details can be found in section 2.1.4.3) denotes the orthogonal operator. The priority $P(p)$ is computed for every border patch, with distinct patches for each pixel on the boundary of the target region. During initialization, the function $C(p)$ is set to $C(p) = 0 \;\forall p \in \Omega$ and $C(p) = 1 \;\forall p \in I - \Omega$.
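As an illustration of the priority $P(p) = C(p)\,D(p)$, the following sketch evaluates both terms for a single front pixel. It assumes square patches of half-width `half`, a Boolean `mask` that is True inside Ω, and an isophote and normal supplied by the caller; the helper names are hypothetical and the patch is simply clipped at the image border, which is a simplification made here.

```python
import numpy as np

def patch_slices(p, half, shape):
    """Row and column slices of the patch centred at p, clipped to the image."""
    i, j = p
    return (slice(max(i - half, 0), min(i + half + 1, shape[0])),
            slice(max(j - half, 0), min(j + half + 1, shape[1])))

def confidence_term(C, mask, p, half=4):
    """C(p): sum of C(q) over the known pixels of the patch, divided by its area."""
    sl = patch_slices(p, half, C.shape)
    known = ~mask[sl]                     # pixels of the patch in I - Omega
    return C[sl][known].sum() / C[sl].size

def data_term(isophote, normal, alpha=255.0):
    """D(p) = |isophote . n_p| / alpha."""
    return abs(float(np.dot(isophote, normal))) / alpha

def priority(C, mask, p, isophote, normal, half=4, alpha=255.0):
    return confidence_term(C, mask, p, half) * data_term(isophote, normal, alpha)
```

For example, with a 3×3 patch of which one column is known and an isophote of strength 255 aligned with the normal, the priority is 1/3: the data term is 1 and only three of the nine patch pixels contribute confidence.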

2.1.4.2 How to calculate $n_p$?

Given a point $p \in \delta\Omega$, the normal direction $n_p$ is computed as follows: 1) the positions of the “control” points of $\delta\Omega$ are filtered via a bidimensional Gaussian kernel (Fig. 2-12), and 2) $n_p$ is estimated as the unit vector orthogonal to the line through the preceding and the successive points in the list. Alternative implementations may make use of curve model fitting [23].

Fig. 2-12 A bidimensional Gaussian kernel filter

2.1.4.3 How to calculate $\nabla I_p^{\perp}$?

$\nabla I_p^{\perp}$ denotes the orthogonal gradient vector. The gradient $\nabla I_p$ is computed as the maximum value of the image gradient in $\Psi_p \cap I$. Robust filtering techniques may also be employed here, such as using some gradient operators (Fig. 2-13). After computing $\nabla I_p = (G_x, G_y)$, $(-G_y, G_x)$ is the orthogonal gradient vector $\nabla I_p^{\perp}$.

Fig. 2-13 Some gradient operators. (a) Prewitt operator. (b) Sobel operator.
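One way to obtain the orthogonal gradient vector is sketched below with the Sobel operator. The code is an assumption-laden illustration: the helper names are invented here, $G_x$ is taken as the derivative along the column index, the border is edge-replicated, and gradients next to Ω are not corrected for the unknown pixels they touch.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def correlate3x3(img, k):
    """Apply a 3x3 operator with an edge-replicated border."""
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            out += k[di, dj] * padded[di:di + h, dj:dj + w]
    return out

def isophote(img, mask, p, half=4):
    """Return (-Gy, Gx) at the strongest gradient among known pixels of the patch at p."""
    gx, gy = correlate3x3(img, SOBEL_X), correlate3x3(img, SOBEL_Y)
    i, j = p
    rows = slice(max(i - half, 0), min(i + half + 1, img.shape[0]))
    cols = slice(max(j - half, 0), min(j + half + 1, img.shape[1]))
    magnitude = np.hypot(gx[rows, cols], gy[rows, cols])
    magnitude[mask[rows, cols]] = -1.0    # ignore pixels inside Omega
    k = np.unravel_index(np.argmax(magnitude), magnitude.shape)
    Gx, Gy = gx[rows, cols][k], gy[rows, cols][k]
    return np.array([-Gy, Gx])            # gradient rotated by 90 degrees
```

The returned vector can be passed as the isophote in the data term $D(p)$ of section 2.1.4.1.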

2.1.4.4 How to update the confidence value?

After the patch $\Psi_{\hat{p}}$ has been filled with new pixel values, the confidence $C(p)$ is updated in the area delimited by $\Psi_{\hat{p}}$ as follows:

$$C(p) = C(\hat{p}) \quad \forall p \in \Psi_{\hat{p}} \cap \Omega$$

This simple update rule allows us to measure the relative confidence of patches on the fill front without image-specific parameters. As filling proceeds, confidence values decay, indicating that we are less sure of the color values of pixels near the center of the target region.
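The update rule can be sketched as follows, assuming the same conventions as a typical implementation: `C` is the confidence map, `mask` is True inside Ω, and `c_p_hat` is the confidence term already computed for the chosen front pixel. The function name and the in-place update are choices made for this illustration.

```python
import numpy as np

def update_after_fill(C, mask, p_hat, c_p_hat, half=4):
    """Set C(p) = C(p_hat) for the newly filled pixels and remove them from Omega."""
    i, j = p_hat
    rows = slice(max(i - half, 0), min(i + half + 1, C.shape[0]))
    cols = slice(max(j - half, 0), min(j + half + 1, C.shape[1]))
    newly_filled = mask[rows, cols].copy()   # pixels of the patch that were in Omega
    C[rows, cols][newly_filled] = c_p_hat    # writes through the view into C
    mask[rows, cols] = False                 # the patch is now known
```

Because `c_p_hat` is an average that includes zeros from the unknown part of the patch, it is at most 1 and typically shrinks as the fill front moves inward, which is the decay described above.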

2.2 Block Matching

The block matching problem occurs in various fields of application, such as image processing, multimedia and computer vision. The basic idea for solving it is to minimize some measure of dissimilarity between a template block of pixels in the current image and all candidate blocks in the reference image within a given search range. In this section, we describe an algorithm [10] for solving the block matching problem that is independent of image content and is faster than other full-search methods.

2.2.1 SAD and SSD

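We first introduce the two most popular similarity measures. Writing $I_t$ for the current image, $I_{t-1}$ for the reference image and $(x, y)$ for the position of the template block, they are defined as:

$$\mathrm{SAD}(u,v) = \sum_{i=0}^{B-1} \sum_{j=0}^{B-1} \left| I_t(x+i, y+j) - I_{t-1}(x+u+i, y+v+j) \right| \tag{2.12}$$

$$\mathrm{SSD}(u,v) = \sum_{i=0}^{B-1} \sum_{j=0}^{B-1} \left[ I_t(x+i, y+j) - I_{t-1}(x+u+i, y+v+j) \right]^2 \tag{2.13}$$

SAD means the sum of absolute differences and SSD means the sum of squared differences. $B$ is the block size and $(u, v)$ is the displacement vector of a candidate block relative to the template block. Let us compare SAD (2.12) and SSD (2.13). Because it needs no multiplications, the SAD metric is far more convenient for hardware designs and is therefore used almost exclusively there. However, minimizing the SSD metric corresponds to maximizing the PSNR (more details can be found in section 2.3). If a maximum PSNR is desired, SSD should be the metric of choice. Therefore, we choose the SSD metric to measure the similarity between two textures.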
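For reference, a direct full search with either metric can be sketched as below. The function names are illustrative, images are indexed as `image[row, column]`, and candidate blocks that would fall outside the reference image are simply skipped, which is an assumption of this sketch.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return np.abs(a.astype(float) - b.astype(float)).sum()

def ssd(a, b):
    """Sum of squared differences between two equally sized blocks."""
    return ((a.astype(float) - b.astype(float)) ** 2).sum()

def full_search(current, reference, x, y, B, P, metric=ssd):
    """Return the displacement (u, v) within +/-P that minimises the metric."""
    template = current[x:x + B, y:y + B]
    best_cost, best_uv = np.inf, (0, 0)
    for u in range(-P, P + 1):
        for v in range(-P, P + 1):
            i, j = x + u, y + v
            if i < 0 or j < 0 or i + B > reference.shape[0] or j + B > reference.shape[1]:
                continue                  # candidate lies outside the image
            cost = metric(template, reference[i:i + B, j:j + B])
            if cost < best_cost:
                best_cost, best_uv = cost, (u, v)
    return best_uv, best_cost
```

The search costs on the order of $(2P+1)^2 B^2$ operations per template block, which is the cost the FFT-based algorithm of section 2.2.2 is designed to reduce.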

2.2.2 The FFT Block Matching Algorithm

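Kilthau, Drew and Möller [10] proposed an FFT block matching algorithm that is faster than other full-search methods. In order to maximize the PSNR, the algorithm minimizes the SSD metric given in Eq. 2.13. With $I_t$ the current image, $I_{t-1}$ the reference image and $(x, y)$ the position of the template block, a trivial expansion gives the per-block computation:

$$\min_{(u,v)} \sum_{i=0}^{B-1} \sum_{j=0}^{B-1} \left[ I_t^2(x+i, y+j) - 2\, I_t(x+i, y+j)\, I_{t-1}(x+u+i, y+v+j) + I_{t-1}^2(x+u+i, y+v+j) \right] \tag{2.14}$$

Since the term $I_t^2(x+i, y+j)$ does not depend on $(u, v)$, it is the same for every candidate in the minimum and can be removed from the sum without affecting the resulting solution. Removing this term and separating the sum leaves us with the following equation:

$$\min_{(u,v)} \left[ \sum_{i=0}^{B-1} \sum_{j=0}^{B-1} I_{t-1}^2(x+u+i, y+v+j) \; - \; 2 \sum_{i=0}^{B-1} \sum_{j=0}^{B-1} I_t(x+i, y+j)\, I_{t-1}(x+u+i, y+v+j) \right] \tag{2.15}$$

Finally, the algorithm employs a novel data structure called the Windowed-Sum-Squared-Table (WSST) for the first term and uses the fast Fourier transform (FFT) for the second term in its computation of the SSD metric. Fig. 2-14 shows the three basic steps of the FFT block matching algorithm.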

The Algorithm

1. Resize input image to include a zero pad (more details can be found in section 2.2.2.1)
2. Compute the windowed sum squared table (more details can be found in section 2.2.2.2)
3. Compute a per-block convolution sum (more details can be found in section 2.2.2.3)

Fig. 2-14 FFT Block Matching algorithm.

2.2.2.1 Resize input image to include a zero pad

Given a search range of ±P, we apply a zero pad of P pixels around the entire image. This simple preprocessing step eliminates the need for conditionals within the innermost loops of the algorithm and greatly increases its speed. In other words, it allows convenient calculation of the SSD metric without conditionals for those search locations that lie outside the dimensions of the original image.

2.2.2.2 Windowed Sum Squared Table

We use a variant of the well-known summed area table (SAT), introduced in [24]. Given an input image $f$, a summed area table is a new image $f_{SAT}$ such that

$$f_{SAT}(i,j) = \sum_{k \le i} \sum_{l \le j} f(k,l) \tag{2.16}$$

Summed area tables can be computed very easily by applying the following recurrence, being careful to set $f_{SAT}(i,j)$ to zero when either of the indices is negative:

$$f_{SAT}(i,j) = f(i,j) + f_{SAT}(i-1,j) + f_{SAT}(i,j-1) - f_{SAT}(i-1,j-1) \tag{2.17}$$

Fig. 2-15 is a simple example of computing the SAT.

Fig. 2-15 A simple example for computing SAT.

The WSST differs from the SAT in that each pixel needs to represent a sum of squares, where the sum extends only over the last B × B sub-image (window), with B the block size:

$$f_{WSST}(i,j) = \sum_{k=i-B+1}^{i} \sum_{l=j-B+1}^{j} f^2(k,l)$$
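The zero pad, the SAT and the WSST can be sketched in NumPy as follows. The function names are hypothetical, and the WSST is obtained here by differencing a SAT of the squared image, which is one straightforward way to realize the windowed sum and is not necessarily the recurrence used in [10].

```python
import numpy as np

def zero_pad(image, P):
    """Surround the image with P zero pixels on every side (search range +/-P)."""
    return np.pad(image, P)

def summed_area_table(f):
    """SAT[i, j] = sum of f[k, l] over all k <= i and l <= j."""
    return f.astype(float).cumsum(axis=0).cumsum(axis=1)

def windowed_sum_squared_table(f, B):
    """WSST[i, j] = sum of f^2 over the B x B window whose last pixel is (i, j)."""
    sat = summed_area_table(f.astype(float) ** 2)
    sat = np.pad(sat, ((B, 0), (B, 0)))   # zeros stand in for negative indices
    return sat[B:, B:] - sat[:-B, B:] - sat[B:, :-B] + sat[:-B, :-B]
```

For the 4×4 image holding the values 0 to 15 in row order, the SAT entry at (1, 1) is 0 + 1 + 4 + 5 = 10, and the WSST entry at (1, 1) with B = 2 is 0 + 1 + 16 + 25 = 42. Each WSST entry costs four table lookups regardless of B.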

2.2.2.3 Per-Block Convolution Sum

We show that computation of the second term in Eq. 2.15 amounts to the evaluation of a correlation sum for each template block, which we evaluate as a convolution sum. With $I_t$ the current image, $I_{t-1}$ the reference image and $(x, y)$ the position of the template block, the correlation sum is

$$\sum_{i=0}^{B-1} \sum_{j=0}^{B-1} I_t(x+i, y+j)\, I_{t-1}(x+u+i, y+v+j)$$

In order to compute the correlation efficiently, we convert it to a convolution (Eq. 2.20) and then use the fast Fourier transform. For each template block, we create two images of size (2P+B)×(2P+B). The first image, template, corresponds to the template block and is computed by simply multiplying the block by 2, reversing the pixels, and zero padding to the correct size. The pixel reversal effectively changes the correlation sum into an equivalent convolution sum. The second image, candidates, corresponds to the square containing all pixels of all candidate blocks in the search range. This square can be copied directly from the reference image. Given these two images, we can compute a new image, result, according to the following formula:

$$\mathit{result} = \mathcal{F}^{-1}\big( \mathcal{F}(\mathit{template}) \cdot \mathcal{F}(\mathit{candidates}) \big) \tag{2.21}$$

where $\mathcal{F}$ denotes the Fourier transform and the product is taken element by element.
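The three steps can be put together in a short NumPy sketch that returns the SSD for every displacement in the search range. It is an illustration under assumptions made here rather than the implementation of [10]: the function name is invented, images are indexed as `image[row, column]`, the windowed sums are taken over the candidates square only, and the constant template energy is added back so that the output equals the true SSD.

```python
import numpy as np

def ssd_surface_fft(current, reference, x, y, B, P):
    """SSD for all (u, v) in [-P, P]^2; entry [u + P, v + P] belongs to (u, v)."""
    S = 2 * P + B
    padded = np.pad(reference.astype(float), P)      # step 1: zero pad of P pixels
    candidates = padded[x:x + S, y:y + S]            # all candidate blocks
    block = current[x:x + B, y:y + B].astype(float)

    # Template image: block times 2, pixels reversed, zero padded to S x S.
    template = np.zeros((S, S))
    template[:B, :B] = 2.0 * block[::-1, ::-1]

    # Step 3: convolution through the FFT; the valid part holds 2 * correlation.
    result = np.fft.ifft2(np.fft.fft2(template) * np.fft.fft2(candidates)).real
    cross = result[B - 1:, B - 1:]

    # Step 2: windowed sums of squares of the candidates (first term of the SSD).
    sat = np.pad((candidates ** 2).cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    window = sat[B:, B:] - sat[:-B, B:] - sat[B:, :-B] + sat[:-B, :-B]

    # Adding the template energy turns the minimised quantity into the SSD itself.
    return window - cross + (block ** 2).sum()
```

The best displacement is the position of the minimum of the returned array minus P in each coordinate. Because the template occupies only the first B rows and columns, the entries of `result` from index B−1 onward are free of circular wrap-around, so no extra padding is needed.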

2.3 PSNR

Signal-to-noise ratio (SNR) measures estimate the quality of a reconstructed image compared with its original image. The basic idea is to compute a single number that reflects the quality of a reconstructed image; reconstructed images with higher values are judged better. The metric we compute is the peak signal-to-reconstructed image measure, called PSNR. Assume we are given a source image $f(i,j)$ that contains $N \times N$ pixels and a reconstructed image $F(i,j)$, obtained by inpainting a damaged version of $f(i,j)$. The pixel values range between black (0) and white (255). We compute the mean squared error (MSE) of the reconstructed image as follows:

$$\mathrm{MSE} = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left[ f(i,j) - F(i,j) \right]^2$$

PSNR in decibels (dB) is computed as follows:

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}$$
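A minimal sketch of the two measures for 8-bit grey-level images is given below. The function names and the choice of returning infinity for identical images are assumptions of this example.

```python
import numpy as np

def mse(f, F):
    """Mean squared error between the source image f and the reconstruction F."""
    diff = f.astype(float) - F.astype(float)
    return float(np.mean(diff ** 2))

def psnr(f, F, peak=255.0):
    """PSNR in dB; infinite when the two images are identical."""
    m = mse(f, F)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)
```

Since the PSNR decreases monotonically as the MSE grows, the candidate block with the smallest SSD is also the one with the highest PSNR, which is why section 2.2.1 prefers SSD over SAD.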

2.4 Mathematical Morphology

Mathematical morphology is a tool for extracting image components that are useful in the representation and description of region shapes, such as boundaries, skeletons, and convex hulls. Mathematical morphology is a set-theoretic method. Sets in mathematical morphology represent the shapes of objects in an image. The operations of mathematical morphology were originally defined as set operations and shown to be useful for image processing.

In general, the morphological approach is based upon binary images. In binary images, each pixel can be viewed as an element in $Z^2$. Gray-scale digital images can be represented

