
Exploiting Self-Similarities for Single Frame Super-Resolution

Chih-Yuan Yang, Jia-Bin Huang, Ming-Hsuan Yang
Electrical Engineering and Computer Science
University of California at Merced, Merced, CA 95343, USA

Abstract. We propose a super-resolution method that exploits self-similarities and group structural information of image patches using only one single input frame. The super-resolution problem is posed as learning the mapping between pairs of low-resolution and high-resolution image patches. Instead of relying on an extrinsic set of training images, as often required in example-based super-resolution algorithms, we employ a method that generates image pairs directly from the image pyramid of one single frame. The generated patch pairs are clustered for training a dictionary by enforcing group sparsity constraints underlying the image patches. Super-resolution images are then constructed using the learned dictionary. Experimental results show the proposed method is able to achieve state-of-the-art performance.

1 Introduction

Super-resolution algorithms aim to construct a high-resolution image from one or multiple low-resolution input frames [1]. They address an important problem with numerous applications. However, this problem is ill-posed because the ground truth is never known, and numerous algorithms have been proposed with different assumptions of prior knowledge so that extra information can be exploited for generating high-resolution images from low-resolution ones. Existing super-resolution algorithms can be broadly categorized into three classes: reconstruction-based, interpolation-based, and example-based approaches.

Interpolation-based super-resolution methods assume that images are spatially smooth and can be adequately approximated by polynomials such as bilinear, bicubic, or level-set functions [2, 1, 3]. This assumption is usually inaccurate for natural images, so over-smoothed edges and visual artifacts often appear in the reconstructed high-resolution images. To alleviate this, some methods exploit priors on edge statistics; these statistics can be learned from a generic dataset or tailored to a particular type of scene. With the learned prior edge statistics, sharp-edged images can be reconstructed well at the expense of losing some fine textural details.

For reconstruction-based algorithms, super-resolution is cast as an inverse problem of recovering the original high-resolution image by fusing multiple low-resolution images, based on certain assumed prior knowledge of an observation model that maps the high-resolution image to the low-resolution images [4, 5]. Each low-resolution image imposes a set of linear constraints on the unknown high-resolution pixel values. When a sufficient number of low-resolution images are available, the inverse problem becomes over-determined and can be solved to recover the high-resolution image. However, it has been shown that reconstruction-based approaches are numerically limited to a scaling factor of two [5].

For example-based methods, the mapping between low-resolution and high-resolution image patches is learned from a representative set of image pairs, and the learned mapping is then applied for super-resolution. The underlying assumption is that the missing high-resolution details can be learned and inferred from the low-resolution image and a representative training set. Numerous methods have been proposed for learning the mapping between low-resolution and high-resolution image pairs [6–8, 3, 9–11] with demonstrated promising results.

The success of example-based super-resolution methods hinges on two major factors: collecting a large and representative database of low-resolution and high-resolution image pairs, and learning their mapping. Example-based super-resolution methods often entail a large dataset to encompass as much image variation as possible [6–8, 3, 9–11], with an ensuing computational load in the learning process. Moreover, the mapping learned from a general database may not be able to recover the true missing high-frequency details if the input frame contains textures that do not appear in the database. For example, the mapping function learned from low-resolution/high-resolution image pairs containing man-made objects (e.g., buildings or cars) is expected to perform poorly on natural scenes. Furthermore, the rich structural information contained in an image is not exploited. In light of this, Glasner et al. [12] propose a method that exploits patch redundancy among in-scale and cross-scale images in an image pyramid to enforce constraints for reconstructing the unknown high-resolution image.

In [10], Yang et al. present a super-resolution algorithm that employs sparse dictionary learning on high-resolution and low-resolution images. In this algorithm, the low-resolution images are considered downsampled versions of high-resolution ones with the same sparse codes. Using a representative set of image patches, a dictionary (or set of bases) is learned for sparse coding using both high-resolution and low-resolution images. Their approach performs well under the assumption that image patches of the input image are similar to those in the training data, e.g., similar types of images. Existing dictionary learning algorithms often operate on individual data samples without taking their self-similarity into account in searching for the sparsest solutions [13]. Observing this, Mairal et al. [14] recently propose an algorithm that exploits the intuition that similar patches in an image should admit similar sparse representations over the dictionary. By enforcing group sparsity, their experimental results on image denoising and demosaicing demonstrate improvements over existing methods.

We propose a super-resolution method that exploits self-similarities and group structural constraints of image patches using only one single input frame. In contrast to [10], our algorithm exploits patch self-similarity within the image and introduces group sparsity for better regularization in the reconstruction process. Compared with [14], we exploit not only the patch similarity within a scale but also across scales. In addition, we are the first to show that structural sparsity can be successfully applied to image super-resolution (which is not a trivial extension). Different from [12], we enforce constraints when constructing high-resolution image patches within an image pyramid, and we exploit group sparsity to generate better super-resolution images. Experimental results show the proposed method achieves state-of-the-art performance for image super-resolution using one single frame.

2 Proposed Algorithm

We present the proposed algorithm in this section. Our approach exploits both patch similarity across scales and the group structural constraints underlying natural images. In contrast to existing super-resolution algorithms that resort to a large dataset of disparate images, we show that training patches generated directly from the input image itself facilitate finding more similar patches.

Our algorithm consists of two main steps in which we exploit self-similarities among image patches. We first generate high-resolution/low-resolution patch pairs from the single input frame by exploiting self-similarities: we create an image pyramid and build patch pairs between corresponding high-resolution/low-resolution images. As shown in [12], the use of an image pyramid provides an effective method to generate a sufficient number of high-resolution patches from low-resolution ones.

After creating high-resolution/low-resolution patch pairs, we enforce group sparsity constraints among similar patch pairs. Group sparsity constraints have been shown to be effective for image denoising and demosaicing [14]. In contrast to [14], we exploit not only the patch similarity within an image scale but also across image scales. In addition, we show that structural sparsity can be successfully applied to image super-resolution. We present the details of our algorithm in the following sections.

2.1 Exploiting Self-Similarities to Generate Example Pairs

In the first step, we generate a set of high-resolution/low-resolution patch pairs from one single input image. These generated patch pairs are used to construct the output high-resolution image in the second step. Conventionally, the image pairs for example-based algorithms are extracted from an extrinsic large dataset that encompasses a wide range of scenes, or from a category-specific one (e.g., [6, 10]). Alternatively, such image pairs can be extracted intrinsically from one single frame (e.g., [12]). The advantage of using an extrinsic dataset is the availability of plentiful patch pairs, which may facilitate finding matches between high-resolution and low-resolution image patches. The drawback, however, is the large image variation inherent among image pairs from diverse sources. Consequently, these algorithms may find similar low-resolution patches in the dataset, but the paired high-resolution patches are not necessarily suitable for constructing high-quality super-resolution images.

To avoid this problem, we generate patch pairs naturally bearing strong similarities directly from the input low-resolution image itself. Motivated by the observations of [12], we build image patch pairs from an image pyramid to provide highly similar patch pairs.


Assume the relationship between a high-resolution image $I_h$ and a low-resolution image $I_l$ is

$$I_l = (I_h * B) \downarrow_s, \qquad (1)$$

where $*$ is the convolution operator, $B$ is an isotropic Gaussian kernel, and $\downarrow_s$ is a subsampling operator with scaling factor $s$. From an input image $I_0$ shown in Fig. 1, we first generate low-resolution images $I_k$ ($k = -1, \ldots, -n$). By carefully controlling the scaling factors and the variance parameters of the Gaussian kernels, it is possible to create high-resolution patches by exploiting self-similarity among the input image and the generated low-resolution images. Fig. 1 illustrates the concept, and Proposition 1 states the relationship between scaling factors and the corresponding Gaussian variance parameters.
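To make Equation (1) concrete, the following sketch builds the low-resolution pyramid with SciPy. It is a minimal illustration rather than the released implementation: the helper names (downsample, build_pyramid), the use of gaussian_filter for $B$, and cubic zoom as the subsampling operator are assumptions made here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def downsample(I0, s, sigma2):
    """Apply Eq. (1): blur with an isotropic Gaussian of variance sigma2,
    then subsample by scaling factor s (s > 1 shrinks the image)."""
    blurred = gaussian_filter(I0.astype(np.float64), sigma=np.sqrt(sigma2))
    return zoom(blurred, 1.0 / s, order=3)  # cubic resampling as a stand-in

def build_pyramid(I0, n=6, step=1.25, sigma2_n=0.8):
    """Generate I_{-1}, ..., I_{-n}. Following Proposition 1, the variance
    at level k is sigma2_n * log(s_{-k}) / log(s_{-n}), which reduces to
    k / n under the exponential scale setting used here."""
    layers = []
    for k in range(1, n + 1):
        s_k = step ** k
        sigma2_k = sigma2_n * np.log(s_k) / np.log(step ** n)  # = k / n
        layers.append(downsample(I0, s_k, sigma2_k))
    return layers
```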

Fig. 1. Exploiting cross-scale patch redundancy in an image pyramid: $I_0$ is the input image. $I_{-1}$ and $I_{-2}$ are downsampled layers of $I_0$. The pixels of $I'_1$ and $I'_2$ are copied and enlarged from image patches of $I_0$. For a source patch $P_s$ in $I_0$, several similar patches ($P_1$ and $P_2$) can be found in the lower-resolution images ($I_{-1}$ or $I_{-2}$). For each found patch ($P_1$ or $P_2$), a corresponding region ($R_1$ or $R_2$) in $I_0$ is determined. Similarly, a corresponding region ($D_1$ or $D_2$) is determined by two factors: (1) the region of the source patch $P_s$, and (2) the layer index of the found patch ($-1$ for $I_{-1}$ or $-2$ for $I_{-2}$). Finally, the intensity values of $R_1$ are copied to $D_1$ over an enlarged area, and likewise $R_2$ to $D_2$.

Proposition 1. For any two downsampled images $I_p = (I_0 * B_p) \downarrow_{s_p}$ and $I_q = (I_0 * B_q) \downarrow_{s_q}$ of the image pyramid, the variances of their Gaussian kernels are related by $\sigma_p^2 = \sigma_q^2 \cdot \log(s_p)/\log(s_q)$.

The proof of this proposition is presented in the Appendix. We assume the input image $I_0$ is a downsampled result of an unknown high-resolution image $I_k$ ($k \geq 1$), so that we can exploit patch similarity across scales to fill regions in $I_k$. We set $s_k = s^{k/n}$ ($k = -1, \ldots, -n$), where $s$ is the expected scaling factor of the final output image and $n$ is the number of low-resolution images. This exponential setting is critical because our goal is to create high-resolution/low-resolution patch pairs for the second step: only with this setting, by Proposition 1, are the Gaussian kernel variances between $I_k$ and $I_{k-n}$ the same as those between $I_n$ and $I_0$.

For a source patch $P_s$ in the input image $I_0$, we use the approximate nearest neighbor algorithm [15] to find the most similar patches in the low-resolution images. Assume two patches are found, i.e., $P_1$ and $P_2$ in Fig. 1; their corresponding regions ($R_1$ and $R_2$) in $I_0$ are larger than $P_1$ and $P_2$. Similarly, any image patch $P_s$ of $I_0$ can be assumed to be generated from the high-resolution images by Equation (1), and the corresponding regions in the high-resolution images are $D_1$ and $D_2$. The relationship between $P_k$ and $R_k$ should be similar to that between $P_s$ and $D_k$, and thus we set $D_k$ to have the same intensities as $R_k$. However, $P_s$ is not exactly the same as $P_k$, and $R_k$ is not exactly the same as $D_k$. We therefore weight the overlapped high-resolution patches by their similarity, $\exp(-\|P_s - P_k\|_2^2/\sigma_w)$, where $\sigma_w$ controls the degree of similarity.
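The patch search and weighting step can be sketched as follows, with SciPy's cKDTree standing in for the ANN library of [15]; I_low (a pyramid layer), P_s (a 5x5 source patch), and sigma_w are assumed to be defined, and extract_patches is an illustrative helper.

```python
import numpy as np
from scipy.spatial import cKDTree

def extract_patches(img, size=5):
    """Collect all size x size patches as flat vectors together with
    their top-left coordinates."""
    h, w = img.shape
    coords, vecs = [], []
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            coords.append((y, x))
            vecs.append(img[y:y + size, x:x + size].ravel())
    return np.array(coords), np.array(vecs)

# Index the low-resolution patches, then query the source patch.
coords, vecs = extract_patches(I_low)
tree = cKDTree(vecs)
dists, idxs = tree.query(P_s.ravel(), k=2)

# Similarity weights for averaging the overlapped high-resolution patches.
weights = np.exp(-dists ** 2 / sigma_w)
```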

Denote the high-resolution images in Fig. 1 by $I'_1$ and $I'_2$; they contain many copied patches but may have some uncovered regions (i.e., some source patches in $I_0$ may not find similar patches in the image pyramid). We fill the uncovered areas with the back-projection algorithm [4] for improving image resolution. Because the blur kernels are known in our formulation, we generate high-resolution images by compensating the low-resolution images:

$$I_h = I_h^0 - (I_l^0 - I'_l) \uparrow_s, \qquad (2)$$

where $I_h^0$ is an initial high-resolution image, $I_l^0$ is the image generated from $I_h^0$ by Equation (1), and $I'_l$ is the image to which the $D_k$ are copied. The upsampling operator $\uparrow_s$ we use here is bicubic interpolation. If $I'_l$ has uncovered areas, we ignore these regions and set their pixel values to zero. We generate the initial $I_n^0$ by bicubic interpolation of $I_0$ and compensate it against $I_0$. We summarize this first step of generating high-resolution/low-resolution image pairs in Algorithm 1.
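A minimal sketch of the compensation in Equation (2) follows, again with gaussian_filter and cubic zoom standing in for the blur and bicubic operators; the function back_project and its cropping logic are illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def back_project(Ih0, Il, s, sigma2, loops=3):
    """Iteratively enforce Eq. (1): simulate the low-resolution image from
    the current estimate and add back the upsampled residual, as in
    Eq. (2), in the spirit of Irani and Peleg [4]."""
    Ih = Ih0.astype(np.float64)
    for _ in range(loops):
        Il_sim = zoom(gaussian_filter(Ih, np.sqrt(sigma2)), 1.0 / s, order=3)
        # crop to a common size in case resampling rounds differently
        h = min(Il.shape[0], Il_sim.shape[0])
        w = min(Il.shape[1], Il_sim.shape[1])
        residual = zoom(Il[:h, :w] - Il_sim[:h, :w], s, order=3)
        hh = min(Ih.shape[0], residual.shape[0])
        ww = min(Ih.shape[1], residual.shape[1])
        Ih[:hh, :ww] += residual[:hh, :ww]
    return Ih
```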

2.2 Exploiting Group Self-Similarities to Construct High-Resolution Images

The method presented in Section 2.1 can generate a high-resolution image $H$, but the resulting image may contain a significant amount of noise. In this section we propose a method to further refine it by exploiting group sparsity constraints among image patches. As the high-resolution image $H$, the low-resolution image $L$, and the Gaussian kernel width $\sigma$ are all known, we can generate several high-resolution images from $H$ by the downsampling process described in Equation (1).

From the first step, we have $n + 1$ pairs of images between $I_{k-n}$ and $I_k$ ($k = 0, \ldots, n$). We form image pairs such that every low-resolution patch in $I_{k-n}$ has a corresponding high-resolution patch in $I_k$ whose scaling factor is $s$. We use all the patch pairs to learn a dictionary under group sparsity, in order to capture the relationship among all the high-resolution and low-resolution patches, respectively.

To train this dictionary, we first extract features from low-resolution and high-resolution patches similar to [10]. The features we extract from a low-resolution patch are two first-order and two second-order image gradients along the horizontal and vertical axes, i.e., the filters $[1, 0, -1]$, $[1, 0, -1]^\top$, $[-1, 0, 2, 0, -1]$, and $[-1, 0, 2, 0, -1]^\top$. For each high-resolution patch, the feature vector is formed by a raster scan of pixel values after subtracting the mean value of that patch.
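These features can be computed as below; the function names are illustrative, and SciPy's correlate1d is assumed for the separable filtering.

```python
import numpy as np
from scipy.ndimage import correlate1d

def low_res_features(region):
    """First- and second-order gradients along both axes, matching the
    filters [1,0,-1], [1,0,-1]^T, [-1,0,2,0,-1], [-1,0,2,0,-1]^T."""
    region = region.astype(np.float64)
    f1 = np.array([1.0, 0.0, -1.0])
    f2 = np.array([-1.0, 0.0, 2.0, 0.0, -1.0])
    feats = [correlate1d(region, f1, axis=1),  # horizontal, 1st order
             correlate1d(region, f1, axis=0),  # vertical, 1st order
             correlate1d(region, f2, axis=1),  # horizontal, 2nd order
             correlate1d(region, f2, axis=0)]  # vertical, 2nd order
    return np.concatenate([f.ravel() for f in feats])

def high_res_features(patch):
    """Raster-scanned pixel values with the patch mean subtracted."""
    patch = patch.astype(np.float64)
    return patch.ravel() - patch.mean()
```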


Algorithm 1: Construct high-resolution images from a single input frame
Data: Input image $L$, zooming factor $z$, Gaussian kernel variance $\sigma_6^2$, number of similar patches $m$, similarity weight parameter $\sigma_w$, back-projection loop number $l_b$
Result: High-resolution images $I_1$ to $I_n$ ($n$ is determined by $z$)
Set $I_0 = L$ with resolution $(h_0, w_0)$;
for $k = 1, \ldots, 6$ do
    Set scaling factor $s_{-k} = (1/1.25)^k$;
    Compute the convolved image $C_{-k}$ by convolving $I_0$ with a Gaussian kernel whose variance is $\sigma_{-k}^2 = \sigma_6^2 \cdot \log(k)/\log(6)$;
    Set $h_{-k} = h_0 \cdot s_{-k}$ and $w_{-k} = w_0 \cdot s_{-k}$ (possibly non-integer);
    Compute the image $I_{-k}$ by subsampling $C_{-k}$ to the resolution $(h_{-k}, w_{-k})$;
end
for $k = 0, \ldots, 5$ do
    for each $5 \times 5$ patch $P_s$ in $I_{-k}$ do
        Compute the corresponding region $R_s$ in $C_{-(k+1)}$ (the boundary coordinates of $R_s$ are usually non-integer);
        Compute $Q_s$ by subsampling $R_s$ into a $4 \times 4$ patch;
        Save the patch pair $(Q_s, P_s)$ into the patch-pair database $B$;
    end
end
Compute the number of upsampled images $n = \lceil \log(z)/\log(1.25) \rceil$;
for $k = 1, \ldots, n$ do
    Compute image $I_k$'s resolution as $(h_0 \cdot 1.25^k, w_0 \cdot 1.25^k)$;
    for each $5 \times 5$ region in $I_k$ do
        Compute the corresponding region $R_q$ in $I_{k-1}$ (the boundary coordinates of $R_q$ are usually non-integer);
        Compute the query patch $Q_q$ by subsampling $R_q$ into a $4 \times 4$ patch;
        Query $Q_q$ in database $B$ to find similar patches $Q_1, \ldots, Q_m$ with paired $5 \times 5$ patches $P_1, \ldots, P_m$ and difference values $d_t = \|Q_q - Q_t\|_2$;
        for $t = 1, \ldots, m$ do
            Compute the patch weight $w_t = \exp(-d_t/\sigma_w)$;
            Record each patch $P_t$ and weight $w_t$;
        end
    end
    Compute the average image $A$ by weighted averaging of the overlapped patches $\{P\}$ with weights $\{w\}$;
    Set scaling factor $s_k = 1.25^k$;
    Compute a Gaussian kernel whose variance is $\sigma_k^2 = \sigma_6^2 \cdot \log(k)/\log(6)$;
    Set the initial value of the back-projected image $Y$ to $A$;
    for $t = 1, \ldots, l_b$ do
        Update the back-projected image $Y$ with respect to $I_0$, using a Gaussian projection kernel (variance $\sigma_k^2$), downscale and upscale factor $s_k$, and a back-projection kernel identical to the projection kernel;
    end
    Set $I_k = Y$;
    Add patch pairs $(Q, P)$ to $B$ from the image pair $I_{k-1}$ and $I_k$ as above;
end
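The weighted averaging step in Algorithm 1 (computing the average image $A$) amounts to accumulating weight-scaled patches and normalizing by the total weight at each pixel, as in the following minimal sketch with illustrative names:

```python
import numpy as np

def weighted_patch_average(shape, patches, positions, weights, size=5):
    """Blend overlapping patches into one image: accumulate weight * patch
    at each location and divide by the accumulated weight. Uncovered
    pixels stay zero, to be filled later by back-projection."""
    acc = np.zeros(shape)
    norm = np.zeros(shape)
    for P, (y, x), w in zip(patches, positions, weights):
        acc[y:y + size, x:x + size] += w * P
        norm[y:y + size, x:x + size] += w
    covered = norm > 0
    acc[covered] /= norm[covered]
    return acc
```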


For each high-resolution/low-resolution patch pair, we compose one concatenated feature vector. As the dimensions of the low-resolution and high-resolution patch features are different, we normalize both feature vectors independently in order to balance their contributions before concatenating them into one single vector. All of the concatenated feature vectors are normalized to unit-norm vectors for dictionary learning with group sparsity constraints. Due to the feature design, it is possible that both the high-resolution and the low-resolution feature vectors are zero; in such cases, these feature vectors are discarded.

To exploit the group similarity among patch pairs, we group pairs with similar feature vectors into clusters by K-means clustering, as sketched below. The features we use for clustering are the image gradients of the low-resolution patches only, disregarding the high-resolution patches, because the low-resolution patches are more reliable than the high-resolution ones.
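A possible form of this clustering step, assuming F_low holds one low-resolution gradient feature vector per row and that SciPy's kmeans2 is an acceptable stand-in for the K-means implementation; the cluster count shown is illustrative.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

num_clusters = 512  # illustrative value of the cluster number c
centroids, labels = kmeans2(F_low.astype(np.float64), num_clusters,
                            minit='++')

# The index sets U_j of patch pairs sharing a cluster.
clusters = [np.where(labels == j)[0] for j in range(num_clusters)]
```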

With a given dictionary $D$, we solve the group sparse coefficients for each cluster $U_i$ as

$$\min_{A_i} \|A_i\|_{1,2} \quad \text{s.t.} \quad \|Y_i - D A_i\|_F \leq \sqrt{n_i}\,\delta, \qquad (3)$$

where $\|A\|_{1,2} = \sum_{k=1}^{n} \|R_k\|_2$ and $R_k$ is the $k$-th row of $A$. In the equation above, $Y_i$ collects the feature vectors of cluster $U_i$ column-wise, $n_i$ is the number of columns of $Y_i$, $\|\cdot\|_F$ is the Frobenius norm, and $\delta$ is a threshold controlling how closely the reconstructed feature vectors should match the original feature vectors. We use the SPGL1 package [16] to solve this optimization problem.
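For illustration, the sketch below solves a Lagrangian relaxation of Equation (3), $\min_A \frac{1}{2}\|Y - DA\|_F^2 + \lambda\|A\|_{1,2}$, by proximal gradient descent; the row-wise shrinkage is the proximal operator of the $\ell_{1,2}$ norm. This simplified stand-in is not the constrained SPGL1 solver used above.

```python
import numpy as np

def group_soft_threshold(A, tau):
    """Prox of tau * ||A||_{1,2}: shrink each row of A in the l2 sense;
    rows with norm <= tau vanish, giving row (group) sparsity."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return A * scale

def group_sparse_code(Y, D, lam=0.1, iters=200):
    """ISTA-style proximal-gradient solver for the relaxed form of
    Eq. (3): min_A 0.5 * ||Y - D A||_F^2 + lam * ||A||_{1,2}."""
    A = np.zeros((D.shape[1], Y.shape[1]))
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz const. of grad
    for _ in range(iters):
        grad = D.T @ (D @ A - Y)
        A = group_soft_threshold(A - step * grad, lam * step)
    return A
```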

As the group sparse coefficients are solved within separate clusters and the dictionary is given before solving the above equation, we need to update the dictionary for the overall optimization. We denote by $A$ the union of all coefficients $A_i$, and by $Y$ the union of all feature vectors $Y_i$. The dictionary $D$ is updated by the K-SVD algorithm [13]:

$$D = \arg\min_{D} \|Y - DA\|_F \quad \text{s.t.} \quad \|D_j\|_2 = 1 \;\; \forall j, \qquad (4)$$

where $D_j$ is the $j$-th column of $D$. We iteratively solve for the group sparse coefficients in Equation (3) and update the dictionary in Equation (4) until both $A$ and $D$ converge. The product of the dictionary $D$ and the coefficients $A$ yields the resulting feature vectors, which exploit patch similarity not only within each cluster but also among all clusters. We use these feature vectors to generate the output high-resolution image. We summarize this step in Algorithm 2.
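One sweep of the K-SVD update for Equation (4) can be sketched as follows; this is the standard atom-by-atom rank-1 update of [13], with illustrative variable names.

```python
import numpy as np

def ksvd_update(Y, D, A):
    """Update each atom D_j (and the coefficients that use it) via a
    rank-1 SVD of the residual restricted to the signals using atom j;
    the leading left singular vector keeps ||D_j||_2 = 1."""
    D, A = D.copy(), A.copy()
    for j in range(D.shape[1]):
        users = np.nonzero(A[j, :])[0]  # signals whose codes use atom j
        if users.size == 0:
            continue
        # residual without atom j's contribution, on its users only
        E = Y[:, users] - D @ A[:, users] + np.outer(D[:, j], A[j, users])
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, j] = U[:, 0]
        A[j, users] = s[0] * Vt[0, :]
    return D, A
```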

3 Experimental Results

In this section we describe the experimental setup and present results of the proposed method and other algorithms. For all the experiments, we set the number of supporting low-resolution images $n = 6$, the number of nearest neighbors $m = 9$, the variance of the Gaussian blur kernel $\sigma^2 = 0.8$, the scaling factor $s = 3$, and the group sparse coding threshold $\delta = 0.05$. For a color input image, we convert it to YCbCr space and apply our algorithm only to the luma component Y, and simply interpolate the chroma components Cb and Cr bicubically, since human eyes are much more sensitive to luma than to chroma.
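The luma-only pipeline can be sketched as below; the BT.601 conversion matrix is standard, sr_luma is a placeholder for the proposed super-resolution routine, and the final crop guards against rounding differences between channels.

```python
import numpy as np
from scipy.ndimage import zoom

def super_resolve_color(rgb, s, sr_luma):
    """Upscale only the luma channel Y with sr_luma; the chroma channels
    Cb and Cr are simply interpolated bicubically."""
    m = np.array([[ 0.299,   0.587,   0.114 ],
                  [-0.1687, -0.3313,  0.5   ],
                  [ 0.5,    -0.4187, -0.0813]])
    ycbcr = rgb @ m.T + np.array([0.0, 0.5, 0.5])  # RGB assumed in [0, 1]
    y_hr = sr_luma(ycbcr[..., 0], s)          # proposed algorithm on luma
    cb_hr = zoom(ycbcr[..., 1], s, order=3)   # bicubic-like chroma upscale
    cr_hr = zoom(ycbcr[..., 2], s, order=3)
    h = min(y_hr.shape[0], cb_hr.shape[0])
    w = min(y_hr.shape[1], cb_hr.shape[1])
    out = np.stack([y_hr[:h, :w], cb_hr[:h, :w], cr_hr[:h, :w]], axis=-1)
    return (out - np.array([0.0, 0.5, 0.5])) @ np.linalg.inv(m).T
```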


Algorithm 2: Refine image through group sparse coding
Data: Image pyramid $\{I_k\}$, $k = -6, \ldots, n$, zooming factor $z$, Gaussian kernel variance $\sigma_6^2$, low-resolution patch size $m$, cluster number $c$, group sparsity threshold $\delta$, dictionary size $d$, dictionary update loop number $K$
Result: Refined high-resolution image $H$
for $k = 0, \ldots, 6$ do
    Denote the low-resolution image $L_k = I_{-k}$;
    Compute the expected scaling factor $s = 1.25^{-k} \cdot z$ and the index $t = \lceil \log(s)/\log(1.25) \rceil$;
    Denote the upsampled image $I_s = I_t$; set $\sigma^2 = \sigma_6^2 \cdot 6 \log(1.25)/\log(s)$;
    Compute $I_c$ by convolving $I_s$ with a Gaussian kernel whose variance is $\sigma^2$;
    Set the expected resolution $(h_h, w_h) = (s \cdot h_0, s \cdot w_0)$, where $(h_0, w_0)$ is $I_0$'s resolution;
    Compute $H_k$ by subsampling $I_c$ to the resolution $(h_h, w_h)$;
    for each $m \times m$ patch $P_i^l$ on $L_k$ do
        Set patch $P_i^h$ = the corresponding $mz \times mz$ patch of $P_i^l$ on $H_k$;
        Compute the high-resolution feature vector $f_i^{h,r} = P_i^h - \mathrm{mean}(P_i^h)$;
        Compute the low-resolution feature vector $f_i^{l,r}$ from the gradient vectors of $P_i^l$;
        Normalize $f_i^{h,r}$ to $f_i^{h,n}$ and record the norm value $v_i^h$; normalize $f_i^{l,r}$ to $f_i^{l,n}$;
        Concatenate the vectors $f_i^{h,n}$ and $f_i^{l,n}$ into a single vector $f_i^c$;
        Normalize the vector $f_i^c$ to $y_i$, and save $f_i^c$'s norm value $v_i^c$;
    end
end
Cluster all $\{f_i^{l,r}\}$ by K-means clustering to get $c$ cluster sets $\{U_j\}$, $j = 1, \ldots, c$; each $U_j$ contains the indexes of similar $f^{l,r}$;
Denote $Y$ as all vectors $\{y_i\}$ and set the initial dictionary $D_0$ = the first $d$ non-repeated $y_i$ vectors;
for $k = 1, \ldots, K$ do
    For every cluster $U_j$, find the coefficient set $A_j$ by Equation (3);
    Denote $A^k$ as all coefficient sets $\{A_j\}$, $j = 1, \ldots, c$, and compute the residual $r_k = \|Y - D_{k-1} A^k\|_F$;
    for each $m \times m$ patch $P_i^l$ on $L_0$ do
        Reconstruct $y_i^r = D_{k-1} \cdot a_i$, where $a_i$ is $y_i$'s coefficient vector in $A_j$;
        De-normalize $y_i^d = y_i^r \cdot v_i^c$;
        Reconstruct the normalized high-resolution feature vector $f_i^{h,n} = \mathrm{deconcatenate_{high}}(y_i^d)$;
        Reconstruct the de-normalized feature vector $f_i^{h,d} = f_i^{h,n} \cdot v_i^h$;
        Reconstruct the high-resolution intensity patch $P_i^{h,r} = f_i^{h,d} + \mathrm{mean}(P_i^h)$, where $P_i^h$ is $P_i^l$'s corresponding $mz \times mz$ patch on $H_k$;
    end
    Compute $H_k$ = average of the overlapped $P_i^{h,r}$;
    Update dictionary $D_k$ from $D_{k-1}$ by Equation (4);
end
Set $H = H_{k^*}$, where $k^* = \arg\min_k r_k$;


To compare with state-of-the-art example-based algorithms, we use the original code provided by [10] and implement the algorithm of [12].¹ More results and MATLAB code can be found at http://eng.ucmerced.edu/people/cyang35.

We use images in the Berkeley segmentation dataset [17] for experiments.

As shown in Figs. 2-7, the proposed algorithm generates sharper images with fewer artifacts than those obtained by the example-based super-resolution algorithm [10]. Due to space limitations, we cannot present the full-resolution images in this manuscript, and these images are best viewed on high-resolution displays (additional high-resolution results can be found in the supplementary material). For example, the super-resolution images generated by [10] have more artifacts along vertical strips or regions with intensity discontinuities, e.g., the horse legs in Fig. 2, the swimmer's cap in Fig. 4, the gentleman's collar in Fig. 5, and the stripes in Fig. 7. In addition, the proposed algorithm outperforms conventional super-resolution using bicubic interpolation. These results can be explained by the assumption of example-based super-resolution algorithms, which entails finding matches between low-resolution and high-resolution image pairs from a large training set. This assumption does not always hold when the training set contains disparate images that are not directly relevant to the test image (i.e., the trade-off between generality and specificity). In contrast, our algorithm does not have this problem because the training set is constructed directly from the input frame rather than from a fixed dictionary.

Compared with the results generated by [12], the super-resolution images produced by our method also have fewer artifacts, e.g., along the antlers of the deer in Fig. 3 and the facial regions around the eyes and mouth in Fig. 6. The success of [12] depends on whether there are plentiful similar patches in the image pyramid generated from the input frame. For images with numerous repetitive patterns (e.g., sunflower fields or butterfly wings), that algorithm tends to work well. It is not expected to perform well for an image containing a unique object, e.g., a human standing in a natural scene as shown in Fig. 6. As this unique object occupies a relatively small region, the algorithm is unable to find a sufficient number of similar patches in the natural image using the low-resolution patches from the unique object (e.g., faces), and consequently produces improper high-resolution patches (i.e., it generates super-resolution image patches of foreign objects). The resulting effects are especially noticeable because these unique objects are usually the focus of attention in such images. Our proposed algorithm does not suffer such artifacts because we exploit both group similarity and patch similarity, rather than mere patch similarity as in [12]. Although the patches on human faces are few, they can be included in similar groups to maintain the similarity in the dictionary learning, and consequently our method produces far fewer artifacts in the super-resolution images.

¹ This is based on our best effort to implement the algorithm of Glasner et al. [12] with their help and suggestions, as the authors do not release their code. The results may not be exactly the same as their reported results due to parameter settings.


(a) Bicubic (b) Yang et al. [10] (c) Proposed
Fig. 2. Horse (results best viewed on a high-resolution display). Our result shows sharper edges than bicubic interpolation and fewer artifacts than [10] along the fence and the front legs.

4 Concluding Remarks

In this paper we propose an example-based super-resolution algorithm that exploits self-similarities using one single input image. We exploit self-similarities on two fronts: in generating image pairs and in learning a dictionary with group sparsity. Experimental results show our algorithm is able to achieve state-of-the-art super-resolution results. Our future work will focus on algorithms that take the geometric relationships among image patches into account for efficient and effective dictionary learning.

Acknowledgments

We would like to thank Daniel Glasner and Oded Shahar for numerous discussions regarding implementation details of their super-resolution algorithm.

Appendix

Proof of Proposition 1: Assume $s_2 = s_1^n$, where $n$ is a natural number, and that the subsampling operator $\downarrow$ does not decrease image quality. Then $I_{in} * B_2$ is equivalent to $((((I_{in} * B_1) \downarrow_{s_1}) * B_1) \downarrow_{s_1} \cdots * B_1) \downarrow_{s_1}$.

Also assuming the subsampling operator can be ignored, it implies $I_{in} * B_2 = ((I_{in} * B_1) * B_1) \cdots * B_1$ ($n$ times). Using the associativity of convolution in the discrete domain, i.e., $(f * g) * h = f * (g * h)$, it follows that $I_{in} * B_2 = I_{in} * (B_1 * \cdots * B_1)$, and hence $B_2 = B_1 * \cdots * B_1$ ($n$ times).

Because we use Gaussian blur kernels and the convolution of two Gaussian kernels is a Gaussian kernel whose variance is the sum of the two variances, $B_2 = B_1 * \cdots * B_1$ implies $\sigma_2^2 = n \cdot \sigma_1^2$. Combining $\sigma_2^2 = n \cdot \sigma_1^2$ with $s_2 = s_1^n$ (i.e., $n = \log(s_2)/\log(s_1)$), it follows that $\sigma_1^2 = \sigma_2^2 \cdot \log(s_1)/\log(s_2)$. □
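The key step, that Gaussian variances add under convolution, is easy to verify numerically (discretization and boundary handling cause only small deviations):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Blurring twice with variance v each should equal one blur with variance 2v.
rng = np.random.default_rng(0)
img = rng.random((128, 128))
v = 1.5
twice = gaussian_filter(gaussian_filter(img, np.sqrt(v)), np.sqrt(v))
once = gaussian_filter(img, np.sqrt(2 * v))
print(np.abs(twice - once).max())  # small discretization/boundary error
```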


(a) Glasner et al. [12] (b) Yang et al. [10] (c) Proposed
Fig. 3. Deer (results best viewed on a high-resolution display). Compared with the result generated by [12], our super-resolution image has fewer artifacts (e.g., the antler region is smoother). Compared with the result generated by [10], our super-resolution image also has fewer artifacts in the antler region.

(a) Glasner et al. [12] (b) Yang et al. [10] (c) Proposed
Fig. 4. Swimmer (results best viewed on a high-resolution display). Compared with the result generated by [12], our result has fewer artifacts (e.g., in the muscle and rib regions). Compared with the result generated by [10], our result has fewer artifacts (e.g., around the head region).

(a) Glasner et al. [12] (b) Yang et al. [10] (c) Proposed
Fig. 5. Gentleman (results best viewed on a high-resolution display). Compared with the result generated by [12], our result has fewer artifacts (e.g., on the forehead). Compared with the result generated by [10], our result has fewer artifacts (e.g., on the collar region).


(a) Original (b) Proposed (c) Yang et al. [10] (d) Glasner et al. [12]
Fig. 6. Boy (results best viewed on a high-resolution display). Compared with the result generated by [12], our super-resolution image has fewer artifacts (e.g., several blotches in the facial and collar regions). Compared with the result generated by [10], our super-resolution image has fewer artifacts (e.g., several large blotches in the lip and contour regions).

(a) Bicubic (b) Yang et al. [10] (c) Proposed
Fig. 7. Young Man (results best viewed on a high-resolution display). Our result shows sharper edges than bicubic interpolation and fewer artifacts than [10] along the collar and the stripes.


References

1. Park, S.C., Park, M.K., Kang, M.G.: Super-resolution image reconstruction: A technical overview. IEEE Signal Processing Magazine (2003) 21–36
2. Morse, B., Schwartzwald, D.: Image magnification using level set reconstruction. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. (2001) 333–341
3. Fattal, R.: Image upsampling via imposed edge statistics. In: ACM SIGGRAPH 2007 Papers, ACM (2007)
4. Irani, M., Peleg, S.: Improving resolution by image registration. Computer Vision, Graphics and Image Processing 53 (1991) 231–239
5. Lin, Z., Shum, H.Y.: Fundamental limits of reconstruction-based super-resolution algorithms under local translation. IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (2004) 83–97
6. Freeman, W.T., Jones, T.R., Pasztor, E.C.: Example-based super-resolution. IEEE Computer Graphics and Applications (2002) 56–65
7. Sun, J., Zheng, N.N., Tao, H., Shum, H.Y.: Image hallucination with primal sketch priors. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Volume 2. (2003) 729–736
8. Chang, H., Yeung, D.Y., Xiong, Y.: Super-resolution through neighbor embedding. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. (2004) 275–282
9. Sun, J., Sun, J., Xu, Z., Shum, H.Y.: Image super-resolution using gradient profile prior. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. (2008)
10. Yang, J., Wright, J., Huang, T., Ma, Y.: Image super-resolution via sparse representation of raw image patches. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. (2008)
11. Xiong, X., Sun, X., Wu, F.: Image hallucination with feature enhancement. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. (2009)
12. Glasner, D., Bagon, S., Irani, M.: Super-resolution from a single image. In: Proceedings of IEEE International Conference on Computer Vision. (2009) 349–356
13. Aharon, M., Elad, M., Bruckstein, A.: K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing 54 (2006) 4311–4322
14. Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A.: Non-local sparse models for image restoration. In: Proceedings of IEEE International Conference on Computer Vision. (2009) 2272–2279
15. Arya, S., Mount, D.M.: Approximate nearest neighbor queries in fixed dimensions. In: SODA '93: Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms. (1993) 271–280
16. van den Berg, E., Friedlander, M.P.: SPGL1: A solver for large-scale sparse reconstruction (2007). http://www.cs.ubc.ca/labs/scl/spgl1
17. Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Proceedings of IEEE International Conference on Computer Vision. (2001) 416–423
