

$$\int_{\mathbb{R}^N} u\,\frac{\partial^2 \varphi}{\partial x_i\,\partial x_j}\,dx.$$

We thus conclude that

$$\sup_{\varphi}\,\int_{\mathbb{R}^N} u\,\frac{\partial^2 \varphi}{\partial x_i\,\partial x_j}\,dx \;\le\; C,$$

which, arguing as in the previous section, says that $u \in BV^2(\mathbb{R}^N)$, in particular that $D^2u \in \mathcal{M}(\mathbb{R}^N, \mathbb{R}^{N\times N})$ and

$$|D^2u|(\mathbb{R}^N) \;\le\; \Gamma_{L^1(\mathbb{R}^N)}\text{-}\liminf_{n\to\infty} F_n(u) \quad \text{for every } u \in L^1(\mathbb{R}^N).$$

For the upper bound we observe that if $u \in C_c^2(\mathbb{R}^N)$, we have by the uniform convergence of Theorem 1.4 and the fact that $u$ is sufficiently smooth with compact support that

$$\lim_{n\to\infty} F_n(u) = |D^2u|(\mathbb{R}^N).$$

Then choosing $u_n = u$ we conclude that

$$\Gamma_{L^1(\mathbb{R}^N)}\text{-}\limsup_{n\to\infty} F_n(u) \;\le\; \lim_{n\to\infty} F_n(u) = |D^2u|(\mathbb{R}^N).$$

Now, taking the lower semicontinuous envelope with respect to $L^1(\mathbb{R}^N)$ strong convergence, and using that both the $\Gamma_{L^1(\mathbb{R}^N)}\text{-}\limsup$ and the mapping $u \mapsto |D^2u|(\mathbb{R}^N)$ are lower semicontinuous on $L^1(\mathbb{R}^N)$ (for the $\Gamma\text{-}\limsup$ see [DM93, Prop. 6.8]), we deduce that

$$\Gamma_{L^1(\mathbb{R}^N)}\text{-}\limsup_{n\to\infty} F_n(u) \;\le\; \mathrm{sc}^-_{L^1(\mathbb{R}^N)}\,|D^2u|(\mathbb{R}^N) = |D^2u|(\mathbb{R}^N) \quad \text{for all } u \in L^1(\mathbb{R}^N).$$

4. Extensions and applications.

4.1. An asymmetric extension. In the previous sections we have shown that our nonlocal definition of $\mathcal{H}_n$ as in (1.11) localizes to the classical distributional Hessian for a specific choice of the weights $\rho_n$ and thus can rightfully be called a nonlocal Hessian. In numerical applications, however, the strength of such nonlocal models lies in the fact that the weights can be chosen to have nonlocal interactions and to model specific patterns in the data. A classic example is the nonlocal total variation [GO08]:

$$J_{NL\text{-}TV}(u) = \int_\Omega\int_\Omega |u(x) - u(y)|\,w(x,y)\,dy\,dx. \tag{4.1}$$

A possible choice is to make $w(x, y)$ large if the patches (neighborhoods) around $x$ and $y$ are similar with respect to a patch distance $d_a$, such as a weighted $\ell^2$ norm, and small if they are not. In [GO08] this is achieved by setting $w(x, y) = 1$ if the neighborhood around $y$ is one of the $K \in \mathbb{N}$ closest to the neighborhood around $x$ in a search window, and $w(x, y) = 0$ otherwise. In effect, if the image contains a repeating pattern with a defect that is small enough not to throw off the patch distances too much, it will be repaired, as long as most similar patterns do not show the defect.
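To make this construction concrete, the following is a minimal sketch (Python/NumPy; the function name and all parameters are illustrative, not taken from [GO08]) of the K-nearest-patches rule just described, using a plain squared $\ell^2$ patch distance:

```python
import numpy as np

def knn_patch_weights(img, x, patch=5, search=21, K=10):
    """Sketch of [GO08]-style binary weights: w(x, y) = 1 for the K patches
    in a search window around x that are closest to the patch at x, and
    w(x, y) = 0 otherwise. Assumes x lies far enough from the image border
    that all candidate patches fit inside `img`."""
    p, s = patch // 2, search // 2
    r0, c0 = x
    ref = img[r0 - p:r0 + p + 1, c0 - p:c0 + p + 1]   # reference patch at x
    dists = {}
    for r in range(r0 - s, r0 + s + 1):
        for c in range(c0 - s, c0 + s + 1):
            if (r, c) == (r0, c0):
                continue
            cand = img[r - p:r + p + 1, c - p:c + p + 1]
            if cand.shape == ref.shape:               # patch fully inside image
                dists[(r, c)] = float(np.sum((cand - ref) ** 2))
    nearest = sorted(dists, key=dists.get)[:K]        # K most similar patches
    return {y: 1.0 for y in nearest}                  # w(x, y) = 1 on these y
```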

Computing suitable weights is much less obvious in the case of $\mathcal{H}$. We can formally extend (1.11) using arbitrary pairwise weights $\rho\colon \mathbb{R}^N \times \mathbb{R}^N \to \mathbb{R}$,

$$\mathcal{H}_\rho u(x) = C \int_{\mathbb{R}^N} \frac{u(x+z) - 2u(x) + u(x-z)}{|z|^2}\;\frac{(N+2)\,zz^{\top} - |z|^2 I_N}{|z|^2}\;\rho_x(z)\,dz, \tag{4.2}$$

and use it to create nonlocal generalizations of functionals such as TV$^2$, for example to minimize the nonlocal $L^2$-TV$^2$ model

$$f(u) := \int_{\mathbb{R}^N} |u - g|^2\,dx + \alpha \int_{\mathbb{R}^N} |\mathcal{H}_\rho u|\,dx. \tag{4.3}$$

However, apart from being formulated on $\mathbb{R}^N$ instead of $\Omega$, formulation (4.2) has an important drawback compared to the first-order formulation (4.1): while the weights are defined between two points $x$ and $y$, the left part of the integrand uses the values of $u$ not only at $x$ and $y$ but also at the “mirrored” point $x + (x - y)$. In fact we can replace the weighting function by the symmetrized version $\frac{1}{2}\{\rho_x(y-x) + \rho_x(x-y)\}$, which in effect relates three points instead of two and limits the choice of possible weighting functions.
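To fix ideas, the following is a naive numerical sketch (Python/NumPy) of the explicit formula (4.2) for a 2D image, summing second differences over a truncated window of offsets; the truncation radius R, the constant C, and the weight function rho are placeholders of this sketch, not choices made in the paper:

```python
import numpy as np

def explicit_nonlocal_hessian(u, x, rho, C=1.0, R=10):
    """Naive discretization of (4.2) at pixel x of a 2D image u (N = 2).
    `rho` maps an offset z (length-2 array) to a nonnegative weight.
    Assumes x is at least R pixels away from the image border."""
    N = 2
    H = np.zeros((N, N))
    r0, c0 = x
    for dr in range(-R, R + 1):
        for dc in range(-R, R + 1):
            if dr == 0 and dc == 0:
                continue
            z = np.array([dr, dc], dtype=float)
            n2 = z @ z                                 # |z|^2
            # second difference couples u(x+z), u(x), and u(x-z)
            d2u = u[r0 + dr, c0 + dc] - 2 * u[r0, c0] + u[r0 - dr, c0 - dc]
            M = ((N + 2) * np.outer(z, z) - n2 * np.eye(N)) / n2
            H += (d2u / n2) * M * rho(z)
    return C * H
```

Note how each term of the sum indeed couples the three values $u(x+z)$, $u(x)$, and $u(x-z)$, which is exactly the drawback discussed above.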

In this section we therefore introduce a more versatile extension of (1.11) that allows for fully nonsymmetric weights. We start with the realization that the finite-difference integrand in (4.2) effectively comes from canceling the first-order differences in the Taylor expansion of $u$ around $x$, which couples the values of $u$ at $x$, $y$, and $x + (x - y)$ into one term. Instead, we can avoid this coupling by directly looking for a nonlocal gradient $G_u(x) \in \mathbb{R}^N$ and a nonlocal Hessian $H_u(x) \in \mathrm{Sym}(\mathbb{R}^{N\times N})$ that best explain $u$ around $x$ in terms of a quadratic model, i.e., that take the place of the gradient and Hessian in the Taylor expansion:

$$(G_u(x), H_u(x)) := \operatorname*{argmin}_{G_u\in\mathbb{R}^N,\,H_u\in\mathrm{Sym}(\mathbb{R}^{N\times N})} \frac{1}{2}\int_{\Omega - \{x\}} \Big(u(x+z) - u(x) - G_u^{\top}z - \tfrac{1}{2}z^{\top}H_u z\Big)^2\,\sigma_x(z)\,dz. \tag{4.4}$$

Here the variable $x + z$ takes the place of $y$ in (1.11). We refer to definition (4.4) as the implicit nonlocal Hessian, as opposed to the explicit formulation (4.2).

The advantage is that any terms involving $\sigma_x(z)$ are now based only on the two values of $u$ at $x$ and $y = x + z$, and domains other than $\mathbb{R}^N$ (in particular bounded ones) are naturally dealt with, which is important for a numerical implementation. We also note that this approach allows us to incorporate nonlocal first-order terms as a side effect, and that it can be naturally extended to third- and higher-order derivatives, which we leave to future work.

With respect to implementation, the implicit model (4.4) does not add much to the overall difficulty: it is enough to add the nonlocal gradient and Hessian $G_u(x) \in \mathbb{R}^N$ and $H_u(x) \in \mathbb{R}^{N\times N}$ as additional variables to the problem and couple them to $u$ by adding the optimality conditions for (4.4) to the problem. Since (4.4) is a finite-dimensional quadratic minimization problem, the optimality conditions are linear, i.e., of the form $A_{u,x} H_u(x) = b_{u,x}$. Such linear constraints can be readily added to most solvers capable of solving the local problem. Alternatively, the matrices $A_{u,x}$ can be inverted explicitly in a precomputation step;

however, we did not find this to increase overall performance.
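As an illustration of how little machinery the implicit model needs, here is a sketch (Python/NumPy; function and variable names are ours, not from the paper) of the pointwise quadratic fit (4.4) on a 2D grid, solved directly as a weighted least-squares problem rather than through the coupled optimality conditions:

```python
import numpy as np

def implicit_nonlocal_hessian(u, x, offsets, sigma):
    """Fit u(x+z) - u(x) ~ G.z + z^T H z / 2 over the given integer offsets z,
    weighted by sigma(z), at pixel x of the 2D image u. Unknowns are G (2
    entries) and the symmetric H (3 entries H11, H22, H12), so the normal
    equations are exactly the small linear system A_{u,x} (G, H) = b_{u,x}
    mentioned in the text. Needs at least 5 offsets to be determined."""
    r0, c0 = x
    rows, rhs, w = [], [], []
    for z1, z2 in offsets:
        # columns: G1, G2, H11, H22, H12 (symmetry enforced by construction)
        rows.append([z1, z2, 0.5 * z1 * z1, 0.5 * z2 * z2, z1 * z2])
        rhs.append(u[r0 + z1, c0 + z2] - u[r0, c0])
        w.append(np.sqrt(sigma((z1, z2))))
    A = np.asarray(rows) * np.asarray(w)[:, None]     # weighted design matrix
    b = np.asarray(rhs) * np.asarray(w)
    p, *_ = np.linalg.lstsq(A, b, rcond=None)         # solve the quadratic fit
    G = p[:2]
    H = np.array([[p[2], p[4]], [p[4], p[3]]])
    return G, H
```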

While the implicit and explicit models look different at first glance, from the considerations about the Taylor expansion we expect them to be closely related, and in fact this is true, as shown in the following theorem.

Theorem 4.1. Let $\rho$ be a radially symmetric function satisfying the conditions (1.4)–(1.6), let $u \in BV^2(\mathbb{R}^N)$ with $\nabla u$ Lipschitz, let $x \in \mathbb{R}^N$, and let

$$(G_u(x), H_u(x)) := \operatorname*{argmin}_{G_u\in\mathbb{R}^N,\,H_u\in\mathrm{Sym}(\mathbb{R}^{N\times N})} \frac{1}{2}\int_{\mathbb{R}^N} \Big(u(x+z) - u(x) - G_u^{\top}z - \tfrac{1}{2}z^{\top}H_u z\Big)^2\,\frac{\rho(z)}{|z|^4}\,dz. \tag{4.5}$$

Then the optimal $H_u(x)$ is given by the explicit nonlocal Hessian, i.e.,

$$H_u(x) = \mathcal{H}u(x) = \frac{N(N+2)}{2}\int_{\mathbb{R}^N} \big(u(x+z) - u(x)\big)\left(\frac{zz^{\top}}{|z|^2} - \frac{I_N}{N+2}\right)\frac{\rho(z)}{|z|^2}\,dz.$$

This means that by setting $\Omega = \mathbb{R}^N$, assuming some extra regularity for $u$, and substituting the weights $\sigma_x(z)$ with $\rho(z)/|z|^4$, the implicit nonlocal Hessian coincides with the explicit one.

In order to prove Theorem 4.1 we first need the following lemma, whose only difference is that we integrate over $\mathbb{R}^N \setminus B(0,\varepsilon)$ in order to deal with the singularity $1/|z|^4$ introduced at the origin.

Lemma 4.2. Let $\rho$ be a radially symmetric function satisfying the conditions (1.4)–(1.6), let $u \in BV^2(\mathbb{R}^N)$, let $x \in \mathbb{R}^N$, and let

$$(G, H) := \operatorname*{argmin}_{G\in\mathbb{R}^N,\,H\in\mathrm{Sym}(\mathbb{R}^{N\times N})} \frac{1}{2}\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \Big(u(x+z) - u(x) - G^{\top}z - \tfrac{1}{2}z^{\top}Hz\Big)^2\,\frac{\rho(z)}{|z|^4}\,dz. \tag{4.6}$$

Then the optimal $H$ is given by

$$H_{ij} = C \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \big(u(x+z) - u(x)\big)\left(\frac{z_i z_j}{|z|^2} - \frac{\delta_{ij}}{N+2}\right)\frac{\rho(z)}{|z|^2}\,dz,$$

with a constant $C = C(\varepsilon, N, \rho)$ identified in the proof below.


Proof. The optimality conditions for (4.6) in terms of $G$ and $H$ are

$$\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \Big(u(x+z) - u(x) - G^{\top}z - \tfrac{1}{2}z^{\top}Hz\Big)\,z\,\frac{\rho(z)}{|z|^4}\,dz = 0, \tag{4.7}$$

$$\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \Big(u(x+z) - u(x) - G^{\top}z - \tfrac{1}{2}z^{\top}Hz\Big)\,zz^{\top}\,\frac{\rho(z)}{|z|^4}\,dz = 0 \tag{4.8}$$

under the constraint $H \in \mathrm{Sym}(\mathbb{R}^{N\times N})$. Note that from a nonsymmetric solution $H_0$ we can always construct a symmetric solution $H \in \mathrm{Sym}(\mathbb{R}^{N\times N})$ by letting $H = \frac{1}{2}(H_0 + H_0^{\top})$, as the equations are invariant under transposition of $H$. The above conditions correspond to the following sets of $N$ and $N^2$ equations, respectively:

$$\sum_{k} G_k \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i z_k\,\frac{\rho(z)}{|z|^4}\,dz + \frac{1}{2}\sum_{k,l} H_{kl} \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i z_k z_l\,\frac{\rho(z)}{|z|^4}\,dz = \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \big(u(x+z) - u(x)\big)\,z_i\,\frac{\rho(z)}{|z|^4}\,dz,$$

$$\sum_{k} G_k \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i z_j z_k\,\frac{\rho(z)}{|z|^4}\,dz + \frac{1}{2}\sum_{k,l} H_{kl} \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i z_j z_k z_l\,\frac{\rho(z)}{|z|^4}\,dz = \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \big(u(x+z) - u(x)\big)\,z_i z_j\,\frac{\rho(z)}{|z|^4}\,dz.$$

Collecting the unknowns in a vector $p = (p_G, p_H) \in \mathbb{R}^m$, $m = N + N^2$, this can be rewritten as a linear system with an $m \times m$ block matrix,

$$\begin{pmatrix} A & V \\ V^{\top} & B \end{pmatrix} p = \begin{pmatrix} a \\ b \end{pmatrix}.$$

The entries in the submatrices $V$ are all of the form

$$\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i z_j z_k\,\frac{\rho(z)}{|z|^4}\,dz, \qquad i, j, k \in \{1,\dots,N\}.$$

No matter what the choice of i, j, k is, there is always one index with an odd power, so every one of these integrals is zero due to symmetry and the fact that ρ is even, i.e., ρ(z) = ρ(−z).
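This vanishing of the odd moments is easy to check numerically; the following quick Monte Carlo sanity check (Python/NumPy, with a standard Gaussian as one concrete radially symmetric $\rho$) is a sketch for illustration, not part of the proof:

```python
import numpy as np

# Estimate \int z_i z_j z_k rho(z)/|z|^4 dz for rho = standard Gaussian in R^3
# by sampling z ~ rho; every estimate should vanish up to Monte Carlo error.
rng = np.random.default_rng(0)
z = rng.standard_normal((2_000_000, 3))
w = 1.0 / np.sum(z ** 2, axis=1) ** 2                 # the 1/|z|^4 factor
for i, j, k in [(0, 0, 1), (0, 1, 2), (1, 1, 1)]:
    m = np.mean(z[:, i] * z[:, j] * z[:, k] * w)
    print((i, j, k), f"{m:+.1e}")                     # each ~ 0
```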

This means that the conditions on the gradient and the Hessian parts of p decouple, i.e., the problem is

$$A\,p_G = a, \qquad B\,p_H = b. \tag{4.9}$$

We can therefore look at the isolated problem of computing the Hessian part $p_H$, or equivalently $H$, without interference from the gradient part. The matrix $B$ has entries of the form

$$\frac{1}{2}\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i z_j z_{i_0} z_{j_0}\,\frac{\rho(z)}{|z|^4}\,dz.$$


Again due to symmetry, the only way that this integral is nonzero is if there are no odd powers, so either all indices are the same or there are two pairs:

$$\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i^2 z_j^2\,\frac{\rho(z)}{|z|^4}\,dz = \frac{1}{N(N+2)}\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \rho(z)\,dz \;\;(i \neq j), \qquad \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i^4\,\frac{\rho(z)}{|z|^4}\,dz = \frac{3}{N(N+2)}\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \rho(z)\,dz.$$

The right-hand side vector $b$ in (4.9) is

$$b_{ij} = \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \big(u(x+z) - u(x)\big)\,z_i z_j\,\frac{\rho(z)}{|z|^4}\,dz,$$

and the row of the system $B p_H = b$ corresponding to the pair $(i_0, j_0)$ reads

$$\frac{1}{2}\sum_{i,j} H_{ij} \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_i z_j z_{i_0} z_{j_0}\,\frac{\rho(z)}{|z|^4}\,dz = b_{i_0 j_0}. \tag{4.10}$$

We consider the different cases for $i_0$, $j_0$ separately.

Case $i_0 \neq j_0$. All the terms on the left-hand side of (4.10) vanish except for $(i, j) = (i_0, j_0)$ and $(i, j) = (j_0, i_0)$. Then we are done for this case by simply observing that

$$C^{-1} := 2 \cdot \frac{1}{2}\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} z_{i_0}^2 z_{j_0}^2\,\frac{\rho(z)}{|z|^4}\,dz = \frac{1}{N(N+2)}\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \rho(z)\,dz,$$

where the last equality follows from (3.5).

Case $i_0 = j_0$. All the terms on the left-hand side of (4.10) vanish except for those with $i = j$; i.e., we are left with the $N \times N$ system $M (H_{11}, \dots, H_{NN})^{\top} = (b_{11}, \dots, b_{NN})^{\top}$ for the diagonal entries, where, by the moment identities above, $M = \frac{1}{2N(N+2)}\big(\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \rho(z)\,dz\big)\,(2I + E)$ and $E$ is the all-ones matrix. Thus

$$D := M^{-1} \;\xrightarrow[\varepsilon\to 0]{}\; \frac{N(N+2)}{2}\left(I - \frac{1}{N+2}E\right),$$

and the lemma has been shown.

Remark 4.3. Note that we did not have to explicitly compute the optimal gradient $G$ in the lemma above. However, we can bound $G$ uniformly in $\varepsilon$. For that we assume that $\nabla u$ is Lipschitz, say with constant $L$. Indeed, the gradient part in (4.9) reads as

$$G_i \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \frac{z_i^2}{|z|^4}\,\rho(z)\,dz = \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \big(u(x+z) - u(x)\big)\,\frac{z_i}{|z|^4}\,\rho(z)\,dz, \qquad i = 1, \dots, N.$$

By symmetry we have

$$\int_{\mathbb{R}^N} \frac{z_i^2}{|z|^4}\,\rho(z)\,dz = \frac{1}{N}\int_{\mathbb{R}^N} \frac{1}{|z|^2}\,\rho(z)\,dz;$$

therefore from the first equation we conclude the bound

$$|G_i| \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \frac{\rho(z)}{|z|^2}\,dz = N\left|\int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \frac{u(x+z) - u(x)}{|z|}\,\frac{z_i}{|z|}\,\frac{\rho(z)}{|z|^2}\,dz\right| \le LN \int_{\mathbb{R}^N\setminus B(0,\varepsilon)} \frac{\rho(z)}{|z|^2}\,dz \;\;\Longrightarrow\;\; |G_i| \le LN.$$

We are now ready to prove Theorem 4.1, which is based on Lemma 4.2, Remark 4.3, and a Γ-convergence argument.

Proof of Theorem 4.1. Let $F$ and $F_\varepsilon$ be the functionals minimized in (4.5) and (4.6), respectively, for the given point $x$. From Fatou's lemma the functionals $F_\varepsilon$ are lower semicontinuous. Also

$$F_\varepsilon \to F \quad \text{increasingly and pointwise as } \varepsilon \to 0,$$

and thus from [DM93, Rem. 5.5] we have that the functionals $F_\varepsilon$ Γ-converge to $F$ as $\varepsilon \to 0$. Finally, from the fact that $u \in BV^2(\mathbb{R}^N)$ and Remark 4.3, the minimizers of $F_\varepsilon$ are bounded uniformly in $\varepsilon$. Then, by [DM93, Thm. 7.4], $F$ attains its minimum and

$$\operatorname{argmin} F = \lim_{\varepsilon\to 0}\, \operatorname{argmin} F_\varepsilon = \mathcal{H}u(x),$$

where the argmin here refers to the Hessian part of the minimizers.

Note that while Theorem 4.1 requires radial symmetry of $\rho$, this symmetry was generally assumed throughout section 3. Therefore section 3 can also be seen as providing localization results for implicit models of the form (4.5).


Figure 2. Adaptive choice of the neighborhood for the image of a disc with constant slope and added Gaussian noise. Left: A standard discretization of the second-order derivatives at the marked point uses all points in a 3 × 3 neighborhood. Center: With a suitable choice of the weights $\sigma_x$, the discretization of the (nonlocal) Hessian at the marked point only (or mostly) involves points that are likely to belong to the same affine region, here the inside of the disc. This allows one to use a straightforward second-order regularizer such as the norm of the Hessian $H_u$, while preserving jumps. Right: Geodesic distance $d_{\mathcal{M}}$ from the marked point $x$ in the lower right. Points that are separated from $x$ by a strong edge have a large geodesic distance to $x$ and are therefore not included in the neighborhood of $x$ that is used to define the nonlocal Hessian at $x$.

4.2. Choosing the weights for jump preservation. A characteristic of nonlocal models is that they are extremely flexible due to the many degrees of freedom in choosing the weights.

In this work we will focus on the question of how to reconstruct images that are piecewise affine but may have jumps. The issue here is that one wants to keep the Hessian sparse in order to favor piecewise affine functions, but doing so in a straightforward way, such as by adding $|D^2u|(\Omega)$ as a regularizer, enforces too much first-order regularity [Sch98, LLT03, LT06, HS06].

There have been several attempts to overcome this issue, most notably approaches based on combined first- and higher-order functionals (see [PS14] and the references therein), infimal convolution [CL97], and TGV [BKP10, SST11]. Here we propose another strategy, making use of the nonlocal formulation (4.4). Note that for general fixed weights $\rho$, even in the implicit model, finiteness of $\int_\Omega |\mathcal{H}_\rho u|\,dx$ does not require the existence of $|D^2u|$, as Theorem 1.7 only concerns (specific) sequences of $\rho$.

We draw our motivation for choosing the weights partly from a recent discussion of nonlocal “amoeba” filters [LDM07, WBV11, Wel12]. Amoeba filters use classical techniques such as iterated median filtering, but the structuring element is chosen in a highly adaptive, local way that can follow the image structures, instead of being restricted to a small parametrized set of shapes. In the following we propose extending this idea to the higher-order energy minimization framework (Figure 2).

Given a noisy image $g : \Omega \to \mathbb{R}$, we first compute its (equally noisy) gradient image $\nabla g$ (in all of this section we consider only the discretized problem, so we can assume that the gradient exists). We then define the Riemannian manifold $\mathcal{M}$ on the points of $\Omega$ with the usual Riemannian metric, weighted by $\varphi(\nabla g)$ for some function $\varphi : \mathbb{R}^N \to \mathbb{R}_+$. In our experiments we used

$$\varphi(\nabla g) := |\nabla g|^2 + \gamma \tag{4.11}$$

for small $\gamma > 0$, but other choices are equally possible. The choice of $\gamma$ controls how strongly the edge information is taken into account: for $\gamma = 0$ the weights are fully anisotropic, while for large $\gamma$ the magnitude of the gradient becomes irrelevant and the method reduces to an isotropic regularization. With these definitions, the geodesic distance $d_{\mathcal{M}}$ between two points $x, y \in \Omega$ now has a powerful interpretation: if $d_{\mathcal{M}}(x, y)$ is large, this indicates that $x$ and $y$ are separated by a strong edge and should therefore not appear in the same regularization term. On the other hand, a small $d_{\mathcal{M}}(x, y)$ indicates that $x$ and $y$ are part of a more homogeneous region, and it is reasonable to assume that they should belong to the same affine piece of the reconstructed image.

For a given point $x \in \Omega$, we sort the neighbors $y_1, y_2, \dots$ in order of ascending distance, i.e., $d_{\mathcal{M}}(x, y_1) \le d_{\mathcal{M}}(x, y_2) \le \cdots$. We choose a neighborhood size $M \in \mathbb{N}$ and set the weights to

$$\sigma_x(y_i) := \begin{cases} 1, & i \le M, \\ 0, & i > M. \end{cases} \tag{4.12}$$

In other words, the nonlocal Hessian at x is computed using its M closest neighbors with respect to the geodesic distance through the gradient image.

The geodesic distances $d_{\mathcal{M}}(x, \cdot)$ can be efficiently computed using the fast marching method [Set99, OF03] by solving the eikonal equation

$$|\nabla c(y)| = \varphi(\nabla g(y)), \tag{4.13}$$
$$c(x) = 0, \tag{4.14}$$

and setting $d_{\mathcal{M}}(x, y) = c(y)$. Although it is necessary to perform this step for every point $x \in \Omega$, in practice it is a relatively cheap operation: the fast marching method visits the neighbors of $x$ in order of ascending distance $d_{\mathcal{M}}$, which means that it can be stopped after $M$ points, with $M$ usually between 5 and 20. If $M$ is chosen too small, one risks that the linear equation system that defines the nonlocal Hessian in (4.4) becomes underdetermined. In our experiments we found $M = 12$ to be a good compromise, but the choice does not appear to be a very critical one.
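A sketch of this adaptive neighborhood construction is given below (Python; the names are illustrative, and, for brevity, it uses Dijkstra's algorithm on the 4-connected pixel grid in place of the more accurate fast marching solver for (4.13)–(4.14), with the same early stopping after M points):

```python
import heapq
import numpy as np

def geodesic_neighbors(grad_mag, x, M=12, gamma=1e-2):
    """Return the M pixels closest to x in the geodesic distance d_M induced
    by phi = |grad g|^2 + gamma (4.11). `grad_mag` holds |grad g| per pixel.
    Edge costs average phi over the two endpoints; the front is grown in
    order of ascending distance and stopped after M points."""
    phi = grad_mag ** 2 + gamma
    H, W = phi.shape
    dist, heap, visited = {x: 0.0}, [(0.0, x)], []
    while heap and len(visited) < M + 1:              # +1: first pop is x itself
        d, (r, c) = heapq.heappop(heap)
        if d > dist.get((r, c), np.inf):
            continue                                   # stale heap entry
        visited.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            y = (r + dr, c + dc)
            if 0 <= y[0] < H and 0 <= y[1] < W:
                nd = d + 0.5 * (phi[r, c] + phi[y])
                if nd < dist.get(y, np.inf):
                    dist[y] = nd
                    heapq.heappush(heap, (nd, y))
    return visited[1:]                                 # the M nearest y_1..y_M
```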

In the experiments we used the $L^1$-nonlocal TV$^2$ model

$$\min_{\substack{u\colon\Omega\to\mathbb{R},\ G_u\colon\Omega\to\mathbb{R}^N,\\ H_u\colon\Omega\to\mathrm{Sym}(\mathbb{R}^{N\times N})}}\ \sum_{x\in\Omega} |u(x) - g(x)|^p \;+\; \alpha \sum_{x\in\Omega} \omega(x)\,|H_u(x)| \tag{4.15}$$

$$\text{s.t.}\quad A\begin{pmatrix} G_u \\ H_u \end{pmatrix} = Bu, \tag{4.16}$$

where $\alpha > 0$ is the regularization strength and $p \in \{1, 2\}$. The linear constraints implement the optimality conditions for $G_u$ and $H_u$ from (4.4), similar to (4.7)–(4.8):

$$\sum_{y\in\Omega,\ z=y-x}\Big(u(y) - u(x) - G_u(x)^{\top}z - \tfrac{1}{2}z^{\top}H_u(x)z\Big)\,z\,\sigma_x(z) = 0, \qquad x \in \Omega, \tag{4.17}$$

$$\sum_{y\in\Omega,\ z=y-x}\Big(u(y) - u(x) - G_u(x)^{\top}z - \tfrac{1}{2}z^{\top}H_u(x)z\Big)\,zz^{\top}\,\sigma_x(z) = 0, \qquad x \in \Omega, \tag{4.18}$$


[Figure 3 panels: input, TV, TV2, TGV.]

Figure 3. Classical local regularization. The input consists of a disc-shaped slope with additive Gaussian noise, $\sigma = 0.25$. Shown is the result of denoising the input with an $L^1$ data term. Total variation (TV, $\alpha = 1.1$) regularization generates the well-known staircasing effect. Regularization of the Hessian (TV$^2$, $\alpha = 0.8$) avoids this problem, at the cost of oversmoothing jumps. Total generalized variation (TGV, $\alpha_0 = 1.5$, $\alpha_1 = 1.1$) performs best but still clips the slope at the top.

with a suitable parametrization of $H_u$ to enforce symmetry. Note that unlike the radially symmetric case in Lemma 4.2, the conditions do not necessarily decouple; therefore the first-order component $G_u$ needs to be included in the optimization as well.

The local weight $\omega$ is set as $\omega(x) = M / |\{y \in \Omega : B_{y,x} \neq 0\}|$. While the approach does work with uniform weights $\omega = 1$, we found that in some cases it can erroneously leave single outlier points intact. We believe that this is caused by a subtle issue: by construction of $\sigma$, outlier points are usually close neighbors to fewer points. Therefore they appear in fewer of the regularization terms $|H_u(x)|$, which effectively decreases the regularization strength at outliers.

The local weight ω counteracts this imbalance by dividing by the total number of terms in which a particular value u(x) appears.
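For completeness, a sketch of this reweighting (Python/NumPy; the data layout is an assumption of the sketch, with the chosen neighborhoods represented as a dictionary from each pixel to its M geodesic neighbors, e.g. as produced by the neighborhood sketch above):

```python
import numpy as np

def local_weights(neighborhoods, shape, M=12):
    """omega(x) = M / #(regularization terms in which u(x) appears):
    u(y) enters the term |Hu(x)| whenever y is in the neighborhood of x
    (or y = x), so appearances are counted by reversing the neighbor map."""
    counts = np.zeros(shape, dtype=int)
    for x, nbrs in neighborhoods.items():
        counts[x] += 1                                 # u(x) is in its own term
        for y in nbrs:
            counts[y] += 1
    counts[counts == 0] = 1                            # guard: isolated pixels
    return M / counts
```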

4.3. Numerical results. All experiments were performed on an Intel Xeon E5-2630 at 2.3 GHz with 64 GB of RAM, using MATLAB R2014a running on Scientific Linux 6.3, GCC 4.4.6, and Mosek 7. Run times ranged from several seconds for the geometric examples to several minutes for the full-size images, the majority of which was spent in the solution stage. The computation of the geodesic distances $d_{\mathcal{M}}$ using the fast marching method took only a few milliseconds in all cases, and the total preprocessing time, including building the sparse matrix structures $A$ and $B$, was less than 5 seconds for the full-size images.

The solution of the eikonal equation and the computation of the weights as well as of the system matrix use a custom C++ implementation. For solving the minimization problems we used the commercial interior-point based Mosek solver through the CVX interface. This allows us to efficiently compute solutions with very high accuracy and therefore to evaluate the model without the risk of accidentally comparing only approximate solutions. The stopping criterion was set to guarantee that the normalized difference between the objective value $\hat f$ of the numerical solution and the objective value $f^*$ of the true minimizer satisfies $(\hat f - f^*)/f^* \le 1.5 \cdot 10^{-8}$. Alternatively, nonsmooth first-order methods could be used; however, in our experience they become prohibitively slow for a precision beyond $10^{-4}$ due to their sublinear convergence rate.

Figure 3 illustrates the effect of several classical local regularizers, including TV, TV$^2$, and TGV. As expected, TV generates the well-known staircasing effect, while TV$^2$ leads to oversmoothing of the jumps. TGV with hand-tuned parameters performs reasonably well; however, it exhibits a characteristic pattern of clipping sharp local extrema. This behavior has also been analytically confirmed in [PB13, PS13] for the one-dimensional case.


[Figure 4 panels: input, TGV, nonlocal Hessian (proposed).]

Figure 4. Nonlocal regularization of the problem in Figure 3. The adaptive choice of the neighborhood and weights together with the nonlocal Hessian preserves the jumps as well as the top of the slope, and allows one to perfectly reconstruct the piecewise affine signal.

Figure 4 shows the effect of our nonlocal Hessian-based method for the same problem.

The piecewise affine signal is almost perfectly recovered. This is mostly due to the fact that the jumps are relatively large, which means that after computing the neighborhoods the circular region and the background are not coupled in terms of regularization. Therefore the regularization weight α can be chosen very large, which results in virtually affine regions.

To see what happens with smaller jumps, we generated a pattern of opposing slopes (Figure 5). As expected, both TGV and the nonlocal Hessian approach struggle when the jump is small. This shows the limitations of our approach for choosing the weights: while it adds some structural information to the regularizer, this information is still restricted to a certain neighborhood of each point and does not take the full global structure into account.

Figures 6–8 show a quantitative comparison of our nonlocal method with the results of the TGV approach and several classical regularizers. The parameters for each method were chosen by a grid search to yield the best PSNR. For all images we also provide a more realistic “structural similarity index” (SSIM) [WBSS04]. The nonlocal approach improves on TGV with respect to PSNR (34.09 vs. 33.28). However, it is interesting to look at the spatial distribution of the error: the parts where the nonlocal approach improves are exactly the local maxima, which are smoothed over by TGV. Surprisingly, this is hardly visible when looking at the images only (Figure 6), which is in line with the good results obtained with TGV for natural images. However, this property might become a problem when the data is not a natural image, for example in the case of depth images or digital elevation maps. We refer the reader to [LBL13] for a discussion of a problem that is more demanding in terms of the correct choice of the regularization.

Finally, Figure 9 shows an application to the “cameraman” image with L2 data term.

Small details on the camera as well as long, thin structures such as the camera handle and the highlights on the tripod are well preserved. In larger, more homogeneous areas the result is what would be expected from second-order smoothness.

In its basic form, our approach is data-driven; i.e., the weights are computed directly


[Figure 5 panels: input, TGV low, TGV high, nonlocal Hessian.]

Figure 5. Denoising results for the “opposing slopes” image. The small jump at the crossing causes a slight amount of smoothing for both the TGV and the nonlocal approaches. TGV with low regularization strength (TGV low, $\alpha_0 = \alpha_1 = 0.8$) does reasonably well at preserving the strong jump on the right but does not
