Calcolo, vol. 56, no. 4, Article 42, 26 pages, December, 2019

### Signal reconstruction by conjugate gradient algorithm based on smoothing l_{1}-norm

Caiying Wu^{1}, College of Mathematics Science, Inner Mongolia University, Hohhot 010021, China

Jiaming Zhan^{2}, College of Mathematics Science, Inner Mongolia University, Hohhot 010021, China

Yue Lu^{3}, School of Mathematical Sciences, Tianjin Normal University, Tianjin 300387, China

Jein-Shan Chen^{4}, Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

January 12, 2019 (1st revision on May 21, 2019; 2nd revision on July 29, 2019; 3rd revision on August 18, 2019)

^{1}E-mail: wucaiyingyun@163.com. The research is supported by the Natural Science Foundation of Inner Mongolia Autonomous Region (2018MS01016).

^{2}E-mail: zhanjiaming1220@qq.com. The research is supported by the Natural Science Foundation of Inner Mongolia Autonomous Region (2018MS01016).

^{3}E-mail: jinjin403@sina.com. The research is supported by the National Natural Science Foundation of China (Grant Number: 11601389), the Doctoral Foundation of Tianjin Normal University (Grant Number: 52XB1513), the 2017-Outstanding Young Innovation Team Cultivation Program of Tianjin Normal University (Grant Number: 135202TD1703), and the Program for Innovative Research Team in Universities of Tianjin (Grant Number: TD13-5078).

^{4}Corresponding author. E-mail: jschen@math.ntnu.edu.tw. The research is supported by the Ministry of Science and Technology, Taiwan.

Abstract. The l_{1}-norm regularized minimization problem is non-differentiable and has a wide range of applications in compressive sensing. Many approaches have been proposed in the literature; among them, smoothing the l_{1}-norm is one of the effective ones. This paper follows that path: we adopt six smoothing functions to approximate the l_{1}-norm, recast the signal recovery problem as a smoothing penalized least squares optimization problem, and apply a nonlinear conjugate gradient method to solve the smoothing model. The algorithm is shown to be globally convergent. In addition, the simulation results not only suggest some nice smoothing functions, but also show that the proposed algorithm is competitive in terms of relative error.

Keywords. l1-norm regularization, compressive sensing, conjugate gradient algorithm, smoothing function.

Mathematics Subject Classification (2000) 90C33

### 1 Introduction

In this paper, we focus on the topic of reconstructing signals, which is one of the most important applications in compressive sensing [8, 9, 16]. There exist numerous textbooks and articles related to this topic, and there is no need to repeat its importance and applications here. Therefore, we go directly to its mathematical model and convey our new idea for tackling this problem. Mathematically, the noise-free signal reconstruction problem is modeled as follows:

min ‖x‖_{0}
s.t. b = Ax,   (1)

where x ∈ IR^{n} is the original sparse signal that needs to be recovered, A ∈ IR^{m×n} (m ≪ n) is the measurement matrix, b ∈ IR^{m} is the observation vector, and ‖x‖_{0} represents the l_{0}-norm of x, which is defined as the number of nonzero components of x. Without loss of generality, we assume that there exists a positive constant K ≤ n such that ‖x‖_{0} = K.

Besides this condition, if the measurement matrix A satisfies the Restricted Isometry Property (RIP) of order K, then we can recover the signal x more accurately through the model (1); see [6, 7, 8, 9, 10] for more details.

Unfortunately, the l_{0}-norm minimization problem (1) is an NP-hard combinatorial optimization problem. To avoid this difficulty, researchers replace the l_{0}-norm by the l_{1}-norm in model (1), which yields the following l_{1} minimization problem:

min ‖x‖_{1}
s.t. b = Ax,   (2)

where ‖x‖_{1} denotes the l_{1}-norm of x, i.e., ‖x‖_{1} = ∑_{i=1}^{n} |x_{i}|. Under the RIP condition, the model (2) has the same solution as (1). In practice, the probed signal b is usually contaminated by noise, which motivates the noisy signal reconstruction problem:

min ‖x‖_{1}
s.t. b = Ax + e,   (3)

where e ∈ IR^{m} denotes the noise. To deal with (3), researchers prefer to consider the associated penalized least squares optimization problem

min λ‖x‖_{1} + (1/2)‖b − Ax‖^{2}_{2},   (4)

where λ > 0 is the penalty factor. In the sequel, we call (4) the l_{1}-norm regularized problem.

Until now, there have been plenty of numerical algorithms for solving the model (4), and some first-order algorithms have drawn much attention during the past decades. Due to the huge amount of literature, we only outline some approaches below. The gradient projection algorithm [18] is one of the earliest gradient-based algorithms for solving (4); there, the problem was reformulated as a box-constrained quadratic program and solved by gradient projection. More recently, the most extensively investigated first-order algorithm for the solution of (4) has been the iterative shrinkage/thresholding (IST) algorithm [15, 21], whose results triggered many contributions based on this method [17, 24, 25]. In light of the IST algorithm, many variants have been proposed under different optimization reformulations and techniques. For instance, Hale, Yin and Zhang [19, 20]

presented an IST fixed point continuation (FPC) method by an operator splitting skill.

Wright, Nowak and Figueiredo [29] studied SpaRSA for sparse reconstruction via solving nonsmooth convex problems. Experimental results show that the accelerated IST methods (the two-step IST [4] and the fast IST [1]) have better convergence properties. In addition, the famous NESTA [2] first proposed a smoothing function of the l_{1}-norm and then used Nesterov's gradient method to obtain the solution of (3). Besides these methods, the
gradient projection technique and alternating direction ideas are also considered, see
[3, 5, 30] for more details.

Because of their simplicity and low storage requirements, conjugate gradient algorithms are suitable for large-scale problems. In this paper, like [28, 31, 33], we are interested
in applying the conjugate gradient method to signal recovery. We use six smoothing functions, studied in [23, 27] for the absolute value function, to approximate the l_{1}-norm. Accordingly, we reformulate the signal recovery problem as a smoothing penalized least squares optimization problem and apply the nonlinear conjugate gradient method [32] to solve the smoothing model. The proposed algorithm is shown to be globally convergent.

Moreover, we report some numerical experiments to demonstrate the effectiveness of our method. Numerical comparisons with "NESTA", "FPCBB", and "FISTA", which are well-known open-source solvers, are presented as well.

### 2 Preliminary

This section recalls some ideas about the smoothing technique, which can be found in [11, 12, 23] and the references therein.

It is well known that the absolute value function |t| (t ∈ IR) is not smooth at zero. To overcome this difficulty, we introduce some smoothing functions of |t| that are used in the sequel.

Definition 2.1. The function ψ : IR_{++} × IR → IR is a smoothing function of |t| if the following two conditions hold:

(a) ψ is continuously differentiable at every (µ, t) ∈ IR_{++} × IR;

(b) lim_{µ↓0} ψ(µ, t) = |t| for any t ∈ IR.

How does one construct smoothing functions for |t|? First, we observe that |t| can be split into two parts:

|t| = (t)_{+} − (t)_{−} = (t)_{+} + (−t)_{+},

where (t)_{+} denotes the plus function, i.e., (t)_{+} = max{0, t}, and (t)_{−} = min{0, t}. As mentioned in [11, 12, 23], one can follow the procedure below to construct a smoothing function for (t)_{+}. More specifically, introducing a piecewise density (kernel) function d(t) satisfying

d(t) ≥ 0 and ∫_{−∞}^{+∞} d(t) dt = 1,

one can define

ŝ(t, µ) := (1/µ) d(t/µ),

where µ is a positive parameter. If the condition

∫_{−∞}^{+∞} |t| d(t) dt < +∞

holds, then the function p̂(t, µ) defined as

p̂(t, µ) = ∫_{−∞}^{+∞} (t − s)_{+} ŝ(s, µ) ds = ∫_{−∞}^{t} (t − s) ŝ(s, µ) ds ≈ (t)_{+}

is a smoothing approximation of (t)_{+}. There are well-known smoothing functions for the plus function [11, 22, 23], for example,

ψ̂_{1}(µ, t) = t + µ ln(1 + e^{−t/µ}),   (5)

ψ̂_{2}(µ, t) =
  t,  if t ≥ µ/2,
  (1/(2µ))(t + µ/2)^{2},  if −µ/2 < t < µ/2,
  0,  if t ≤ −µ/2,   (6)

ψ̂_{3}(µ, t) = (√(4µ^{2} + t^{2}) + t)/2,   (7)

ψ̂_{4}(µ, t) =
  t − µ/2,  if t > µ,
  t^{2}/(2µ),  if 0 ≤ t ≤ µ,
  0,  if t < 0,   (8)

where the corresponding kernel functions are, respectively, given by

d_{1}(t) = e^{−t}/(1 + e^{−t})^{2},

d_{2}(t) = 1 if −1/2 ≤ t ≤ 1/2, and 0 otherwise,

d_{3}(t) = 2/(t^{2} + 4)^{3/2},

d_{4}(t) = 1 if 0 ≤ t ≤ 1, and 0 otherwise.

Likewise, we can obtain a smoothing function of |t| via convolution as follows:

p̂(|t|, µ) = p̂(t, µ) + p̂(−t, µ) = ∫_{−∞}^{+∞} |t − s| ŝ(s, µ) ds.
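As a quick sanity check, the convolution above can be evaluated numerically. The sketch below (our own code; grid, helper names, and tolerances are illustrative) convolves |t − s| with the scaled uniform kernel built from d_{2} and compares the result against the closed form ψ_{2} given in (10) below; the two agree to discretization accuracy.

```python
import numpy as np

def trapezoid(y, x):
    # simple trapezoidal rule, to stay independent of the numpy version
    return float(np.dot((y[1:] + y[:-1]) / 2, np.diff(x)))

mu = 0.5
s = np.linspace(-mu / 2, mu / 2, 20001)   # support of the scaled kernel
s_hat = np.full_like(s, 1.0 / mu)         # (1/mu) * d2(s/mu) on its support

def psi2(mu, t):
    # closed form produced by the uniform kernel, cf. (10)
    return abs(t) if abs(t) >= mu / 2 else t**2 / mu + mu / 4

for t in (-1.0, -0.1, 0.0, 0.2, 0.7):
    conv = trapezoid(np.abs(t - s) * s_hat, s)   # \int |t-s| s_hat(s,mu) ds
    print(t, conv, psi2(mu, t))                  # columns 2 and 3 agree
```

The uniform kernel is the easiest case to verify by hand: for |t| ≤ µ/2 the integral evaluates to t²/µ + µ/4, and outside the kernel's support it reduces to |t| exactly.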

Analogous to (5)-(8), we construct four smoothing functions for |t| as in [23]:

ψ_{1}(µ, t) = µ[ln(1 + e^{−t/µ}) + ln(1 + e^{t/µ})],   (9)

ψ_{2}(µ, t) =
  t,  if t ≥ µ/2,
  t^{2}/µ + µ/4,  if −µ/2 < t < µ/2,
  −t,  if t ≤ −µ/2,   (10)

ψ_{3}(µ, t) = √(4µ^{2} + t^{2}),   (11)

ψ_{4}(µ, t) =
  t^{2}/(2µ),  if |t| ≤ µ,
  |t| − µ/2,  if |t| > µ.   (12)

In particular, if we take the Epanechnikov kernel function

d(t) = (3/4)(1 − t^{2}) if |t| ≤ 1, and 0 otherwise,

then the associated smoothing function for |t| is expressed by

ψ_{5}(µ, t) =
  t,  if t > µ,
  −t^{4}/(8µ^{3}) + 3t^{2}/(4µ) + 3µ/8,  if −µ ≤ t ≤ µ,
  −t,  if t < −µ.   (13)

Moreover, taking the Gaussian kernel function d(t) = (1/√(2π)) e^{−t^{2}/2} for all t ∈ IR yields

ŝ(t, µ) := (1/µ) d(t/µ) = (1/√(2πµ^{2})) e^{−t^{2}/(2µ^{2})},

which leads to another type of smoothing function [27] for |t|:

ψ_{6}(µ, t) = t erf(t/(√2 µ)) + √(2/π) µ e^{−t^{2}/(2µ^{2})},   (14)

where the error function is defined as follows:

erf(t) = (2/√π) ∫_{0}^{t} e^{−u^{2}} du, ∀t ∈ IR.

In summary, we have obtained six smoothing functions and will employ them to approximate the l_{1}-norm for the signal recovery problem.
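For reference, the six smoothing functions can be written down directly. The sketch below is our own code (not from the paper); ψ_{1} is implemented via `logaddexp` for numerical stability, and the loop checks that each function converges uniformly to |t| as µ ↓ 0, in line with Definition 2.1(b).

```python
import numpy as np
from math import erf

# The six smoothing functions (9)-(14), vectorized over t; mu > 0.
def psi1(mu, t):
    # mu*[ln(1+e^{-t/mu}) + ln(1+e^{t/mu})], stabilized via logaddexp
    return mu * (np.logaddexp(0.0, -t / mu) + np.logaddexp(0.0, t / mu))

def psi2(mu, t):
    return np.where(np.abs(t) >= mu / 2, np.abs(t), t**2 / mu + mu / 4)

def psi3(mu, t):
    return np.sqrt(4 * mu**2 + t**2)

def psi4(mu, t):
    return np.where(np.abs(t) <= mu, t**2 / (2 * mu), np.abs(t) - mu / 2)

def psi5(mu, t):
    inner = -t**4 / (8 * mu**3) + 3 * t**2 / (4 * mu) + 3 * mu / 8
    return np.where(np.abs(t) <= mu, inner, np.abs(t))

def psi6(mu, t):
    return t * np.vectorize(erf)(t / (np.sqrt(2) * mu)) \
        + np.sqrt(2 / np.pi) * mu * np.exp(-t**2 / (2 * mu**2))

t = np.linspace(-2, 2, 401)
for psi in (psi1, psi2, psi3, psi4, psi5, psi6):
    # each psi(mu, .) -> |.| uniformly as mu -> 0
    print(psi.__name__, np.max(np.abs(psi(1e-4, t) - np.abs(t))))
```

The maximum gap occurs at t = 0 in every case (e.g., ψ_{3}(µ, 0) = 2µ), so the printed errors shrink linearly with µ.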

### 3 Algorithm and Convergence Properties

This section is devoted to the detailed description and implementation of our algorithmic idea and its global convergence results. Basically, it is a type of conjugate gradient algorithm. To proceed, we briefly review the conjugate gradient method. Suppose that f : IR^{n} → IR is a continuously differentiable function and the target problem is

min_{x∈IR^{n}} f(x).

Let d^{k} denote the search direction and α_{k} a step length. Then, the general framework of the conjugate gradient method is as follows:

x^{k+1} = x^{k} + α_{k}d^{k},
d^{k} = −∇f(x^{k}) if k = 0;  d^{k} = −∇f(x^{k}) + β_{k}d^{k−1} if k > 0,   (15)

where ∇f is the gradient of f and β_{k} is a parameter. In general, different choices of the parameter β_{k} correspond to different conjugate gradient algorithms. Among them, the three-term PRP (Polak-Ribiere-Polyak) conjugate gradient method is one of the most efficient, where

d^{k} = −∇f(x^{k}) if k = 0;  d^{k} = −∇f(x^{k}) + β_{k}d^{k−1} − θ_{k}y^{k−1} if k > 0,   (16)

y^{k−1} = ∇f(x^{k}) − ∇f(x^{k−1}),

β_{k} = (∇f(x^{k}))^{T}y^{k−1} / ‖∇f(x^{k−1})‖^{2},

θ_{k} = (∇f(x^{k}))^{T}d^{k−1} / ‖∇f(x^{k−1})‖^{2}.

The key feature of this conjugate gradient method is that it ensures the sufficient descent property of the direction at each iteration, which plays a prominent role in the global convergence analysis. For more details, please refer to [32] and the references therein.
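A minimal sketch of the three-term PRP update (16) makes the sufficient descent property concrete. The code is our own (variable names are ours); the identity (∇f(x^{k}))^{T}d^{k} = −‖∇f(x^{k})‖^{2} follows by construction, since the β and θ terms cancel exactly in the inner product.

```python
import numpy as np

def prp_direction(g, g_prev=None, d_prev=None):
    """Three-term PRP direction; g, g_prev are gradients at x^k, x^{k-1}."""
    if g_prev is None:                 # k = 0: steepest descent
        return -g
    y = g - g_prev                     # y^{k-1}
    denom = np.dot(g_prev, g_prev)     # ||grad f(x^{k-1})||^2
    beta = np.dot(g, y) / denom
    theta = np.dot(g, d_prev) / denom
    return -g + beta * d_prev - theta * y

# Sufficient descent: g^T d = -||g||^2, regardless of the previous data.
g_prev = np.array([1.0, -2.0, 0.5])
d_prev = -g_prev
g = np.array([0.8, -1.5, 0.9])
d = prp_direction(g, g_prev, d_prev)
print(np.dot(g, d), -np.dot(g, g))     # equal up to rounding
```

This exact cancellation is what removes the usual descent-condition assumption from the line search.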

For convenience, we denote

f_{µ}(x) := λφ_{i}(µ, x) + (1/2)‖b − Ax‖^{2}_{2}, x ∈ IR^{n},   (17)

where φ_{i}(µ, x) is a smoothing function of the l_{1}-norm ‖x‖_{1}. In other words, it takes the form

φ_{i}(µ, x) := ∑_{j=1}^{n} ψ_{i}(µ, x_{j}), i = 1, 2, 3, 4, 5, 6.   (18)

Thus, the problem (4) is recast as the smoothing penalized least squares optimization problem

min_{x∈IR^{n}} f_{µ}(x),   (19)

which is our new target problem. To deal with the unconstrained minimization (19), the following iterative scheme is employed, see [32].

The Algorithm

Step 0. Given a starting point x^{0} ∈ IR^{n} and µ_{0} > 0, choose parameters ρ, δ, γ, γ̄ ∈ (0, 1). Let d^{0} = −∇f_{µ_{0}}(x^{0}). Set k = 0.

Step 1. If the termination criterion is reached, stop.

Step 2. Determine α_{k} = max{ρ^{j}, j = 0, 1, 2, . . .} satisfying

f_{µ_{k}}(x^{k} + α_{k}d^{k}) ≤ f_{µ_{k}}(x^{k}) − 2δ(1 − γ)α_{k}f_{µ_{k}}(x^{k}).

Step 3. Let x^{k+1} = x^{k} + α_{k}d^{k} and µ_{k+1} = γ̄µ_{k}. Replace f with f_{µ_{k+1}} in formula (16) and compute d^{k+1} by (16).

Step 4. Set k = k + 1 and return to Step 1.
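The steps above can be sketched as follows, here with ψ_{2} as the smoothing function and the Experiment 1 parameter values. This is our own illustrative implementation, not the authors' reference code: the stopping rule, the floor on µ, and the problem sizes are our assumptions.

```python
import numpy as np

def psi2(mu, t):
    return np.where(np.abs(t) >= mu / 2, np.abs(t), t**2 / mu + mu / 4)

def dpsi2(mu, t):
    return np.where(t >= mu / 2, 1.0, np.where(t <= -mu / 2, -1.0, 2 * t / mu))

def f_grad(x, mu, A, b, lam):
    r = A @ x - b
    return lam * np.sum(psi2(mu, x)) + 0.5 * r @ r, lam * dpsi2(mu, x) + A.T @ r

def smoothing_cg(A, b, lam, mu0=0.1, gam_bar=0.4, rho=0.5,
                 delta=0.002, gamma=0.2, max_iter=200):
    x = np.zeros(A.shape[1]); mu = mu0
    f, g = f_grad(x, mu, A, b, lam)
    d = -g                                     # Step 0
    for _ in range(max_iter):
        a = 1.0                                # Step 2: backtracking a = rho^j
        while f_grad(x + a * d, mu, A, b, lam)[0] > f - 2 * delta * (1 - gamma) * a * f:
            a *= rho
            if a < 1e-12:
                break
        x = x + a * d                          # Step 3
        mu = max(gam_bar * mu, 1e-8)           # shrink mu (floored, our choice)
        f_new, g_new = f_grad(x, mu, A, b, lam)
        y = g_new - g
        denom = g @ g
        d = -g_new + (g_new @ y / denom) * d - (g_new @ d / denom) * y   # (16)
        f, g = f_new, g_new
        if np.linalg.norm(g) < 1e-6:           # Step 1 (our criterion)
            break
    return x

rng = np.random.default_rng(0)
n, m, K = 256, 128, 8
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n); idx = rng.choice(n, K, replace=False)
x_true[idx] = rng.standard_normal(K)
b = A @ x_true
x_rec = smoothing_cg(A, b, lam=0.001 * np.max(np.abs(A.T @ b)))
rel_err = np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true)
print(rel_err)   # relative reconstruction error
```

Note that since ψ_{2}(µ, t) is nondecreasing in µ, shrinking µ in Step 3 can only decrease f_{µ}, so the accepted steps keep the objective monotone across the continuation.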

We also say a few words about how µ_{k} is updated in Step 3. For Experiment 1, µ_{k} is updated at each iteration by µ_{k+1} = γ̄µ_{k} with γ̄ ∈ (0, 1), and we select γ̄ = 0.4. For Experiments 2 and 3, an inner loop is used to solve the relaxed problem for a fixed µ_{k}, after which µ_{k} is reduced. This is a technique that appeared in NESTA. When µ_{k} is small enough, our algorithm no longer updates the parameter µ_{k}.

Lemma 3.1. Suppose that the function f_{µ}(x) is defined by (17). Then, there exists a constant L > 0 such that ‖∇f_{µ}(x) − ∇f_{µ}(y)‖ ≤ L‖x − y‖ for all x, y ∈ IR^{n}.

Proof. For any fixed µ > 0, in order to prove the Lipschitz property of ∇f_{µ}(x), we need to verify the Lipschitz condition of ψ'_{i} (i = 1, 2, 3, 4, 5, 6). To this end, we discuss two cases.

Case (i): i = 1, 3, 5, 6. For any t_{1}, t_{2} ∈ IR, without loss of generality let t_{1} < t_{2}. By the Lagrange Mean Value Theorem, we have

|ψ'_{i}(µ, t_{1}) − ψ'_{i}(µ, t_{2})| = |ψ''_{i}(µ, ξ)| |t_{1} − t_{2}|, ξ ∈ (t_{1}, t_{2}).

For the subsequent analysis, we need to estimate |ψ''_{i}(µ, ξ)| for each i = 1, 3, 5, 6.

For i = 1, we know that

|ψ''_{1}(µ, ξ)| = (1/µ)[e^{ξ/µ}/(1 + e^{ξ/µ})^{2} + e^{−ξ/µ}/(1 + e^{−ξ/µ})^{2}] < 2/µ.

For i = 3, it is clear that

|ψ''_{3}(µ, ξ)| = 4µ^{2}/(4µ^{2} + ξ^{2})^{3/2} ≤ 1/(2µ).
For i = 5, we have

ψ'_{5}(µ, t) =
  1,  if t > µ,
  −t^{3}/(2µ^{3}) + 3t/(2µ),  if −µ ≤ t ≤ µ,
  −1,  if t < −µ,

ψ''_{5}(µ, t) =
  0,  if t > µ,
  −3t^{2}/(2µ^{3}) + 3/(2µ),  if −µ ≤ t ≤ µ,
  0,  if t < −µ,

which yields

|ψ''_{5}(µ, ξ)| ≤ 3µ^{2}/(2µ^{3}) + 3/(2µ) = 3/µ.

For i = 6, we compute

ψ'_{6}(µ, t) = (2/√π) ∫_{0}^{t/(√2µ)} e^{−u^{2}} du,  ψ''_{6}(µ, t) = (√2/(√π µ)) e^{−t^{2}/(2µ^{2})},

which imply

|ψ''_{6}(µ, ξ)| ≤ √2/(√π µ).

All the aforementioned results indicate that

|ψ'_{i}(µ, t_{1}) − ψ'_{i}(µ, t_{2})| ≤ (3/µ)|t_{1} − t_{2}|, i = 1, 3, 5, 6,   (20)

for any t_{1}, t_{2} ∈ IR.

Case (ii): i = 2, 4. We now provide a version of (20) for ψ'_{2} and ψ'_{4}. For i = 2, we know that

ψ'_{2}(µ, t) =
  1,  if t ≥ µ/2,
  2t/µ,  if −µ/2 < t < µ/2,
  −1,  if t ≤ −µ/2.

If t_{1} ≥ µ/2 and t_{2} ≥ µ/2; or t_{1} ≤ −µ/2 and t_{2} ≤ −µ/2; or t_{1}, t_{2} ∈ (−µ/2, µ/2), then

|ψ'_{2}(µ, t_{1}) − ψ'_{2}(µ, t_{2})| ≤ (2/µ)|t_{1} − t_{2}|.

If t_{1} ≥ µ/2 and t_{2} ≤ −µ/2, then

|ψ'_{2}(µ, t_{1}) − ψ'_{2}(µ, t_{2})| = 2 = (2/µ)·µ ≤ (2/µ)|t_{1} − t_{2}|.

If t_{1} ≥ µ/2 and t_{2} ∈ (−µ/2, µ/2), then

|ψ'_{2}(µ, t_{1}) − ψ'_{2}(µ, t_{2})| = 1 − 2t_{2}/µ ≤ 2t_{1}/µ − 2t_{2}/µ = (2/µ)|t_{1} − t_{2}|.

If t_{1} ≤ −µ/2 and t_{2} ∈ (−µ/2, µ/2), then

|ψ'_{2}(µ, t_{1}) − ψ'_{2}(µ, t_{2})| = 1 + 2t_{2}/µ ≤ −2t_{1}/µ + 2t_{2}/µ = (2/µ)|t_{1} − t_{2}|.

Thus, it is clear to conclude that

|ψ'_{2}(µ, t_{1}) − ψ'_{2}(µ, t_{2})| ≤ (2/µ)|t_{1} − t_{2}|, ∀ t_{1}, t_{2} ∈ IR.

For i = 4, by similar arguments to the case i = 2, it can be verified that

|ψ'_{4}(µ, t_{1}) − ψ'_{4}(µ, t_{2})| ≤ (1/µ)|t_{1} − t_{2}|, ∀ t_{1}, t_{2} ∈ IR.

In summary, we also achieve

|ψ'_{i}(µ, t_{1}) − ψ'_{i}(µ, t_{2})| ≤ (2/µ)|t_{1} − t_{2}|, i = 2, 4,   (21)

for any t_{1}, t_{2} ∈ IR.

Now, applying (20) and (21), for any x, y ∈ IR^{n}, we have

‖∇f_{µ}(x) − ∇f_{µ}(y)‖ = ‖λ[∇φ_{i}(µ, x) − ∇φ_{i}(µ, y)] + A^{T}(Ax − b) − A^{T}(Ay − b)‖
  ≤ (3/µ)λn‖x − y‖ + ‖A‖^{2}‖x − y‖
  = L‖x − y‖,

where L = (3/µ)λn + ‖A‖^{2}. Thus, the proof is complete. □
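The Lipschitz bound of Lemma 3.1 is easy to spot-check numerically. The sketch below is our own construction, using ψ_{3} (whose derivative has the closed form t/√(4µ² + t²)) and the constant L from the proof; it verifies the bound on random pairs of points.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, mu, lam = 20, 50, 0.1, 0.05
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def grad_f(x):
    dphi = x / np.sqrt(4 * mu**2 + x**2)   # psi_3'(mu, x_j), componentwise
    return lam * dphi + A.T @ (A @ x - b)

# L = (3/mu)*lambda*n + ||A||^2, as in the proof of Lemma 3.1
L = (3 / mu) * lam * n + np.linalg.norm(A, 2) ** 2
for _ in range(100):
    x, y = rng.standard_normal(n), rng.standard_normal(n)
    lhs = np.linalg.norm(grad_f(x) - grad_f(y))
    assert lhs <= L * np.linalg.norm(x - y)
print("Lipschitz bound held on 100 random pairs")
```

In practice the bound is loose: for ψ_{3} the componentwise modulus 1/(2µ) already suffices, without the factor n.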

Lemma 3.2. For µ > 0, the level set L(x^{0}) = {x ∈ IR^{n} | f_{µ}(x) ≤ f_{µ}(x^{0})} is bounded.

Proof. We prove it by contradiction. Suppose that the set L(x^{0}) is unbounded. Then, there exists an index set K_{1} such that ‖x^{k}‖ → ∞ as k → ∞, k ∈ K_{1}. Recalling the definition of f_{µ}(x), we have f_{µ}(x^{k}) → ∞ as k → ∞, k ∈ K_{1}. This contradicts f_{µ}(x^{k}) ≤ f_{µ}(x^{0}). Thus, the level set L(x^{0}) is bounded. □

In fact, by the continuity of f_{µ}(x), the level set L(x^{0}) is compact. We point out that Lemma 3.1 and Lemma 3.2 play key roles in our theoretical part. They were imposed as assumptions in [32] and other literature such as [14]. Here we assert them by showing that the function (17), based on the proposed smoothing functions, satisfies these assumptions. With these properties, we are ready to present the global convergence of our algorithm.

Theorem 3.1. For any µ > 0, consider the aforementioned algorithm with any starting point x^{0}, and let {x^{k}} be the sequence generated by the algorithm. Then

lim inf_{k→∞} ‖∇f_{µ}(x^{k})‖ = 0.   (22)

Proof. Suppose that the conclusion (22) is not true. Then, there exists a constant ε_{0} > 0 such that

‖∇f_{µ}(x^{k})‖ ≥ ε_{0}, ∀k.   (23)

Since (∇f_{µ}(x^{k}))^{T}d^{k} = −‖∇f_{µ}(x^{k})‖^{2}, there exists α_{k} > 0 such that

f_{µ}(x^{k} + α_{k}d^{k}) ≤ f_{µ}(x^{k}) − 2δ(1 − γ)α_{k}f_{µ}(x^{k}).   (24)

This means {f_{µ}(x^{k})} is decreasing and bounded, which implies that x^{k} ∈ L(x^{0}) and {f_{µ}(x^{k})} is convergent. We denote lim_{k→∞} f_{µ}(x^{k}) = f^{∗}. From (24), it can be verified that lim_{k→∞} α_{k}f_{µ}(x^{k}) = 0. By the definition of f_{µ}(x), we know f^{∗} > 0, and therefore

0 < α_{k} = α_{k}f_{µ}(x^{k})/f_{µ}(x^{k}) ≤ α_{k}f_{µ}(x^{k})/f^{∗}.

To sum up, we obtain

lim_{k→∞} α_{k} = 0.   (25)

Now, applying Lemma 3.2, there is a constant r > 0 such that

‖∇f_{µ}(x^{k})‖ ≤ r, ∀k.   (26)

Next, we prove that {‖d^{k}‖} is bounded by a contradiction argument. If {‖d^{k}‖} is unbounded, then there exists an index set K_{2} such that ‖d^{k}‖ → ∞ as k → ∞, k ∈ K_{2}. Let θ_{k} be the angle between −∇f_{µ}(x^{k}) and d^{k}. Then, we see that

cos θ_{k} = −(∇f_{µ}(x^{k}))^{T}d^{k} / (‖∇f_{µ}(x^{k})‖‖d^{k}‖) = ‖∇f_{µ}(x^{k})‖/‖d^{k}‖.   (27)

This relation enables us to apply ε_{0} ≤ ‖∇f_{µ}(x^{k})‖ ≤ r to conclude cos θ_{k} → 0 as k → ∞, k ∈ K_{2}, which means θ_{k} → π/2 as k → ∞, k ∈ K_{2}. Considering this geometric relationship, we have (∇f_{µ}(x^{k}))^{T}d^{k} → 0, k ∈ K_{2}, k → ∞, from which we find

−‖∇f_{µ}(x^{k})‖^{2} = (∇f_{µ}(x^{k}))^{T}d^{k} → 0, k ∈ K_{2}, k → ∞,

which contradicts (23). Thus, {‖d^{k}‖} is bounded, i.e., there exists a constant M^{∗} > ε_{0} such that

‖d^{k}‖ ≤ M^{∗}, ∀k.   (28)

Then, combining (25) with (28) gives

lim_{k→∞} α_{k}‖d^{k}‖ = 0.   (29)

In addition, from (27), we have cos θ_{k} ≥ ε_{0}/M^{∗}, which further yields

−(∇f_{µ}(x^{k}))^{T}d^{k} = ‖∇f_{µ}(x^{k})‖‖d^{k}‖ cos θ_{k} ≥ (ε^{2}_{0}/M^{∗})‖d^{k}‖.   (30)

Applying the mean value theorem, we obtain

f_{µ}(x^{k} + α_{k}d^{k}) = f_{µ}(x^{k}) + α_{k}(∇f_{µ}(ξ^{k}))^{T}d^{k}
  = f_{µ}(x^{k}) + α_{k}(∇f_{µ}(x^{k}))^{T}d^{k} + α_{k}(∇f_{µ}(ξ^{k}) − ∇f_{µ}(x^{k}))^{T}d^{k}
  ≤ f_{µ}(x^{k}) + α_{k}‖d^{k}‖[(∇f_{µ}(x^{k}))^{T}d^{k}/‖d^{k}‖ + ‖∇f_{µ}(ξ^{k}) − ∇f_{µ}(x^{k})‖],   (31)

where ξ^{k} lies between x^{k} and x^{k} + α_{k}d^{k}. Lemma 3.1 and the compactness of the level set L(x^{0}) imply that ∇f_{µ}(x) is uniformly continuous on L(x^{0}). This, together with (29), implies that there exists a constant α̂ > 0 such that, for sufficiently large k, α_{k}‖d^{k}‖ < α̂ and

‖∇f_{µ}(ξ^{k}) − ∇f_{µ}(x^{k})‖ < (1/2)(ε^{2}_{0}/M^{∗}).   (32)

Then, when k is sufficiently large, from (30), (31) and (32), we have

f_{µ}(x^{k} + α_{k}d^{k}) ≤ f_{µ}(x^{k}) + α̂[−ε^{2}_{0}/M^{∗} + (1/2)(ε^{2}_{0}/M^{∗})] = f_{µ}(x^{k}) − α̂ε^{2}_{0}/(2M^{∗}),   (33)

which contradicts f_{µ}(x^{k+1}) − f_{µ}(x^{k}) → 0 as k → ∞. Therefore, there must hold

lim inf_{k→∞} ‖∇f_{µ}(x^{k})‖ = 0. □

### 4 Numerical Experiments

In this section, we conduct numerical experiments to show the performance of our algorithm for signal recovery. All tests are run in MATLAB R2016 on a 64-bit PC with an Intel(R) Core(TM) i7-6500U 2.50 GHz CPU and 8.00 GB of RAM, running the Windows 7 operating system.

Experiment 1

In what follows, we divide our tests into two groups, the noise-free case and the noisy case, and report the numerical results of our algorithm for signal recovery. In all of our simulations, A ∈ IR^{m×n} is a Gaussian matrix with m = n/2 or n/4, and x^{∗} ∈ IR^{n} is a K-sparse original signal with K = n/40, whose nonzero components follow the normal distribution N(0, 1). Moreover, we set b = Ax + e in the noisy case, where e is Gaussian noise with zero mean and variance σ^{2}.

The parameters of our algorithm are summarized as follows:

x^{0} = zeros(n, 1), λ = 0.001‖A^{T}b‖_{∞}, ρ = 0.5, δ = 0.002, γ = 0.2, σ^{2} = 0.0001.

In Tables 1 and 2, we set µ_{0} = 0.1 and µ_{k+1} = 0.4µ_{k}, and the algorithm terminates when the relative error satisfies

‖x − x^{∗}‖/‖x^{∗}‖ < 4 × 10^{−3},

where x is the reconstructed signal. In Tables 3 and 4, F denotes the smoothing function, CPU the cpu time, Re the relative error, and It the number of iterations; we set µ = 1.0e−2, 1.0e−3, 1.0e−4, 1.0e−5, and the algorithm terminates when

|f_{µ}(x^{k+1}) − f_{µ}(x^{k})| / |f_{µ}(x^{k+1})| < 1.0e−12.

Note that the result from each test is obtained by running our algorithm 10 times and taking its average result.
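The test-problem setup just described can be sketched as follows (our own code; the function names, the seed, and the default ratios are illustrative):

```python
import numpy as np

def make_problem(n, m_ratio=2, sparsity_ratio=40, sigma2=0.0001, noisy=False,
                 rng=np.random.default_rng(0)):
    """Gaussian A in IR^{m x n}, K-sparse x* with N(0,1) nonzeros, b = A x* (+e)."""
    m, K = n // m_ratio, n // sparsity_ratio
    A = rng.standard_normal((m, n))
    x_star = np.zeros(n)
    support = rng.choice(n, K, replace=False)
    x_star[support] = rng.standard_normal(K)
    b = A @ x_star
    if noisy:
        b = b + np.sqrt(sigma2) * rng.standard_normal(m)   # zero-mean Gaussian noise
    return A, b, x_star

def rel_error(x, x_star):
    return np.linalg.norm(x - x_star) / np.linalg.norm(x_star)

A, b, x_star = make_problem(2000)
print(A.shape, np.count_nonzero(x_star))   # (1000, 2000) 50
```

With n = 2000 and m = n/2, the generator reproduces the smallest configuration of Tables 1-2 (m = 1000 measurements, K = 50 nonzeros).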

In our experiments, we found that ψ_{1} is very unstable; sometimes it is not applicable to the test problems. A similar concern was also addressed in [13, page 135]. Accordingly, we tried to adopt the equivalent expression suggested in [13, formula (3.1)] and rewrite ψ_{1}(t) as below:

ψ_{1}(t) = µ[ln(e^{λ_{1}(A(t))/µ} + e^{λ_{2}(A(t))/µ}) + ln(e^{λ_{1}(B(t))/µ} + e^{λ_{2}(B(t))/µ})]
  = λ_{1}(A(t)) + µ ln(1 + e^{(λ_{2}(A(t))−λ_{1}(A(t)))/µ}) + λ_{1}(B(t)) + µ ln(1 + e^{(λ_{2}(B(t))−λ_{1}(B(t)))/µ}),

where

A(t) = diag(0, −t), B(t) = diag(0, t),

λ_{1}(A(t)) = 0, λ_{2}(A(t)) = −t, λ_{1}(B(t)) = 0, λ_{2}(B(t)) = t.

Nonetheless, ψ_{1} still does not work well with the algorithm. We therefore omit it in our numerical reports.

| n | m | CPU ψ_{2} | CPU ψ_{3} | CPU ψ_{4} | CPU ψ_{5} | CPU ψ_{6} | It ψ_{2} | It ψ_{3} | It ψ_{4} | It ψ_{5} | It ψ_{6} |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2000 | n/2 | 1.21 | 0.94 | 1.32 | 1.27 | 0.92 | 72 | 74 | 75 | 73 | 69 |
| 2000 | n/4 | 1.37 | 0.90 | 1.45 | 1.59 | 1.00 | 129 | 135 | 142 | 148 | 147 |
| 4000 | n/2 | 4.16 | 3.83 | 5.01 | 4.50 | 3.80 | 70 | 70 | 78 | 72 | 68 |
| 4000 | n/4 | 4.68 | 4.12 | 5.19 | 5.58 | 4.50 | 131 | 143 | 143 | 148 | 146 |
| 6000 | n/2 | 9.54 | 8.49 | 9.80 | 9.22 | 8.61 | 70 | 70 | 70 | 69 | 68 |
| 6000 | n/4 | 9.69 | 10.35 | 10.71 | 12.91 | 10.54 | 134 | 163 | 145 | 159 | 164 |
| 8000 | n/2 | 16.93 | 16.25 | 18.76 | 18.91 | 15.55 | 67 | 71 | 74 | 78 | 69 |
| 8000 | n/4 | 15.76 | 15.65 | 18.65 | 19.49 | 15.75 | 126 | 137 | 145 | 144 | 138 |
| 10000 | n/2 | 25.88 | 24.06 | 30.07 | 25.83 | 24.21 | 67 | 67 | 76 | 68 | 68 |
| 10000 | n/4 | 27.94 | 28.28 | 29.96 | 34.48 | 27.60 | 139 | 138 | 151 | 160 | 149 |

Table 1: Comparisons of five smoothing functions for the noise-free case (left block: CPU time; right block: iterations)

| n | m | CPU ψ_{2} | CPU ψ_{3} | CPU ψ_{4} | CPU ψ_{5} | CPU ψ_{6} | It ψ_{2} | It ψ_{3} | It ψ_{4} | It ψ_{5} | It ψ_{6} |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2000 | n/2 | 1.27 | 1.00 | 1.30 | 1.33 | 1.00 | 73 | 78 | 75 | 75 | 72 |
| 2000 | n/4 | 1.54 | 0.92 | 1.64 | 1.55 | 0.91 | 145 | 151 | 154 | 150 | 144 |
| 4000 | n/2 | 4.48 | 4.47 | 4.89 | 4.62 | 4.36 | 72 | 80 | 75 | 72 | 76 |
| 4000 | n/4 | 4.94 | 5.16 | 5.77 | 5.09 | 4.65 | 137 | 159 | 154 | 143 | 158 |
| 6000 | n/2 | 8.81 | 9.07 | 9.80 | 9.91 | 8.97 | 65 | 72 | 72 | 67 | 69 |
| 6000 | n/4 | 11.65 | 11.61 | 12.10 | 12.38 | 10.12 | 160 | 167 | 152 | 165 | 154 |
| 8000 | n/2 | 16.27 | 17.05 | 18.28 | 18.53 | 16.69 | 69 | 75 | 76 | 74 | 71 |
| 8000 | n/4 | 15.76 | 18.04 | 19.27 | 17.99 | 16.91 | 132 | 155 | 145 | 145 | 143 |
| 10000 | n/2 | 25.59 | 25.58 | 26.04 | 25.82 | 25.38 | 68 | 71 | 67 | 67 | 68 |
| 10000 | n/4 | 25.82 | 27.23 | 37.00 | 29.52 | 27.73 | 136 | 150 | 175 | 150 | 155 |

Table 2: Comparisons of five smoothing functions for the noisy case (left block: CPU time; right block: iterations)

Figure 1: The convergence behavior: cpu time vs. signal size (n) when µ_{k+1} = 0.4µ_{k}

Figure 2: The convergence behavior: iterations vs. signal size (n) when µ_{k+1} = 0.4µ_{k}

Columns are grouped by µ = 1.0e−2, 1.0e−3, 1.0e−4, 1.0e−5; each group reports CPU, Re, It.

| F | m | CPU | Re | It | CPU | Re | It | CPU | Re | It | CPU | Re | It |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ψ_{2} | n/2 | 8.80 | 0.0158 | 92 | 9.34 | 0.0040 | 89 | 11.03 | 0.0033 | 100 | 16.78 | 0.0025 | 141 |
| ψ_{2} | n/4 | 10.98 | 0.0365 | 172 | 11.49 | 0.0064 | 178 | 11.83 | 0.0029 | 184 | 17.06 | 0.0028 | 237 |
| ψ_{3} | n/2 | 9.44 | 0.0567 | 104 | 8.14 | 0.0090 | 86 | 7.98 | 0.0032 | 86 | 10.93 | 0.0025 | 111 |
| ψ_{3} | n/4 | 7.44 | 0.1532 | 167 | 9.10 | 0.0195 | 179 | 8.36 | 0.0045 | 166 | 10.34 | 0.0030 | 188 |
| ψ_{4} | n/2 | 8.22 | 0.0538 | 71 | 7.92 | 0.0059 | 73 | 10.58 | 0.0027 | 93 | 12.01 | 0.0025 | 104 |
| ψ_{4} | n/4 | 9.13 | 0.0157 | 139 | 11.11 | 0.0154 | 155 | 10.69 | 0.0035 | 179 | 13.43 | 0.0032 | 194 |
| ψ_{5} | n/2 | 7.88 | 0.0419 | 68 | 7.59 | 0.0055 | 72 | 10.14 | 0.0025 | 91 | 12.77 | 0.0025 | 109 |
| ψ_{5} | n/4 | 10.26 | 0.1895 | 132 | 11.09 | 0.0117 | 156 | 10.51 | 0.0041 | 169 | 12.83 | 0.0034 | 193 |
| ψ_{6} | n/2 | 8.95 | 0.0360 | 96 | 7.94 | 0.0063 | 84 | 9.69 | 0.0027 | 99 | 12.57 | 0.0023 | 121 |
| ψ_{6} | n/4 | 9.17 | 0.0926 | 166 | 9.16 | 0.0128 | 173 | 8.98 | 0.0038 | 174 | 11.75 | 0.0030 | 203 |

Table 3: Comparisons of five smoothing functions for different µ when n = 5000, noise-free case

From the experiments, we have the following observations:

(1) The sparse original signal can be recovered effectively by our algorithm. Tables 1-4 and Figures 1-5 illustrate that the five smoothing functions (all except ψ_{1}) work quite well in our algorithm, whether the given problem is noise-free or noisy.

(2) The convergence behavior with respect to CPU time, iterations, signal size, and relative error is depicted in Figures 1-5, respectively. From Tables 1-2 and Figures 1-2, we see that our algorithm costs more cpu time as the dimension n increases, while the changes in iteration counts are very marginal. On the whole, the performance order can be summarized as

ψ_{2} ≥ ψ_{4} ≈ ψ_{5} ≈ ψ_{6} > ψ_{3},

where "≥" means "performs better than" and "≈" means "no difference in numerical performance".

Columns are grouped by µ = 1.0e−2, 1.0e−3, 1.0e−4, 1.0e−5; each group reports CPU, Re, It.

| F | m | CPU | Re | It | CPU | Re | It | CPU | Re | It | CPU | Re | It |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ψ_{2} | n/2 | 8.62 | 0.0159 | 89 | 8.77 | 0.0072 | 88 | 11.82 | 0.0026 | 100 | 15.80 | 0.0025 | 143 |
| ψ_{2} | n/4 | 11.67 | 0.0379 | 174 | 10.40 | 0.0063 | 180 | 10.26 | 0.0033 | 188 | 15.03 | 0.0033 | 242 |
| ψ_{3} | n/2 | 8.54 | 0.0548 | 100 | 7.86 | 0.0085 | 88 | 8.75 | 0.0033 | 92 | 10.38 | 0.0025 | 109 |
| ψ_{3} | n/4 | 7.43 | 0.1506 | 169 | 8.43 | 0.0185 | 169 | 7.78 | 0.0045 | 177 | 9.82 | 0.0030 | 203 |
| ψ_{4} | n/2 | 6.76 | 0.0491 | 64 | 8.11 | 0.0058 | 76 | 11.40 | 0.0028 | 95 | 11.93 | 0.0026 | 105 |
| ψ_{4} | n/4 | 11.10 | 0.1439 | 154 | 8.84 | 0.0137 | 152 | 9.67 | 0.0035 | 175 | 11.52 | 0.0035 | 184 |
| ψ_{5} | n/2 | 7.36 | 0.0542 | 65 | 7.82 | 0.0059 | 75 | 9.57 | 0.0031 | 87 | 11.33 | 0.0025 | 104 |
| ψ_{5} | n/4 | 10.19 | 0.1492 | 143 | 10.00 | 0.0123 | 154 | 10.48 | 0.0039 | 177 | 11.37 | 0.0034 | 191 |
| ψ_{6} | n/2 | 8.54 | 0.0355 | 94 | 8.78 | 0.0061 | 92 | 9.05 | 0.0029 | 96 | 11.57 | 0.0026 | 119 |
| ψ_{6} | n/4 | 7.77 | 0.0908 | 160 | 9.15 | 0.0122 | 180 | 8.11 | 0.0040 | 167 | 10.50 | 0.0030 | 207 |

Table 4: Comparisons of five smoothing functions for different µ when n = 5000, noisy case

Figure 3: The convergence behavior: cpu time vs. −log µ with n = 5000

Figure 4: The convergence behavior: iterations vs. −log µ with n = 5000

Figure 5: The convergence behavior: relative error vs. −log µ with n = 5000

(3) According to Tables 3-4 and Figures 3-5, the smaller the parameter µ, the better the signal recovery, and the more cpu time and iterations are spent accordingly. For any fixed µ, although the function ψ_{2} sometimes takes a bit more cpu time and iterations, it has a lower relative error. In view of this, we conclude that the function ψ_{2} works best with the proposed algorithm. From Figure 5, the function ψ_{6} also has a lower relative error, which means ψ_{6} is a possible good choice as well. To sum up, in view of relative error, we have

ψ_{2} ≥ ψ_{4} ≈ ψ_{5} ≈ ψ_{6} > ψ_{3}.
Experiment 2

Next, we compare with "NESTA", a well-known fast open-source solver proposed in [2]. In the noisy case, b = Ax + e, where e is Gaussian noise with zero mean and variance σ^{2}. In all of our simulations, A ∈ IR^{m×n} is a Gaussian matrix with n = 2^{13} and m = n/4. We select a K-sparse original signal x^{∗} ∈ IR^{n} with K = n/40, whose nonzero components follow the normal distribution N(0, 1). In our test, we set

x^{0} = A^{T}b, λ = 0.0048‖A^{T}b‖_{∞}, ρ = 0.5, δ = 0.002, γ = 0.2, σ^{2} = 0.0001.

We also accelerate our algorithm by using

µ_{k+1} = ωµ_{k}, 1 ≤ k ≤ t, ω = (µ_{t}/µ_{0})^{1/t}, µ_{t} = 10^{−8}, µ_{0} = 10^{−2}, and t = 4.

The algorithm terminates when

|f_{µ}(x^{k}) − f̄_{µ}(x^{k})| / f̄_{µ}(x^{k}) < tol, where f̄_{µ}(x^{k}) = (1/min{5, k}) ∑_{l=1}^{min{5,k}} f_{µ}(x^{k−l}).

In our experiments, the accuracy tol ∈ {10^{−4}, 10^{−5}, 10^{−6}, 10^{−7}, 10^{−8}}. We test the continuation version of NESTA for comparison. In testing NESTA, all other parameters are taken as default except for the above parameters.
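The geometric continuation schedule for µ can be computed directly; a small sketch with the stated values:

```python
# omega = (mu_t / mu_0)^(1/t) shrinks mu from mu_0 to mu_t in exactly t steps.
mu0, mu_t, t = 1e-2, 1e-8, 4
omega = (mu_t / mu0) ** (1.0 / t)
mus = [mu0 * omega**k for k in range(t + 1)]
print(omega)      # 0.0316227... = 10**(-1.5)
print(mus[-1])    # 1e-08 (up to rounding)
```

Each step divides µ by roughly 31.6, which is far more aggressive than the fixed γ̄ = 0.4 of Experiment 1 but is applied only t = 4 times.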

From the comparison experiments, we have the following observations:

(1) In view of CPU time, NESTA is really fast. However, from Tables 5-6 and Figure 8, as the accuracy tol gets smaller, the CPU time of our algorithm is not affected much, whereas the CPU time of NESTA goes up dramatically.

(2) From Tables 5-6 and Figure 7, we see that our algorithm has a small relative error even when the accuracy tol is relatively large. In particular, when tol = 10^{−4} and tol = 10^{−5}, the relative error of NESTA becomes large, which indicates that the signal has not been recovered well. By contrast, the function ψ_{2} keeps a small error, which means it has better performance.

(3) From Tables 5-6 and Figure 6, NESTA needs more iterations. In view of this and the relative error, our proposed algorithm is competitive.

Figure 6: The convergence behavior: iterations vs. accuracy

Figure 7: The convergence behavior: relative error vs. accuracy

Figure 8: The convergence behavior: CPU time vs. accuracy

Experiment 3

In order to observe the efficiency of sparse recovery, we compare our algorithm with three other algorithms, NESTA, FPCBB and FISTA, where a Bernoulli matrix, a partial Hadamard matrix and a Gaussian matrix are chosen. In our implementations, we select n = 2^{14}, m = n/2 and sparsity K = n/40. The parameters µ_{k}, δ and γ are the same as in Experiment 2. See Figures 9-11.

The result from each test is obtained by running our algorithm 30 times and taking the average. Figures 9-11 show the frequency of successful reconstruction at different relative errors. We can see that our algorithm gives relatively better behavior when the relative error is small enough, which means that our algorithm has slightly higher efficiency on sparse recovery. To sum up, we obtain

ψ_{2} ≥ ψ_{4} ≈ ψ_{5} ≈ ψ_{6} > ψ_{3}.

Figure 9: Comparisons among algorithms with Bernoulli matrix

Figure 10: Comparisons among algorithms with Gaussian matrix

Figure 11: Comparisons among algorithms with Partial Hadamard matrix

### 5 Conclusions

In this paper, the sparse signal recovery problem is investigated, with focus on the l_{1}-norm model. In light of the convolution techniques in [23, 27] and the framework of the nonlinear conjugate gradient method [32], we propose a unified smoothing approach to solve the given sparse signal reconstruction problem. As a byproduct, the classical l_{1}-norm is approximated by six smoothing functions. Numerical results show that ψ_{2} and ψ_{6} are the better choices of smoothing function to work with the algorithm. Although the theoretical part is not very notable, the contribution of this paper lies in the numerical comparisons among new smoothing functions. In particular, we suggest two nice smoothing functions to work along with the nonlinear conjugate gradient method. In addition, we compare our algorithm with three other well-known open-source algorithms, NESTA, FPCBB and FISTA, where a Bernoulli matrix, a partial Hadamard matrix and a Gaussian matrix are chosen. Our algorithm gives relatively better behavior when the relative error is small enough. In this sense, we provide a simple and effective new approach to the recovery of the original sparse signal.

Our attention was drawn to [14] by one reviewer. In [14], the authors proposed a smoothing conjugate gradient method, and then used this algorithm to solve the image restoration problem. More specifically, it used a smoothing function as below:

$$
s_{\mu}(t) =
\begin{cases}
|t| & \text{if } |t| > \dfrac{\mu}{2}, \\[6pt]
\dfrac{t^{2}}{\mu} + \dfrac{\mu}{4} & \text{if } |t| \le \dfrac{\mu}{2}.
\end{cases}
$$

This smoothing function is exactly the ψ_{2}(µ, t) function in our paper. Our contribution lies in showing that this smoothing function is also a good choice for the signal reconstruction problem, compared with the other smoothing functions, which are not investigated in [14].
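As a quick sanity check (our own illustration, not code from [14]), this function coincides with |t| outside the interval [-µ/2, µ/2] and overestimates |t| by at most µ/4, so it converges uniformly to the absolute value as µ → 0:

```python
import numpy as np

def s_mu(t, mu):
    # the Huber-type smoothing function of |t| discussed above
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) > mu / 2, np.abs(t), t**2 / mu + mu / 4)

mu = 0.1  # illustrative smoothing parameter
t = np.linspace(-1.0, 1.0, 2001)
gap = s_mu(t, mu) - np.abs(t)  # equals (|t| - mu/2)^2 / mu on [-mu/2, mu/2], else 0
print(gap.min() >= 0, gap.max() <= mu / 4 + 1e-12)  # True True
```

The two pieces also match in value and slope at |t| = µ/2 (both equal µ/2 with derivative ±1), which is why s_µ is continuously differentiable.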

Acknowledgements. The authors would like to thank the anonymous referee and the editor for their valuable comments, viewpoints, and suggestions, which helped improve the manuscript considerably.

### References

[1] A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM Journal on Imaging Sciences, vol. 2, pp. 183-202, 2009.

[2] S. Becker, J. Bobin, and E. Candes, NESTA: a fast and accurate first-order method for sparse recovery, SIAM Journal on Imaging Sciences, vol. 4, pp. 1-39, 2011.

[3] E. Berg and M.P. Friedlander, Probing the Pareto frontier for basis pursuit solutions, SIAM Journal on Scientific Computing, vol. 31, pp. 890-912, 2008.

[4] J.M. Bioucas-Dias and M. Figueiredo, A new TwIST: two-step iterative shrinkage/thresholding algorithms for image restoration, IEEE Transactions on Image Processing, vol. 16, pp. 2992-3004, 2007.

[5] R.L. Broughton, I.D. Coope, P.F. Renaud, and R.E.H. Tappenden, A box constrained gradient projection algorithm for compressed sensing, Signal Processing, vol. 91, pp. 1985-1992, 2011.

[6] E.J. Candes, The restricted isometry property and its implications for compressed sensing, Comptes Rendus Mathematique, vol. 346, no. 9-10, pp. 589-592, 2008.

[7] E.J. Candes and T. Tao, Decoding by linear programming, IEEE Transactions on Information Theory, vol. 51, no. 12, pp. 4203-4215, 2005.

[8] E.J. Candes and T. Tao, Near optimal signal recovery from random projections: universal encoding strategies?, IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5406-5425, 2006.

[9] E.J. Candes, J.K. Romberg, and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489-509, 2006.

[10] E.J. Candes, J.K. Romberg, and T. Tao, Stable signal recovery from incomplete and inaccurate measurements, Communications on Pure and Applied Mathematics, vol. 59, no. 8, pp. 1207-1223, 2006.

[11] C. Chen and O.L. Mangasarian, A class of smoothing functions for nonlinear and mixed complementarity problems, Computational Optimization and Applications, vol. 5, pp. 97-138, 1996.

[12] X. Chen, Smoothing methods for nonsmooth, nonconvex minimization, Mathematical Programming, vol. 134, pp. 71-99, 2012.

[13] X. Chen, R.S. Womersley, and J. Ye, Minimizing the condition number of a Gram matrix, SIAM Journal on Optimization, vol. 21, no. 1, pp. 127-148, 2011.

[14] X. Chen and W. Zhou, Smoothing nonlinear conjugate gradient method for image restoration using nonsmooth nonconvex minimization, SIAM Journal on Imaging Sciences, vol. 3, no. 4, pp. 765-790, 2010.

[15] C. De Mol and M. Defrise, A note on wavelet-based inversion algorithms, Contemporary Mathematics, vol. 313, pp. 85-96, 2002.

[16] D.L. Donoho, Compressed sensing, IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.

[17] M. Elad, Why simple shrinkage is still relevant for redundant representations?, IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5559-5569, 2006.

[18] M. Figueiredo, R. Nowak, and S.J. Wright, Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems, IEEE Journal of Selected Topics in Signal Processing, vol. 1, pp. 586-597, 2007.

[19] E.T. Hale, W. Yin, and Y. Zhang, Fixed-point continuation for l1-minimization: methodology and convergence, SIAM Journal on Optimization, vol. 19, pp. 1107-1130, 2008.

[20] E.T. Hale, W. Yin, and Y. Zhang, Fixed-point continuation applied to compressed sensing: implementation and numerical experiments, Journal of Computational Mathematics, vol. 28, pp. 170-194, 2010.

[21] R. Nowak and M. Figueiredo, Fast wavelet-based image deconvolution using the EM algorithm, in Proceedings of the 35th Asilomar Conference on Signals, Systems and Computers, vol. 1, pp. 371-375, 2001.

[22] L. Qi and D. Sun, Smoothing functions and smoothing Newton method for complementarity and variational inequality problems, Journal of Optimization Theory and Applications, vol. 113, pp. 121-147, 2001.

[23] B. Saheya, C.H. Yu, and J.-S. Chen, Numerical comparisons based on four smoothing functions for absolute value equation, Journal of Applied Mathematics and Computing, vol. 56, no. 1-2, pp. 131-149, 2018.

[24] J.L. Starck, E. Candes, and D. Donoho, Astronomical image representation by the curvelet transform, Astronomy and Astrophysics, vol. 398, pp. 785-800, 2003.

[25] J.L. Starck, M. Nguyen, and F. Murtagh, Wavelets and curvelets for image deconvolution: a combined approach, Signal Processing, vol. 83, pp. 2279-2283, 2003.

[26] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, vol. 58, pp. 267-288, 1996.

[27] S. Voronin, G. Ozkaya, and D. Yoshida, Convolution based smooth approximations to the absolute value function with application to non-smooth regularization, arXiv:1408.6795v2 [math.NA], July 2015.

[28] X. Wang, F. Liu, L.C. Jiao, J. Wu, and J. Chen, Incomplete variables truncated conjugate gradient method for signal reconstruction in compressed sensing, Information Sciences, vol. 288, pp. 387-411, 2014.

[29] S. Wright, R. Nowak, and M. Figueiredo, Sparse reconstruction by separable approximation, in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, October 2008.

[30] J. Yang and Y. Zhang, Alternating direction algorithms for l1-problems in com- pressive sensing, SIAM Journal on Scientific Computing, vol. 33, pp. 250-278, 2011.

[31] K. Yin, Y.H. Xiao, and M.L. Zhang, Nonlinear conjugate gradient method for l1-norm regularization problems in compressive sensing, Journal of Computational Information Systems, vol. 7, pp. 880-885, 2011.

[32] L. Zhang, W. Zhou, and D. Li, A descent modified Polak-Ribiere-Polyak conju- gate gradient method and its global convergence, IMA Journal of Numerical Analysis, vol. 26, pp. 629-640, 2006.

[33] H. Zhu, Y.H. Xiao, and S.Y. Wu, Large sparse signal recovery by conjugate gradient algorithm based on smoothing technique, Computers and Mathematics with Applications, vol. 66, pp. 24-32, 2013.

| F | CPU | Re | It | CPU | Re | It | CPU | Re | It | CPU | Re | It | CPU | Re | It |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ψ_{2} | 23.77 | 9.62E-04 | 150 | 28.30 | 1.98E-05 | 177 | 35.65 | 2.11E-05 | 213 | 46.62 | 1.91E-05 | 256 | 59.97 | 1.58E-05 | 342 |
| ψ_{3} | 21.82 | 6.37E-03 | 142 | 24.55 | 3.03E-05 | 171 | 30.88 | 2.29E-05 | 206 | 36.54 | 2.01E-05 | 241 | 40.39 | 1.76E-05 | 286 |
| ψ_{4} | 26.64 | 7.00E-03 | 146 | 27.03 | 4.32E-05 | 169 | 36.79 | 2.21E-05 | 215 | 45.12 | 2.04E-05 | 258 | 59.34 | 1.76E-05 | 338 |
| ψ_{5} | 26.32 | 2.21E-03 | 157 | 28.04 | 2.77E-05 | 174 | 34.73 | 2.12E-05 | 209 | 45.38 | 2.14E-05 | 256 | 49.89 | 1.88E-05 | 296 |
| ψ_{6} | 19.53 | 4.82E-03 | 138 | 24.83 | 3.01E-05 | 172 | 32.41 | 2.34E-05 | 214 | 38.80 | 2.06E-05 | 245 | 50.09 | 1.68E-05 | 301 |
| NESTA | 7.96 | 2.28E-02 | 555 | 8.30 | 2.95E-02 | 568 | 17.12 | 3.26E-04 | 1134 | 15.98 | 1.82E-05 | 1060 | 44.50 | 1.74E-05 | 2999 |

Table 5: Comparisons of five smoothing functions and NESTA for different tol in the noise-free case. Column groups, from left to right, correspond to tol = 1.0e-4, 1.0e-5, 1.0e-6, 1.0e-7, 1.0e-8; each group reports CPU time (CPU), relative error (Re), and iteration count (It).

| F | CPU | Re | It | CPU | Re | It | CPU | Re | It | CPU | Re | It | CPU | Re | It |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ψ_{2} | 23.25 | 2.49E-03 | 146 | 26.10 | 7.37E-04 | 165 | 36.67 | 5.91E-04 | 222 | 62.27 | 5.80E-04 | 332 | 104.44 | 5.95E-04 | 542 |
| ψ_{3} | 19.53 | 5.57E-03 | 138 | 19.02 | 4.24E-03 | 136 | 31.62 | 5.83E-04 | 215 | 53.04 | 5.99E-04 | 332 | 80.03 | 5.83E-04 | 459 |
| ψ_{4} | 25.11 | 3.52E-03 | 162 | 23.84 | 8.90E-03 | 153 | 37.37 | 5.39E-04 | 224 | 56.38 | 5.65E-04 | 314 | 104.55 | 5.68E-04 | 520 |
| ψ_{5} | 24.72 | 5.91E-03 | 161 | 26.40 | 7.22E-04 | 164 | 35.52 | 6.31E-04 | 214 | 57.01 | 5.75E-04 | 317 | 83.13 | 5.83E-04 | 429 |
| ψ_{6} | 20.45 | 1.23E-02 | 148 | 22.16 | 6.80E-04 | 156 | 33.50 | 5.63E-04 | 222 | 56.77 | 5.85E-04 | 308 | 94.43 | 5.75E-04 | 507 |
| NESTA | 7.67 | 2.13E-02 | 525 | 8.28 | 1.10E-02 | 549 | 15.81 | 7.85E-04 | 1086 | 16.45 | 7.47E-04 | 1073 | 85.89 | 6.16E-04 | 5886 |

Table 6: Comparisons of five smoothing functions and NESTA for different tol in the noisy case. Column groups, from left to right, correspond to tol = 1.0e-4, 1.0e-5, 1.0e-6, 1.0e-7, 1.0e-8; each group reports CPU time (CPU), relative error (Re), and iteration count (It).