Scale Convex Optimization
Paul Tseng
Mathematics, University of Washington Seattle
ICCOPT, McMaster University August 16, 2007
Abstract This is a talk given at ICCOPT 2007.
Talk Outline
• Sensor network localization and SDP, SOCP, ESDP relaxations
• Distributed methods for SOCP and ESDP relaxations
• Distributed method for TV-based image restoration
• Extensions
Sensor Network Localization
Basic Problem
• $n$ pts in $\Re^d$ ($d = 1, 2, 3$).
• Know last $n - m$ pts (‘anchors’) $x_{m+1}, \dots, x_n$ and Eucl. dist. estimate for pairs of ‘neighboring’ pts
\[ d_{ij} \ge 0 \quad \forall (i,j) \in \mathcal{A} \]
with $\mathcal{A} \subseteq \{(i,j) : 1 \le i < j \le n\}$.
• Estimate first m pts (‘sensors’).
History? Graph realization, position estimation in wireless sensor network, determining protein structures, ...
Optimization Problem Formulation
\[ \upsilon_{\rm opt} := \min_{x_1,\dots,x_m} \sum_{(i,j)\in\mathcal{A}} \left( \|x_i - x_j\|^2 - d_{ij}^2 \right)^2 \]
• Objective function is smooth but nonconvex. $m$ can be large ($m > 1000$).
• Problem is NP-hard (reduction from PARTITION).
• Use a convex (SDP, SOCP) relaxation. High soln accuracy unnecessary.
• Seek “simple” distributed methods (important for practical implementation).
SDP Relaxation
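Let $X := [x_1 \cdots x_m]$, $A := [x_{m+1} \cdots x_n]$. Then
\[ \upsilon_{\rm opt} = \min_{X,Y} \sum_{(i,j)\in\mathcal{A}} \left( \mathrm{tr}(b_{ij} b_{ij}^T Z) - d_{ij}^2 \right)^2 \quad \text{s.t.} \quad Z = \begin{bmatrix} Y & X^T \\ X & I_d \end{bmatrix} \succeq 0, \ \ \mathrm{rank}\, Z = d \]
with $b_{ij} := \begin{bmatrix} I_m & 0 \\ 0 & A \end{bmatrix} (e_i - e_j)$.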
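A quick numerical check of this lifting, offered as a sketch rather than as part of the talk: when $Y = X^T X$, the quantity $\mathrm{tr}(b_{ij} b_{ij}^T Z) = b_{ij}^T Z b_{ij}$ should equal $\|x_i - x_j\|^2$ for sensor-sensor and sensor-anchor pairs alike, and $Z$ should have rank $d$. The helper names `b_vector` and `lifted_Z`, the random test data, and the 0-based indexing are assumptions made for illustration.

```python
import numpy as np

def b_vector(i, j, m, anchors):
    """b_ij = [[I_m, 0], [0, A]] (e_i - e_j), 0-based indices, i a sensor, i < j.
    `anchors` is the d x (n - m) matrix A whose columns are the anchor positions."""
    d, n_anchor = anchors.shape
    e = np.zeros(m + n_anchor)
    e[i], e[j] = 1.0, -1.0
    M = np.block([[np.eye(m), np.zeros((m, n_anchor))],
                  [np.zeros((d, m)), anchors]])
    return M @ e

def lifted_Z(X):
    """Z = [[Y, X^T], [X, I_d]] with Y = X^T X (the rank-d case)."""
    d = X.shape[0]
    return np.block([[X.T @ X, X.T], [X, np.eye(d)]])

rng = np.random.default_rng(0)
d, m, n = 2, 4, 6
X = rng.standard_normal((d, m))        # sensors as columns
A = rng.standard_normal((d, n - m))    # anchors as columns
pts = np.hstack([X, A])
Z = lifted_Z(X)

# Largest gap between b^T Z b and the true squared distance over all pairs
# that involve at least one sensor.
max_err = 0.0
for i in range(m):
    for j in range(i + 1, n):
        b = b_vector(i, j, m, A)
        true_sq = np.sum((pts[:, i] - pts[:, j]) ** 2)
        max_err = max(max_err, abs(b @ Z @ b - true_sq))

rank_Z = np.linalg.matrix_rank(Z)
```

Dropping the rank constraint keeps the same linear objective in $Z$ but lets $Y$ exceed $X^T X$ in the semidefinite order, which is what makes the problem convex.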
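SDP relaxation (Biswas, Ye ’03):
\[ \upsilon_{\rm sdp} := \min_{X,Y} \sum_{(i,j)\in\mathcal{A}} \left( \mathrm{tr}(b_{ij} b_{ij}^T Z) - d_{ij}^2 \right)^2 \quad \text{s.t.} \quad Z = \begin{bmatrix} Y & X^T \\ X & I_d \end{bmatrix} \succeq 0 \]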
However, SDP relaxation is expensive to solve for $m$ large.
SOCP Relaxation
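\[ \upsilon_{\rm opt} = \min_{x_1,\dots,x_m,\, y_{ij}} \sum_{(i,j)\in\mathcal{A}} \left( y_{ij} - d_{ij}^2 \right)^2 \quad \text{s.t.} \quad y_{ij} = \|x_i - x_j\|^2 \ \ \forall (i,j)\in\mathcal{A} \]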
Relax “=” to “≥” constraint:
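\[ \upsilon_{\rm socp} := \min_{x_1,\dots,x_m,\, y_{ij}} \sum_{(i,j)\in\mathcal{A}} \left( y_{ij} - d_{ij}^2 \right)^2 \quad \text{s.t.} \quad y_{ij} \ge \|x_i - x_j\|^2 \ \ \forall (i,j)\in\mathcal{A} \]
\[ \phantom{\upsilon_{\rm socp} :} = \min_{x_1,\dots,x_m} f(x_1,\dots,x_m) := \sum_{(i,j)\in\mathcal{A}} \max\{0, \|x_i - x_j\|^2 - d_{ij}^2\}^2 \]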
This is an unconstrained problem, with f smooth, convex, partially separable.
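The step from the constrained form to the unconstrained $f$ can be spelled out by minimizing over each $y_{ij}$ separately for fixed $x$. Writing $t := \|x_i - x_j\|^2$ (a shorthand introduced here, not in the slides):

\[ \min_{y_{ij} \ge t} \left( y_{ij} - d_{ij}^2 \right)^2 = \begin{cases} 0 & \text{if } t \le d_{ij}^2 \quad (\text{take } y_{ij} = d_{ij}^2), \\ (t - d_{ij}^2)^2 & \text{if } t > d_{ij}^2 \quad (\text{take } y_{ij} = t), \end{cases} \quad = \max\{0,\, t - d_{ij}^2\}^2 . \]

Smoothness follows because $s \mapsto \max\{0, s\}^2$ has the continuous derivative $2\max\{0, s\}$, and convexity because it is convex and nondecreasing, composed with the convex function $\|x_i - x_j\|^2 - d_{ij}^2$.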
Solve using a coordinate gradient descent (CGD) method (T, Yun ’06):
• If $\|\nabla_{x_i} f\| \ge$ tol, then update $x_i$ by moving it along $-H_i^{-1}\nabla_{x_i} f$, with $H_i \succ 0$ and stepsize by Armijo rule to decrease $f$, and re-iterate.
Computation is cheap and distributes. Only $\{x_j\}_{(i,j)\in\mathcal{A}}$ are needed to update $x_i$. Provable global convergence. Fast convergence in practice.
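The update rule above can be sketched in a few lines. This is a minimal illustration, not the implementation from (T, Yun ’06): it assumes $H_i = h I$ with a fixed scalar $h$, a cyclic sweep over sensors, and standard Armijo backtracking with parameters `sigma` and `beta`; the function names and the toy instance are hypothetical. Points are stored as rows here, and `nbrs[i]` lists the pairs `(j, d_ij)` for the neighbors of point `i`.

```python
import numpy as np

def local_f(i, xi, pts, nbrs):
    """Sum of the terms of f that involve point i, evaluated with x_i = xi."""
    return sum(max(0.0, np.sum((xi - pts[j]) ** 2) - dij ** 2) ** 2
               for j, dij in nbrs[i])

def local_grad(i, pts, nbrs):
    """Gradient of f with respect to x_i; uses only the neighbors of i."""
    g = np.zeros_like(pts[i])
    for j, dij in nbrs[i]:
        diff = pts[i] - pts[j]
        g += 4.0 * max(0.0, diff @ diff - dij ** 2) * diff
    return g

def total_f(pts, edges, nbr_dist):
    return sum(max(0.0, np.sum((pts[i] - pts[j]) ** 2) - nbr_dist[(i, j)] ** 2) ** 2
               for i, j in edges)

def cgd_socp(pts, m, nbrs, tol=1e-6, h=1.0, sigma=0.1, beta=0.5, max_sweeps=500):
    """Cyclic coordinate gradient descent on f; rows 0..m-1 of pts are sensors."""
    pts = pts.copy()
    for _ in range(max_sweeps):
        updated = False
        for i in range(m):
            g = local_grad(i, pts, nbrs)
            if np.linalg.norm(g) < tol:
                continue
            direction = -g / h                      # assumes H_i = h * I
            f0 = local_f(i, pts[i], pts, nbrs)
            alpha = 1.0
            # Armijo rule: shrink alpha until sufficient decrease holds.
            while alpha > 1e-12 and (local_f(i, pts[i] + alpha * direction, pts, nbrs)
                                     > f0 + sigma * alpha * (g @ direction)):
                alpha *= beta
            if alpha > 1e-12:
                pts[i] = pts[i] + alpha * direction
                updated = True
        if not updated:
            break
    return pts

# Hypothetical toy instance: 2 sensors (rows 0, 1) and 3 anchors (rows 2..4).
true_pts = np.array([[0.3, 0.4], [0.6, 0.5], [0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
m = 2
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)]
nbr_dist = {(i, j): np.linalg.norm(true_pts[i] - true_pts[j]) for i, j in edges}
nbrs = {k: [] for k in range(len(true_pts))}
for (i, j), dij in nbr_dist.items():
    nbrs[i].append((j, dij))
    nbrs[j].append((i, dij))

start = true_pts.copy()
start[:m] = [[2.0, 2.0], [-1.0, 1.5]]               # poor initial guess for sensors
est = cgd_socp(start, m, nbrs)
f_start, f_est = total_f(start, edges, nbr_dist), total_f(est, edges, nbr_dist)
```

Because changing $x_i$ only affects the terms of $f$ on edges incident to $i$, the Armijo test can be run on `local_f` alone, which is what makes each update computable from neighbor information only.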
However, SOCP can be significantly weaker than SDP relaxation.
ESDP Relaxation
Idea: Further relax the constraint $Z \succeq 0$ in SDP relaxation.
ESDP relaxation (Wang, Zheng, Boyd, Ye ’06):
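\[ \upsilon_{\rm esdp} := \min_{X,Y} \sum_{(i,j)\in\mathcal{A}} \left( \mathrm{tr}(b_{ij} b_{ij}^T Z) - d_{ij}^2 \right)^2 \]
\[ \text{s.t.} \quad Z = \begin{bmatrix} Y & X^T \\ X & I_d \end{bmatrix}, \]
\[ \begin{bmatrix} Y_{ii} & Y_{ij} & x_i^T \\ Y_{ij} & Y_{jj} & x_j^T \\ x_i & x_j & I_d \end{bmatrix} \succeq 0 \quad \forall (i,j)\in\mathcal{A} \text{ with } j \le m, \]
\[ \begin{bmatrix} Y_{ii} & x_i^T \\ x_i & I_d \end{bmatrix} \succeq 0 \quad \forall (i,j)\in\mathcal{A} \text{ with } j > m. \]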
ESDP is stronger than SOCP, weaker than SDP relaxation. In simulation, ESDP is nearly as strong as SDP relaxation, and solvable much faster by SeDuMi. Distributed method?
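One way to see why ESDP is no stronger than the SDP relaxation: each ESDP constraint matrix is a principal submatrix of $Z$ (rows and columns $i$, optionally $j$, and the last $d$), and every principal submatrix of a positive semidefinite matrix is positive semidefinite. The sketch below checks this numerically on random data; `esdp_block` and the test matrices are illustrative assumptions, with 0-based indices.

```python
import numpy as np

def esdp_block(Z, m, i, j=None):
    """Principal submatrix of Z on rows/cols {i, (j), m, ..., m+d-1} (0-based)."""
    d = Z.shape[0] - m
    idx = [i] + ([] if j is None else [j]) + list(range(m, m + d))
    return Z[np.ix_(idx, idx)]

rng = np.random.default_rng(1)
d, m = 2, 5
X = rng.standard_normal((d, m))
P = rng.standard_normal((m, m))
Y = X.T @ X + P @ P.T            # Y - X^T X is PSD, so Z is PSD (Schur complement)
Z = np.block([[Y, X.T], [X, np.eye(d)]])

min_eig_Z = np.linalg.eigvalsh(Z).min()
# Smallest eigenvalue over all sensor-sensor blocks and all sensor-anchor blocks.
min_eig_blocks = min(
    np.linalg.eigvalsh(esdp_block(Z, m, i, j)).min()
    for i in range(m)
    for j in list(range(i + 1, m)) + [None]
)
```

The converse fails in general: all of these small blocks can be positive semidefinite while $Z$ itself is not, which is why ESDP is a relaxation of SDP.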
Distributed Method for Partially Separable SDP
ESDP has the partially separable form
\[ \min_z \ h(z) := \sum_{k=1}^{K} h_k(z) \quad \text{s.t.} \quad A_k z + B_k \succeq 0, \ \ k = 1, \dots, K \]
with $A_k$ very sparse, $B_k$ low-dim., and $h_k$ convex, $C^2$, with $\nabla^2 h_k$ of the same sparsity pattern as $A_k$.
KKT Optimality conditions:
\[ \nabla h(z) - \sum_k A_k^* \Lambda_k = 0, \]
\[ 0 \preceq \Lambda_k \perp A_k z + B_k \succeq 0, \quad k = 1, \dots, K \]
Unconstrained reformulation:
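\[ \min_{z,\Lambda} \ f(z, \Lambda) := \sum_k \psi_{\rm FB}(A_k z + B_k, \Lambda_k) + \Big\| \nabla h(z) - \sum_k A_k^* \Lambda_k \Big\|^2 \]
with
\[ \psi_{\rm FB}(X, Y) = \big\| (X^2 + Y^2)^{1/2} - X - Y \big\|_F^2 . \]
Facts: (T ’98, Sim, Sun, Ralph ’06)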
• f is smooth, partially separable, nonneg.
• If KKT soln exists, then (z, Λ) is KKT soln ⇐⇒ ∇f (z, Λ) = 0.
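For concreteness, the matrix Fischer-Burmeister term can be evaluated with one symmetric eigendecomposition. The sketch below is illustrative only: `psd_sqrt` and `psi_fb` are names chosen here, and the test matrices are made up. It shows $\psi_{\rm FB}$ vanishing on a complementary pair ($X \succeq 0$, $Y \succeq 0$, $XY = 0$) and being positive when $X$ is not positive semidefinite.

```python
import numpy as np

def psd_sqrt(S):
    """Square root of a symmetric PSD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def psi_fb(X, Y):
    """Squared Frobenius norm of (X^2 + Y^2)^{1/2} - X - Y for symmetric X, Y."""
    R = psd_sqrt(X @ X + Y @ Y) - X - Y
    return np.sum(R ** 2)

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # random orthogonal basis
X = Q @ np.diag([2.0, 0.0, 0.0]) @ Q.T              # PSD
Y = Q @ np.diag([0.0, 3.0, 1.0]) @ Q.T              # PSD, with X @ Y = 0

psi_complementary = psi_fb(X, Y)     # complementarity holds
psi_violated = psi_fb(-X, Y)         # -X is not PSD
```

In the second case the residual is $Q\,\mathrm{diag}(4, 0, 0)\,Q^T$, so the value is $16$ up to rounding.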
Solvable by many methods, but most update all variables at once.
CGD-based distributed method:
• Choose a “small” subset of variables $w$ of $(z, \Lambda)$. If $\|\nabla_w f\| \ge$ tol, then move $w$ along $-H^{-1}\nabla_w f$, with $H \succ 0$ and stepsize by Armijo rule to decrease $f$, and re-iterate.
TV-Based Image Restoration
Total variation-based problem for restoring a noisy image $b$ on $\Omega \subset \Re^2$: (Rudin, Osher, Fatemi ’92)
\[ \min_u \ \int_\Omega \|\nabla u\| \, dx + \lambda \int_\Omega |b - u|^2 \, dx \]
Dual has form:
\[ \min_w \ f(w) := \int_\Omega |\nabla \cdot w - \lambda b|^2 \, dx \quad \text{s.t.} \quad \|w\| \le 1 \ \text{a.e. on } \Omega. \]
When discretized on a grid, reduces to minimizing a convex, partially separable quad. func. of $w_{ij} \in \Re^2$ subject to $\|w_{ij}\| \le 1$.
CGD-based distributed method:
• If $\|d_{ij}\| \ge$ tol, where
\[ d_{ij} := \arg\min_{\|w_{ij} + d\| \le 1} \ (\nabla_{w_{ij}} f)^T d + \tfrac{1}{2} d^T H_{ij} d \]
with $H_{ij} \succ 0$, then move $w_{ij}$ along $d_{ij}$ with stepsize by Armijo rule to decrease $f$, and re-iterate.
If $H_{ij}$ is a multiple of $I_2$, then $d_{ij}$ has closed form solution.
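The closed form can be derived by completing the square. With $H_{ij} = h I_2$, $h > 0$, and $g := \nabla_{w_{ij}} f$, substituting $u = w_{ij} + d$ turns the subproblem into $\min_{\|u\| \le 1} \tfrac{h}{2} \| u - (w_{ij} - g/h) \|^2$ plus a constant, so $u$ is the projection of $w_{ij} - g/h$ onto the unit ball and $d_{ij} = u - w_{ij}$. A minimal sketch, in which the function name and the sample inputs are assumptions for illustration:

```python
import numpy as np

def cgd_direction(w, g, h):
    """Direction d solving min g^T d + (h/2)||d||^2 s.t. ||w + d|| <= 1, h > 0."""
    u = w - g / h                       # unconstrained minimizer, shifted by w
    norm_u = np.linalg.norm(u)
    if norm_u > 1.0:
        u = u / norm_u                  # project onto the unit ball
    return u - w

# Case 1: the gradient step leaves the unit ball, so the projection is active.
d_active = cgd_direction(np.array([0.6, 0.0]), np.array([-2.0, 0.0]), 1.0)
# Case 2: the gradient step stays inside the ball, so d = -g / h.
d_interior = cgd_direction(np.array([0.0, 0.0]), np.array([0.3, -0.4]), 1.0)
```

In the first case $w - g/h = (2.6, 0)$ projects to $(1, 0)$, giving $d = (0.4, 0)$; in the second the step $(-0.3, 0.4)$ has norm $0.5$ and is taken unchanged.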
Extensions
• Partially asynchronous computation, with constant stepsize?
• Simulation and numerical testing?
• Modifications to find a relative interior soln of ESDP?