Approximation Algorithms (Part I)

Academic year: 2022

(1)


P : an optimization problem
I : an instance of P
A : an algorithm for solving P
V(I) : the value for I that is obtained by A
V*(I) : the optimal value for I

A is an approximation algorithm for P, if for each I, A can generate a feasible solution with V(I) close to V*(I).

(2)

Classification

Suppose that A is an approximation algorithm for an optimization problem P.

♣ A is an absolute approximation algorithm for P if and only if for each I,

|V*(I) − V(I)| ≤ c, where c is a constant.

♣ A is an h(n)-approximate algorithm for P if and only if for each I,

|V*(I) − V(I)| / V*(I) ≤ h(n), where n is the size of I.

(3)

♣ A is a c-approximate algorithm for P if and only if for each I,

|V*(I) − V(I)| / V*(I) ≤ c, where c is a constant.

Notice that |V*(I) − V(I)| / V*(I) ≤ 1, if P is a maximization problem. Hence, c < 1 is required for maximization problems.

♣ A is an approximation scheme for P if and only if for each given ε > 0 and for each I, A can generate a feasible solution with

|V*(I) − V(I)| / V*(I) ≤ ε.

(4)

♣ An approximation scheme is referred to as a polynomial time approximation scheme if and only if its time complexity is polynomial in the size of I.

♣ A polynomial time approximation scheme is further referred to as a fully polynomial time approximation scheme if and only if its time complexity is also polynomial in 1/ε.

For some NP-hard problems, designing efficient and “accurate” approximation algorithms is as hard as designing efficient exact algorithms.

(5)

Ex. 0/1 Knapsack

Instance: A finite set U = {u1, u2, …, un}, a “size” s(ui) ∈ Z+ and a “value” v(ui) ∈ Z+ for each ui ∈ U, and a size constraint b ∈ Z+.

Question: What is the subset U’ ⊆ U such that Σ_{ui∈U’} s(ui) ≤ b and Σ_{ui∈U’} v(ui) is maximized?

There is an approximation algorithm as follows: examine the elements ui in nonincreasing order of v(ui)/s(ui), and add ui to U’ if feasible.
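The density-greedy heuristic above can be sketched in Python (the function name and the (size, value) item encoding are illustrative, not from the source):

```python
# A sketch of the density-greedy heuristic for 0/1 Knapsack described above.
# Items are examined in nonincreasing order of value/size and added if they fit.

def greedy_knapsack(items, b):
    """items: list of (size, value); b: capacity. Returns (chosen indices, total value)."""
    order = sorted(range(len(items)), key=lambda i: items[i][1] / items[i][0], reverse=True)
    chosen, used, total = [], 0, 0
    for i in order:
        s, v = items[i]
        if used + s <= b:          # add u_i to U' if feasible
            chosen.append(i)
            used += s
            total += v
    return chosen, total

# The bad instance on the next slide, with r=10: v=(2, r), s=(1, r), b=r.
print(greedy_knapsack([(1, 2), (10, 10)], 10))   # ([0], 2), while V*(I)=10
```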


(6)

Consider the following instance: U={u1, u2},

v(u1)=2, v(u2)=r, s(u1)=1, s(u2)=r, and b=r,

where r>2.

⇒ U’={u1}, V(I)=2, and V*(I)=r.

This algorithm is not an absolute approximation algorithm, because |V*(I) − V(I)| = r − 2 is not bounded by a constant.

This algorithm is a 1-approximate algorithm, because |V*(I) − V(I)| / V*(I) = (r − 2)/r ≤ 1.

This algorithm is not a c-approximate algorithm for any c<1.

(7)

Absolute Approximation

Absolute approximation algorithms are the most desirable approximation algorithms.

However, there are very few NP-hard problems whose polynomial-time absolute approximation algorithms are available.

In particular, designing polynomial-time absolute approximation algorithms for some NP-hard problems has itself been shown to be NP-hard.

Ex. Consider the problem of finding the minimum number d of colors needed to color a planar graph G=(V, E).

(8)

Determining whether d=3 is NP-hard, if G is not bipartite and neither V nor E is empty.

Since every planar graph is 4-colorable, an easy absolute approximation algorithm with |V*(I) − V(I)| ≤ 1 is as follows.

Step 1. Return d=0 if V is empty.

Step 2. Return d=1 if E is empty.

Step 3. Return d=2 if G is bipartite.

Step 4. Return d=4.

The time complexity of the algorithm is dominated by Step 3, which takes O(|V|+|E|) time.
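Steps 1–4 can be sketched in Python, assuming the planar graph is given as an adjacency list (a dict of neighbour sets; all names are illustrative). The bipartite test of Step 3 is done by 2-coloring with BFS in O(|V|+|E|) time:

```python
# A sketch of the absolute approximation algorithm above for coloring a
# planar graph. Returns the number of colors used: 0, 1, 2, or 4.
from collections import deque

def approx_colors(adj):
    """adj: dict vertex -> set of neighbours (assumed planar)."""
    if not adj:
        return 0                                  # Step 1: V is empty
    if all(len(nbrs) == 0 for nbrs in adj.values()):
        return 1                                  # Step 2: E is empty
    color = {}
    for s in adj:                                 # Step 3: bipartite test by BFS
        if s in color:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    q.append(v)
                elif color[v] == color[u]:
                    return 4                      # Step 4: odd cycle, not bipartite
    return 2                                      # Step 3: G is bipartite

print(approx_colors({1: {2}, 2: {1, 3}, 3: {2}}))        # path: 2
print(approx_colors({1: {2, 3}, 2: {1, 3}, 3: {1, 2}}))  # triangle: 4
```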

(9)

Ex. Consider the following NP-hard problem, which is a restricted subproblem of the well known bin packing problem.

“Given two disks, each of capacity c, and n programs with storage requirements l1, l2, …, ln, respectively, determine the maximum number of programs that can be stored in the two disks (no program stored in two disks).”

There is an O(n log n) time absolute approximation algorithm with |V*(I) − V(I)| ≤ 1 as follows.

Step 1. Arrange the programs in nondecreasing order of their storage requirements.

Step 2. Assign the programs in this order to the first disk until one does not fit, then continue assigning them to the second disk.

(10)

For example, if n=4, (l1, l2, l3, l4)=(2, 4, 5, 6), and c=10, then the first two programs are stored in the first disk and the third program is stored in the second disk.

For the example, an optimal solution is to store the first and third (or fourth) programs in one disk and the other two in the other disk.
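The heuristic can be sketched in Python (the step breakdown is inferred from the worked example above; names are illustrative):

```python
# A sketch of the two-disk heuristic: sort requirements nondecreasingly,
# fill the first disk greedily, then the second.

def two_disk_store(reqs, c):
    """reqs: storage requirements; c: capacity of each disk. Returns the two disks."""
    disks, free = [[], []], [c, c]
    d = 0
    for l in sorted(reqs):
        while d < 2 and l > free[d]:   # current disk cannot take this program
            d += 1
        if d == 2:
            break                      # both disks exhausted
        disks[d].append(l)
        free[d] -= l
    return disks

# n=4, (l1..l4)=(2, 4, 5, 6), c=10: programs 2 and 4 on disk 1, program 5 on disk 2.
print(two_disk_store([2, 4, 5, 6], 10))   # [[2, 4], [5]]
```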

♣ Proof of |V*(I) − V(I)| ≤ 1

Let p be the maximum number of programs that can be stored in a single disk of capacity 2c, which happens when programs are stored in nondecreasing order of the li’s.

⇒ V*(I) ≤ p.

Assume that the p programs stored in the disk have storage requirements l1 ≤ l2 ≤ … ≤ lp, and let p’ be the greatest index with Σ_{i=1}^{p’} li ≤ c.

(11)

Since Σ_{i=p’+1}^{p−1} li ≤ Σ_{i=p’+2}^{p} li ≤ c (note that Σ_{i=1}^{p’+1} li > c and Σ_{i=1}^{p} li ≤ 2c imply Σ_{i=p’+2}^{p} li < c), we have V(I) ≥ p − 1

(i.e., store the first p’ programs in the first disk and the (p’+1)th to (p−1)th programs in the second disk).

Therefore, |V*(I) − V(I)| ≤ 1.

When k ≥ 2 disks are used, |V*(I) − V(I)| ≤ k − 1 can be proved similarly.

(12)

NP-Hardness of Absolute Approximation

Ex. 0/1 Knapsack

Instance: A finite set U = {u1, u2, …, un}, a “size” s(ui) ∈ Z+ and a “value” v(ui) ∈ Z+ for each ui ∈ U, and a size constraint b ∈ Z+.

Question: What is the subset U’ ⊆ U such that Σ_{ui∈U’} s(ui) ≤ b and Σ_{ui∈U’} v(ui) is maximized?

Π : the problem of designing a polynomial-time absolute approximation algorithm for 0/1 Knapsack.

We show below that Π is NP-hard.

(13)

It suffices to show that if there exists a polynomial-time absolute approximation algorithm for 0/1 Knapsack, then 0/1 Knapsack can be solved in polynomial time (i.e., 0/1 Knapsack ∝ Π).

Suppose that A is a polynomial-time absolute approximation algorithm for 0/1 Knapsack with |V*(I) − V(I)| ≤ k, where k is a constant.

Let I be any instance of 0/1 Knapsack, and Ĩ be the instance of 0/1 Knapsack that is obtained by multiplying each “value” (i.e., v(ui)) of I by k+1.

⇒ (1) |V*(Ĩ) − V(Ĩ)| is a multiple of k+1.

(2) I and Ĩ have the same optimal solution.

(14)

When applying A to Ĩ, we have |V*(Ĩ) − V(Ĩ)| ≤ k, which together with (1) assures V(Ĩ) = V*(Ĩ).

⇒ A can generate an optimal solution for Ĩ (and hence an optimal solution for I).

For example, consider I as follows: U={u1, u2, u3}, (v(u1), v(u2), v(u3))=(1, 2, 3), (s(u1), s(u2), s(u3))=(50, 60, 30), and b=100, for which U’={u2, u3} is the optimal solution and V*(I)=5.

With k=4, Ĩ changes (v(u1), v(u2), v(u3)) to (1×5, 2×5, 3×5)=(5, 10, 15); the optimal U’ remains the same, but V*(Ĩ)=5×5=25.

If A can guarantee |V*(Ĩ) − V(Ĩ)| ≤ 4, then A must compute V(Ĩ)=25 and output U’={u2, u3} for Ĩ, which is also optimal for I.
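The scaling argument can be demonstrated on this very instance, using exhaustive search as a stand-in for the hypothetical algorithm A (the helper name is illustrative):

```python
# Multiplying every value by k+1 leaves the optimal subset unchanged and
# multiplies V*(I) by k+1, as the reduction above requires.
from itertools import combinations

def knapsack_opt(values, sizes, b):
    """Brute-force 0/1 Knapsack: returns (best subset of indices, V*(I))."""
    n = len(values)
    best, best_v = (), 0
    for r in range(n + 1):
        for sub in combinations(range(n), r):
            if sum(sizes[i] for i in sub) <= b and sum(values[i] for i in sub) > best_v:
                best, best_v = sub, sum(values[i] for i in sub)
    return best, best_v

values, sizes, b, k = [1, 2, 3], [50, 60, 30], 100, 4
print(knapsack_opt(values, sizes, b))                        # ((1, 2), 5)
print(knapsack_opt([v * (k + 1) for v in values], sizes, b)) # ((1, 2), 25)
```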

(15)

Ex. Clique

Instance: An undirected graph G=(V, E).

Question: What is the size of a maximum clique of G?

Π : the problem of designing a polynomial-time absolute approximation algorithm for Clique.

In order to show that Π is NP-hard, it suffices to show that if there exists a polynomial-time absolute approximation algorithm A for Clique, then Clique can be solved in polynomial time.

Assume that A guarantees |V*(I) − V(I)| ≤ k for each instance I of Clique, where k is a constant.

(16)

Let I denote any instance G=(V, E) of Clique, and I’ denote the instance G’=(V’, E’) of Clique, where G’ contains k+1 copies of G and every two vertices in distinct copies are connected by an edge.

For example, when k=1, a graph G with vertices 1, 2, 3, 4 induces a graph G’ with vertices 1, 2, 3, 4, 1', 2', 3', 4'. [Figures of G and G’ lost in extraction.]

⇒ The maximum clique in G has size 3 and the maximum clique in G’ has size 2×3=6.

(17)

In general, G has a clique of size q if and only if G’ has a clique of size (k+1)×q.

When applying A to G’, we have |V*(I’) − V(I’)| ≤ k.

Since |V*(I’) − V(I’)| is a multiple of k+1, V(I’)=V*(I’) is implied, i.e., A can generate an optimal solution for I’ (and hence for I).
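The amplification can be sketched in Python: build G’ from k+1 copies of G plus all cross-copy edges, and check by brute force that the maximum clique size is multiplied by k+1 (illustration only; all names are made up):

```python
# G' consists of k+1 copies of G with every pair of vertices in distinct
# copies joined by an edge; max cliques are found by brute force.
from itertools import combinations

def max_clique(n, edges):
    """Vertices 0..n-1; edges: set of pairs. Returns max clique size."""
    E = {frozenset(e) for e in edges}
    for r in range(n, 0, -1):
        for sub in combinations(range(n), r):
            if all(frozenset(p) in E for p in combinations(sub, 2)):
                return r
    return 0

def blow_up(n, edges, k):
    """Returns (n', edges') of G' built from k+1 copies of G."""
    e2 = {(u + c * n, v + c * n) for u, v in edges for c in range(k + 1)}
    e2 |= {(u + c1 * n, v + c2 * n)
           for c1 in range(k + 1) for c2 in range(k + 1) if c1 != c2
           for u in range(n) for v in range(n)}
    return (k + 1) * n, e2

edges = {(0, 1), (1, 2), (0, 2), (2, 3)}     # a triangle plus a pendant edge
print(max_clique(4, edges))                  # 3
print(max_clique(*blow_up(4, edges, 1)))     # 6 = (k+1) x 3
```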

(18)

h(n)-Approximation

Ex. Given m identical processors, denoted by Pi (1≤i≤m), and n jobs, denoted by Jk (1≤k≤n), a schedule assigns each job a time interval and a processor for processing.

Each job is not allowed to be processed by more than one processor at the same time.

A schedule is nonpreemptive if each job is required to be processed continuously from start to end by the same processor, and preemptive otherwise.

tk : the amount of processing time required for Jk;

Fi : the time when Pi completes the processing of all the jobs assigned to it.

It is an NP-hard problem to find a nonpreemptive schedule that minimizes max{Fi | 1≤i≤m}.

(19)

An LPT (longest processing time) schedule is to assign a free job with the longest processing time to a processor whenever it becomes available.

For example, the following is an LPT schedule for m=3, n=6, and (t1, t2, t3, t4, t5, t6)=(8, 7, 6, 5, 4, 3).

[Gantt chart: P1 processes J1 (0–8) and J6 (8–11); P2 processes J2 (0–7) and J5 (7–11); P3 processes J3 (0–6) and J4 (6–11).]

⇒ max{F1, F2, F3}=11, which is minimized.
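The LPT rule can be sketched with a min-heap of processor finish times (names are illustrative): jobs are taken in nonincreasing order of processing time and each goes to the processor that becomes available earliest.

```python
# A sketch of LPT scheduling; returns the makespan max{F_i | 1<=i<=m}.
import heapq

def lpt_makespan(times, m):
    finish = [0] * m                       # one finish time per processor
    heapq.heapify(finish)
    for t in sorted(times, reverse=True):  # longest processing time first
        earliest = heapq.heappop(finish)   # processor that frees up first
        heapq.heappush(finish, earliest + t)
    return max(finish)

print(lpt_makespan([8, 7, 6, 5, 4, 3], 3))      # 11 (optimal for this instance)
print(lpt_makespan([5, 5, 4, 4, 3, 3, 3], 3))   # 11, while V*(I)=9
```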

(20)

An LPT schedule for m=3, n=7, and (t1, t2, t3, t4, t5, t6, t7)=(5, 5, 4, 4, 3, 3, 3) is shown below.

[Gantt chart: P1 processes J1 (0–5) and J5 (5–8); P2 processes J2 (0–5) and J6 (5–8); P3 processes J3 (0–4) and J4 (4–8); J7 is processed from 8 to 11.]

This schedule has max{F1, F2, F3}=11, which is not optimal. An optimal schedule is shown as follows.

[Gantt chart: P1 processes J1 (0–5) and J3 (5–9); P2 processes J2 (0–5) and J4 (5–9); P3 processes J5 (0–3), J6 (3–6), and J7 (6–9).]

For this instance, we have |V*(I) − V(I)| / V*(I) = (11 − 9)/9 = 2/9.

(21)

It takes O(n log n) time to generate an LPT schedule, which can guarantee

|V*(I) − V(I)| / V*(I) ≤ 1/3 − 1/(3m).

The upper bound is tight, because it equals 2/9 when m=3 (refer to the instance above).

♣ Proof of |V*(I) − V(I)| / V*(I) ≤ 1/3 − 1/(3m)

When m=1, the inequality holds (V*(I)=V(I)). So, assume m ≥ 2 below.

Suppose that the inequality is violated for some m and (t1, t2, …, tn), where t1 ≥ t2 ≥ … ≥ tn. Besides, it is assumed that n is minimum among all violating instances.

(22)

fk : the time when the processing of Jk is finished.

⇒ max{fi | 1≤i≤n} = max{Fi | 1≤i≤m}.

♦ We first show fn = max{fi | 1≤i≤n} below.

Suppose fn’ = max{fi | 1≤i≤n}, where n’ < n.

⇒ V(I’) = fn’ = max{Fi | 1≤i≤m} = V(I), where I’ denotes the instance (t1, t2, …, tn’) and I denotes the instance (t1, t2, …, tn).

Since V*(I’) ≤ V*(I), we have |V*(I’) − V(I’)| / V*(I’) ≥ |V*(I) − V(I)| / V*(I) (> 1/3 − 1/(3m)), which contradicts the minimality of n.

♦ Next we show V*(I) < 3×tn below, meaning that at most two jobs are assigned to each processor in an optimal schedule for I.

(23)

Since fn = V(I), the processing of Jn starts at time V(I) − tn, i.e., all processors are busy between time 0 and time V(I) − tn.

⇒ V(I) − tn ≤ (1/m) × Σ_{i=1}^{n−1} ti

⇒ V(I) ≤ (1/m) × Σ_{i=1}^{n} ti + ((m−1)/m) × tn

⇒ V(I) ≤ V*(I) + ((m−1)/m) × tn   (since (1/m) × Σ_{i=1}^{n} ti ≤ V*(I))

⇒ |V*(I) − V(I)| / V*(I) = V(I)/V*(I) − 1 ≤ ((m−1)/m) × tn / V*(I)

Now that |V*(I) − V(I)| / V*(I) > 1/3 − 1/(3m) = (m−1)/(3m), we have

(m−1)/(3m) < ((m−1)/m) × tn / V*(I),

which implies V*(I) < 3×tn.

(24)

It was proved that an LPT schedule is optimal, if any optimal schedule has at most two jobs assigned to each processor (refer to page 568 of Ref. (2)).

⇒ |V*(I) − V(I)| / V*(I) = 0,

contradicting the assumption about I!

(25)

c-Approximation

Ex. Bin Packing

Instance: A capacity value b>0 and a finite set U of items whose sizes do not exceed b.

Question: What is the minimum number of bins of equal capacity b to put away all items of U (each item is placed entirely in one bin)?

For example, given b=10 and six items of sizes 5, 6, 3, 7, 5 and 4, an optimal packing uses three bins: {5, 5}, {6, 4}, and {3, 7}. [Figure lost in extraction.]

(26)

Four heuristics for Bin Packing:

• First Fit (FF): Arrange the bins in sequence and place each item into the first bin in which it fits.

• Best Fit (BF): Arrange the bins in sequence and place each item into the most nearly full bin in which it fits.

• First Fit Decreasing (FFD): the same as FF, except that the items are placed in nonincreasing order of size.

• Best Fit Decreasing (BFD): the same as BF, except that the items are placed in nonincreasing order of size.
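The four heuristics can be sketched in Python (function names are illustrative; bins are lists of item sizes):

```python
# first_fit places each item in the first bin with room; best_fit in the
# most nearly full bin with room; the decreasing variants sort items first.

def first_fit(sizes, b):
    bins = []
    for s in sizes:
        for bin_ in bins:
            if sum(bin_) + s <= b:    # first bin in which the item fits
                bin_.append(s)
                break
        else:
            bins.append([s])          # no bin fits: open a new one
    return bins

def best_fit(sizes, b):
    bins = []
    for s in sizes:
        fitting = [bin_ for bin_ in bins if sum(bin_) + s <= b]
        if fitting:
            max(fitting, key=sum).append(s)   # most nearly full bin that fits
        else:
            bins.append([s])
    return bins

def ffd(sizes, b): return first_fit(sorted(sizes, reverse=True), b)
def bfd(sizes, b): return best_fit(sorted(sizes, reverse=True), b)

sizes, b = [5, 6, 3, 7, 5, 4], 10
print(len(first_fit(sizes, b)), len(ffd(sizes, b)))   # 4 3
```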

(27)

For the example of b=10 and sizes 5, 6, 3, 7, 5, 4:

FF: four bins {5, 3}, {6, 4}, {7}, {5}.

BF: four bins {5, 3}, {6, 4}, {7}, {5}.

FFD, BFD: three bins {7, 3}, {6, 4}, {5, 5}.

(28)

Properties:

(P1) For FF or BF, V(I) ≤ (17/10)×V*(I) + 2;

(P2) For FFD or BFD, V(I) ≤ (11/9)×V*(I) + 4.

The proofs of (P1) and (P2), which are rather lengthy and complex, can be found below.

Johnson, Demers, Ullman, Garey, and Graham,

“Worst-Case Performance Bounds for Simple One-Dimensional Packing Algorithms,” SIAM Journal on Computing, vol. 3, no. 4, 1974, 299-325.

The upper bound of (P1) was further improved to (17/10)×V*(I) in the following paper.

Garey, Graham, Johnson, and Yao, “Resource Constrained Scheduling as Generalized Bin Packing,” Journal of Combinatorial Theory (A), vol. 21, 1976, 257-298.

(29)

♣ A simple proof of V(I) < 2×V*(I) (i.e., |V*(I) − V(I)| / V*(I) < 1) for FF

Assume U={u1, u2, …, un}, and let si be the size of ui, where 1≤i≤n.

When V(I)=1, we have V*(I)=1 (so V(I) < 2×V*(I)). We consider V(I)>1 below.

In an FF packing, any two bins together hold items of total size greater than b (otherwise the later bin would never have been opened). For even V(I),

s1 + s2 + … + sn > (V(I)/2)×b

⇒ V*(I) ≥ V(I)/2 + 1

⇒ V(I) ≤ 2×V*(I) − 2.

For odd V(I), pairing all bins but one gives s1 + s2 + … + sn > ((V(I)−1)/2)×b, so V*(I) ≥ (V(I)+1)/2, i.e., V(I) ≤ 2×V*(I) − 1.

(30)

Ex. Traveling Salesman Problem (TSP)

Instance: A set C of m cities and distances di,j > 0 for all pairs of cities i, j ∈ C.

Question: What is the length of a shortest tour that starts at any city, visits each of the other m−1 cities exactly once, and returns to the initial city?

When all di,j’s satisfy the triangle inequality, i.e., di,j ≤ di,k + dk,j for all i, j, k ∈ C, the metric TSP results.

The Euclidean TSP, where di,j is the Euclidean distance between i and j, is a special case of the metric TSP.

Both the metric TSP and the Euclidean TSP are NP-hard.

(31)

A heuristic with V(I) < 2×V*(I)

For example, consider an instance with six cities 1–6. [Figure lost in extraction.]

Step 1. Find an MST. [Figure of the MST lost.] Let d* be the total distance of it.

(32)

Step 2. Replace each edge of the tree with two opposite arcs. [Figure lost.]

Step 3. Construct an Euler circuit of total distance 2×d* on the graph above:

1→2→3→2→4→6→4→5→4→2→1

Step 4. Arbitrarily find a feasible tour from the circuit above by skipping already-visited cities:

1→2→3→4→6→5→1

The feasible tour has V(I) < 2×d* < 2×V*(I). (Note d* < V*(I), because deleting any edge of an optimal tour leaves a spanning tree, whose total distance is at least d*.)

(33)

The time complexity is dominated by Step 1, which takes O(m log m) time for the Euclidean TSP.
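The MST-doubling heuristic can be sketched in Python on a small Euclidean instance (the coordinates are made up; Prim's algorithm builds the MST, and a DFS preorder of the tree is the standard way to shortcut the Euler circuit into a feasible tour):

```python
# A sketch of the MST-doubling heuristic for the Euclidean TSP.
from math import dist

def mst_tour(points):
    """Returns a tour (list of indices) of length < 2 x the optimum."""
    m = len(points)
    in_tree, parent = {0}, {}
    cost = {v: dist(points[0], points[v]) for v in range(1, m)}
    nearest = {v: 0 for v in range(1, m)}
    while len(in_tree) < m:                      # Step 1: Prim's MST
        u = min((v for v in range(m) if v not in in_tree), key=cost.get)
        in_tree.add(u)
        parent[u] = nearest[u]
        for v in range(m):
            if v not in in_tree and dist(points[u], points[v]) < cost[v]:
                cost[v], nearest[v] = dist(points[u], points[v]), u
    children = {v: [] for v in range(m)}
    for v, p in parent.items():
        children[p].append(v)
    tour, stack = [], [0]       # Steps 2-4: DFS preorder = shortcut Euler circuit
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

pts = [(0, 0), (2, 0), (2, 1), (1, 2), (0, 2), (3, 3)]
tour = mst_tour(pts)
length = sum(dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]]) for i in range(len(tour)))
print(tour, round(length, 2))
```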

♣ A heuristic with V(I) < (3/2)×V*(I)

Consider the same example above.

Step 1. Find an MST. [Figure lost.] Let d* be the total distance of it.

(34)

Step 2. Find a maximum matching with minimum cost on the set of odd-degree vertices. [Figure lost.]

Step 3. Add the edges of the matching to the MST. (All the vertices are then of even degree.) [Figure lost.]

Step 4. Construct an Euler circuit on the graph above:

1→3→2→4→6→5→4→2→1

Step 5. Arbitrarily find a feasible tour from the circuit above:

1→3→2→4→6→5→1

(35)

The time complexity is dominated by Step 2, which takes O(m³) time for the Euclidean TSP (refer to Combinatorial Optimization: Networks and Matroids, by E. L. Lawler, 1976).
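Step 2 needs a minimum-cost perfect matching on the odd-degree vertices. For small instances it can be found by brute-force recursion, as sketched below (Lawler's O(m³) matching algorithm, cited above, replaces this in practice; names are illustrative):

```python
# Minimum-cost perfect matching on an even-size set of vertices,
# by matching the first vertex with each candidate and recursing.
from math import dist, inf

def min_cost_matching(odd, points):
    """odd: list of an even number of vertex indices. Returns (cost, pairs)."""
    if not odd:
        return 0.0, []
    a, rest = odd[0], odd[1:]
    best, best_pairs = inf, []
    for i, b in enumerate(rest):
        sub_cost, sub_pairs = min_cost_matching(rest[:i] + rest[i + 1:], points)
        c = dist(points[a], points[b]) + sub_cost
        if c < best:
            best, best_pairs = c, [(a, b)] + sub_pairs
    return best, best_pairs

# Four odd-degree vertices at the corners of a 2x2 square.
pts = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(min_cost_matching([0, 1, 2, 3], pts))   # (4.0, [(0, 1), (2, 3)])
```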

Suppose that there are 2k odd-degree vertices a1, a2, …, a2k in the MST.

(It is also assumed that these 2k vertices appear in some shortest tour in this sequence.)

M1 = {(a1, a2), (a3, a4), …, (a2k−1, a2k)}

M2 = {(a2, a3), (a4, a5), …, (a2k−2, a2k−1), (a2k, a1)}

(M1 and M2 are two maximum matchings.)

c1 (c2) : the cost of M1 (M2)

(36)

c* : the minimum cost of a maximum matching on {a1, a2, …, a2k}

d* < V*(I) (1)

c1 + c2 ≤ V*(I), by the triangle inequality, since a1, a2, …, a2k appear in a shortest tour in this sequence (2)

⇒ 2×c* ≤ c1 + c2 (3)

l : the total distance of the Euler circuit

⇒ l = c* + d* (4) and V(I) ≤ l (5)

(1), (2), (3), (4), (5) ⇒ V(I) < (3/2)×V*(I)

Exercise 11. Read Sec. 9-1 of the textbook.

(1) Illustrate the approximation algorithm by an example.

(2) Show V(I) ≤ 2×V*(I).

Exercise 12. Read Sec. 9-6 of the textbook.

(1) Illustrate the approximation algorithm by an example.

(2) Show V(I) ≤ 2×V*(I).

(37)

NP-Hardness of c-Approximation

Ex. TSP

Let Π be the problem of designing a c-approximate algorithm for TSP.

We show Hamiltonian Cycle ∝ Π below.

Hamiltonian Cycle

Instance: An undirected graph G=(V, E).

Question: Does G contain a Hamiltonian cycle, i.e., an ordering (v1, v2, …, v|V|) of the vertices of G such that (v1, v|V|) ∈ E and (vi, vi+1) ∈ E for all 1≤i<|V|?
