# Slides credited from Hsueh-I Lu & Hsu-Chun Hsiao

## Full text

(1)
(2)

(3)

“A value or quantity that is nearly but not exactly correct”

Approximation algorithms for optimization problems: the approximate solution is guaranteed to be close to the exact solution (i.e., the optimal value)

Cf. heuristic search: no guarantee

Note: we cannot approximate decision problems

[Figure: the exact answer surrounded by an error bound]

(4)

Most practical optimization problems are NP-hard

It is widely believed that P ≠ NP

Thus, polynomial-time algorithms are unlikely, and we must sacrifice either optimality, efficiency, or generality

Approximation algorithms sacrifice optimality and return near-optimal answers

How “near” is near-optimal?

(5)

ρ(n)-approximation algorithm

Approximation ratio ρ(n)

n: input size

C*: cost of an optimal solution

C: cost of the solution produced by the approximation algorithm

Maximization problem: C*/C ≤ ρ(n)

Minimization problem: C/C* ≤ ρ(n)

(6)

Smaller is better (ρ(n) = 1 indicates an exact algorithm)

Challenge: prove that C is close to C* without knowing C*
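The ratio can be computed with one formula that serves both maximization and minimization, max(C/C*, C*/C). A minimal sketch, assuming positive costs; the function name `approximation_ratio` is illustrative and not from the slides:

```python
def approximation_ratio(cost, optimal_cost):
    """Return max(C / C*, C* / C) for positive costs C and C*."""
    if cost <= 0 or optimal_cost <= 0:
        raise ValueError("costs must be positive")
    return max(cost / optimal_cost, optimal_cost / cost)


# Minimization: a solution of cost 4 when the optimum costs 3.
print(approximation_ratio(4, 3))
# Maximization: a solution of value 7 when the optimum is 8.
print(approximation_ratio(7, 8))
```

A value of 1 means the returned solution is optimal; in practice C* is unknown, which is why the analysis has to bound it indirectly.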

(7)

Textbook 35.1 – The vertex-cover problem

(8)

A vertex cover of G = (V, E) is a subset V’ ⊆ V s.t. if (w, v) ∈ E, then w ∈ V’ or v ∈ V’

A vertex cover “covers” every edge in G

Optimization problem: find a minimum size vertex cover in G

Decision problem: is there a vertex cover with size smaller than k

NP-complete

(9)

Idea: cover as many edges as possible (vertex with the maximum degree) at each stage and then delete the covered edges

[Figure: the greedy heuristic applied stage by stage to an example graph with vertices a to g]

(10)

The greedy heuristic cannot always find an optimal solution (otherwise P = NP would be proven)

There is no guarantee that C is always close to C* either
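The maximum-degree heuristic can be sketched as follows. This is an illustrative version, not code from the slides: the function name `greedy_degree_cover` is hypothetical, edges are given as pairs of sortable vertex labels, and ties are broken by label so that the run is deterministic.

```python
def greedy_degree_cover(edges):
    """Repeatedly pick a vertex of maximum remaining degree."""
    remaining = {frozenset(e) for e in edges}
    cover = set()
    while remaining:
        # Count how many uncovered edges touch each vertex.
        degree = {}
        for e in remaining:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # max() returns the first maximum, so ties go to the smallest label.
        best = max(sorted(degree), key=lambda v: degree[v])
        cover.add(best)
        # Delete the edges that the chosen vertex covers.
        remaining = {e for e in remaining if best not in e}
    return cover


path = [("a", "b"), ("b", "c"), ("c", "d")]
print(sorted(greedy_degree_cover(path)))
```

The output is always a feasible cover, but nothing in the procedure bounds its size relative to the optimum.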

(11)

APPROX-VERTEX-COVER

Randomly select one edge at a time

Remove all incident edges

Running time = O(V + E), using adjacency lists

    APPROX-VERTEX-COVER(G)
        C = Ø
        E’ = G.E
        while E’ ≠ Ø
            let (u, v) be an arbitrary edge of E’
            C = C ∪ {u, v}
            remove from E’ every edge incident on either u or v
        return C
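The pseudocode translates almost line for line into Python. In this minimal sketch, the "arbitrary edge" is simply the next edge in the input list whose endpoints are both still outside C; such an edge is exactly one that remains in E’. The function name and the example graph are assumptions chosen for illustration, not the graph from the slides.

```python
def approx_vertex_cover(edges):
    """Take both endpoints of every edge that is still uncovered."""
    cover = set()
    for u, v in edges:
        # An edge is still in E' exactly when neither endpoint is in C.
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Hypothetical example graph.
edges = [("b", "c"), ("a", "b"), ("c", "d"), ("d", "f"), ("d", "e")]
print(sorted(approx_vertex_cover(edges)))  # ['b', 'c', 'd', 'f']
```

Skipping covered edges during a single scan replaces the explicit removal step, so the loop takes time linear in the number of edges.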

(12)

[Figure: APPROX-VERTEX-COVER run step by step on the example graph with vertices a to g; the endpoints b, c, d, and f of the selected edges are added to the cover]

{b, c, d, f} is a vertex cover of size 4 found by the approximation algorithm (not optimal!)

(13)

Theorem. APPROX-VERTEX-COVER is a 2-approximation for the vertex cover problem.

3 things to check

Q1: Does it give a feasible solution?

A feasible solution for vertex cover is a node set that covers all the edges

Finding an optimal solution is hard, but finding a feasible one could be easy

Q2: Does it run in polynomial time?

An exponential-time algorithm does not qualify as an approximation algorithm

Q3: Does it give an approximate solution with approximation ratio ≤ 2?

Other names: 2-approximate solution, factor-2 approximation

(14)

Suppose that the algorithm runs for k iterations. Let C be the output of APPROX-VERTEX-COVER. Let OPT be any optimal vertex cover of G.

If k = 0, then |C| = |OPT| = 0.

If k > 0, then |C| = 2k. It suffices to ensure that |OPT| ≥ k.

Observe that the k edges (u, v) chosen by APPROX-VERTEX-COVER in those k iterations form a matching of G. Just for OPT (or any feasible solution) to cover this matching requires at least k nodes. Hence |C| = 2k ≤ 2|OPT|.

Prove that the approximation ratio is at most 2. That is, |C| ≤ 2|OPT|.

The proof doesn’t require knowing the actual value of C*!
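The chain k ≤ |OPT| ≤ |C| = 2k can be checked on a small instance. The sketch below records the chosen matching and compares it with a brute-force optimum; the helper names and the example graph are hypothetical, the graph is assumed to have no self-loops, and the brute force is exponential, so it is only meant for tiny inputs.

```python
from itertools import combinations


def matching_cover(edges):
    """Return (chosen edges, cover) as APPROX-VERTEX-COVER would build them."""
    chosen, cover = [], set()
    for u, v in edges:
        if u not in cover and v not in cover:
            chosen.append((u, v))
            cover.update((u, v))
    return chosen, cover


def optimal_cover_size(edges):
    """Brute force: the smallest k such that some k vertices cover every edge."""
    vertices = sorted({v for e in edges for v in e})
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            s = set(subset)
            if all(u in s or v in s for u, v in edges):
                return k
    return len(vertices)


edges = [("b", "c"), ("a", "b"), ("c", "d"), ("d", "f"), ("d", "e")]
chosen, cover = matching_cover(edges)
opt = optimal_cover_size(edges)
print(len(chosen), opt, len(cover))  # 2 2 4
```

Here the matching has k = 2 edges, the optimum {b, d} has size 2, and the returned cover has size 4 = 2k.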

(15)

Tight analysis: check whether we underestimate the quality of the approximate solution obtained by APPROX-VERTEX-COVER

This factor-2 approximation is still the best known approximation algorithm

Reducing to 1.99 is a significant result

Yes, it is tight!

(16)

C is a vertex cover of graph G=(V, E) iff V – C is an independent set of G

Q: Does a 2-approximation algorithm for vertex cover imply a 2-approximation for maximum independent set?

No. Example with 100 nodes:

Optimal vertex cover: 49 nodes, so the optimal independent set has 51 nodes

A 2-approximate vertex cover may have 98 nodes, so its complement is an independent set of only 2 nodes
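A short sketch of both points: the complement of a vertex cover is an independent set, and a factor-2 guarantee on the cover says little about the complement. The small graph and the helper names are hypothetical; the numbers 100, 49, and 98 are the ones from the example above.

```python
def is_vertex_cover(edges, cover):
    return all(u in cover or v in cover for u, v in edges)


def is_independent_set(edges, nodes):
    return not any(u in nodes and v in nodes for u, v in edges)


# Hypothetical small graph: {b, d} covers every edge.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("d", "f")]
vertices = {v for e in edges for v in e}
cover = {"b", "d"}
independent = vertices - cover
print(is_vertex_cover(edges, cover), is_independent_set(edges, independent))

# Sizes from the 100-node example.
n, opt_cover, approx_cover = 100, 49, 98
ratio = (n - opt_cover) / (n - approx_cover)
print(ratio)  # 25.5, far from 2
```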

(17)

Textbook 35.2 – The traveling-salesman problem

(18)

Optimization problem: Given a set of cities and their pairwise distances, find a tour of lowest cost that visits each city exactly once.

Inter-city distances satisfy the triangle inequality if for all vertices u, v, w: d(u, w) ≤ d(u, v) + d(v, w)

[Figure: two 4-city examples on vertices u, v, x, y, one with and one without the triangle inequality]

(19)

APPROX-TSP-TOUR

Grow an MST from a random root

MST-PRIM

For (n − 1) iterations, add the least-weighted edge incident to the current subtree that does not create a cycle

Running time = Θ(V²) with a simple implementation of MST-PRIM

    APPROX-TSP-TOUR(G)
        select a vertex r from G.V as a “root” vertex
        grow a minimum spanning tree T for G from root r using MST-PRIM(G, d, r)
        H = the list of vertices visited in a preorder tree walk of T
        return H
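A compact Python sketch of APPROX-TSP-TOUR, assuming the cities are points in the plane with Euclidean distances (which satisfy the triangle inequality) and that there is at least one city. The function names and the array-based Prim implementation are illustrative choices, not taken from the slides.

```python
import math


def approx_tsp_tour(points, root=0):
    """Return a tour as a list of point indices that ends back at root."""
    n = len(points)

    def d(i, j):
        return math.dist(points[i], points[j])

    # MST-PRIM on the complete graph, O(n^2) with plain arrays.
    in_tree = [False] * n
    best = [math.inf] * n          # cheapest edge linking a vertex to the tree
    parent = [None] * n
    children = [[] for _ in range(n)]
    best[root] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and d(u, v) < best[v]:
                best[v], parent[v] = d(u, v), u

    # Preorder walk of the tree, then close the cycle at the root.
    tour, stack = [], [root]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour + [root]


def tour_cost(points, tour):
    return sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))


square = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour = approx_tsp_tour(square)
print(tour, tour_cost(square, tour))
```

On the unit square the walk happens to return an optimal tour of cost 4; in general only the factor-2 guarantee holds.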

(20)

H = a, b, c, h, d, e, f, g, a

H* = a, b, c, h, f, g, e, d, a

(21)

Theorem. APPROX-TSP-TOUR is a 2-approximation for the TSP with the triangle inequality.

3 things to check

Q1: Does it give a feasible solution?

A feasible solution is a cycle of G visiting each city exactly once

The property of a complete graph is needed

Q2: Does it run in polynomial time?

Q3: Does it give an approximate solution with approximation ratio ≤ 2?

(22)

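Let d(X) denote the total distance of the edges in X.

With triangle inequality: d(H) ≤ 2 · d(T), because the full walk of T traverses every tree edge twice and shortcutting repeated vertices never increases the cost

Let H* denote an optimal tour, formed by some spanning tree plus an edge: d(T) ≤ d(H*)

Hence, d(H) ≤ 2 · d(T) ≤ 2 · d(H*)

Prove that the approximation ratio is at most 2. That is, d(H) ≤ 2 · d(H*).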

(23)

Theorem 35.3. If P ≠ NP, there is no polynomial-time approximation algorithm with a constant ratio bound ρ for the general TSP

Proof by contradiction

Suppose there is such an algorithm A with a constant ratio ρ. We will use A to solve HAM-CYCLE in polynomial time.

Algorithm for HAM-CYCLE

Convert G = (V, E) into an instance I of TSP with cities V (resulting in a complete graph G' = (V, E’)): d(u, v) = 1 if (u, v) ∈ E, and d(u, v) = ρ|V| + 1 otherwise

Run A on I

If the reported cost ≤ ρ|V|, then return “Yes” (i.e., G contains a tour that is an HC); otherwise return “No”

(24)

Analysis

If G has an HC: G’ contains a tour of cost |V| by picking edges in E, each of cost 1

If G does not have an HC: any tour of G’ must use some edge not in E, so its total cost is at least (ρ|V| + 1) + (|V| − 1) > ρ|V|

Algorithm A guarantees to return a tour of cost at most ρ times the optimal cost

A returns a cost ≤ ρ|V| if G contains an HC; A returns a cost > ρ|V| otherwise

HAM-CYCLE can be solved in polynomial time, a contradiction
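The reduction can be sketched in Python. Since no constant-ratio algorithm A is known (that is the point of the theorem), the sketch plugs in a brute-force exact solver as a stand-in for A; it trivially has ratio 1 ≤ ρ but takes exponential time. All function names are hypothetical, and the graph is assumed to have at least three vertices.

```python
from itertools import permutations


def ham_cycle_to_tsp(vertices, edges, rho):
    """Cost 1 for edges of G, rho * |V| + 1 for all other pairs."""
    edge_set = {frozenset(e) for e in edges}
    n = len(vertices)
    cost = {}
    for i, u in enumerate(vertices):
        for v in vertices[i + 1:]:
            pair = frozenset((u, v))
            cost[pair] = 1 if pair in edge_set else rho * n + 1
    return cost


def exact_tsp_cost(vertices, cost):
    """Stand-in for A: exhaustive search over all tours (exponential time)."""
    first, rest = vertices[0], vertices[1:]
    return min(
        sum(cost[frozenset((a, b))] for a, b in zip((first, *p), (*p, first)))
        for p in permutations(rest)
    )


def decide_ham_cycle(vertices, edges, rho, tsp_solver=exact_tsp_cost):
    cost = ham_cycle_to_tsp(vertices, edges, rho)
    return tsp_solver(vertices, cost) <= rho * len(vertices)


V = ["u", "v", "w", "x", "y"]
cycle = [("u", "y"), ("y", "v"), ("v", "w"), ("w", "x"), ("x", "u")]
path = cycle[:-1]  # dropping one edge destroys the Hamiltonian cycle
print(decide_ham_cycle(V, cycle, rho=2), decide_ham_cycle(V, path, rho=2))
```

With ρ = 2 and |V| = 5, the cycle graph yields a tour of cost 5 ≤ 10, while every tour for the path graph costs at least 4 + 11 = 15 > 10.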

[Figure: an example graph G and the complete graph G’ built from it]

u, y, v, w, x, u is a Hamiltonian cycle of G

u, y, v, w, x, u is a traveling-salesman tour of G’ with cost |V|

(25)

Show how in polynomial time we can transform one instance of the traveling-salesman problem into another instance whose cost function satisfies the triangle inequality. The two instances must have the same set of optimal tours. Explain why such a polynomial-time transformation does not contradict Theorem 35.3, assuming that P ≠ NP.

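[Figure: a 4-city instance on vertices u, v, x, y with edge costs 1 and 5, and the transformed instance whose edge costs are still to be determined]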

(26)

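For example, we can add dmax (the largest cost) to each edge

G contains a tour of minimum cost k ⇔ G’ contains a tour of minimum cost k + dmax · |V|

G’ satisfies the triangle inequality because for all vertices u, v, w: d’(u, w) = d(u, w) + dmax ≤ 2 · dmax ≤ d’(u, v) + d’(v, w), assuming nonnegative costs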
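The transformation and the triangle-inequality check fit in a few lines of Python. This is a sketch under assumptions: costs are nonnegative and stored in a dictionary keyed by unordered vertex pairs, the function names are hypothetical, and the placement of the two cost-5 edges in the example is a guess, since the figure only shows that the instance has costs 1 and 5.

```python
from itertools import permutations


def add_dmax(cost):
    """Add the largest edge cost to every edge."""
    d_max = max(cost.values())
    return {pair: c + d_max for pair, c in cost.items()}


def satisfies_triangle_inequality(vertices, cost):
    def d(a, b):
        return cost[frozenset((a, b))]

    return all(
        d(u, w) <= d(u, v) + d(v, w) for u, v, w in permutations(vertices, 3)
    )


V = ["u", "v", "x", "y"]
cost = {
    frozenset(("u", "v")): 5, frozenset(("x", "y")): 5,
    frozenset(("u", "x")): 1, frozenset(("u", "y")): 1,
    frozenset(("v", "x")): 1, frozenset(("v", "y")): 1,
}
new_cost = add_dmax(cost)
print(satisfies_triangle_inequality(V, cost))      # False: 5 > 1 + 1
print(satisfies_triangle_inequality(V, new_cost))  # True: 10 <= 6 + 6
```

Every tour uses exactly |V| edges, so each tour's cost grows by the same amount dmax · |V| and the set of optimal tours is unchanged.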

[Figure: TSP without triangle inequality (edge costs 1 and 5) transformed into TSP with triangle inequality (edge costs 1 + dmax and 5 + dmax), where dmax = 5]

(27)

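[Figure: the same example with dmax = 5; the transformed edge costs are 6 and 10, and an approximate tour is computed on the transformed instance]

This does not contradict Theorem 35.3: a ρ-approximate tour of the transformed instance costs at most ρ(C* + dmax · |V|) there, which only bounds its cost in the original instance by ρC* + (ρ − 1) · dmax · |V|, not by a constant multiple of C*.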

(28)

Textbook 35.3 – The set-covering problem

(29)

Optimization problem: Given k subsets {S1, S2, …, Sk} of {1, 2, …, n}, find an index subset C of {1, 2, …, k} with minimum |C| s.t. the union of Si over all i ∈ C is {1, 2, …, n}

Set cover is NP-complete.

1) It is in NP

2) It is NP-hard

(30)

GREEDY-SET-COVER

At each stage, pick the set S that covers the greatest number of remaining uncovered elements

Running time = ?

    GREEDY-SET-COVER(S)
        I = Ø
        C = Ø
        while C ≠ {1, 2, …, n}
            let i be an index maximizing |Si − C|
            I = I ∪ {i}
            C = C ∪ Si
        return I
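A direct Python rendering of GREEDY-SET-COVER, as a minimal sketch: subsets are Python sets over {1, …, n}, indices are 0-based, and ties go to the lowest index. The function name, the error handling for an uncoverable ground set, and the example are assumptions added for illustration.

```python
def greedy_set_cover(n, subsets):
    """Return the indices (0-based) of the subsets chosen by the greedy rule."""
    universe = set(range(1, n + 1))
    covered, chosen = set(), []
    while covered != universe:
        # Index of the subset covering the most still-uncovered elements.
        i = max(range(len(subsets)), key=lambda idx: len(subsets[idx] - covered))
        if not subsets[i] - covered:
            raise ValueError("the subsets do not cover {1, ..., n}")
        chosen.append(i)
        covered |= subsets[i]
    return chosen


subsets = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
print(greedy_set_cover(6, subsets))  # [0, 3]
```

Each iteration scans all k subsets, and there are at most min(n, k) iterations, so this straightforward version runs in polynomial time.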

(31)
(32)

Theorem. GREEDY-SET-COVER is an H(n)-approximation for the set cover problem, where H(n) = 1 + 1/2 + … + 1/n ≤ ln n + 1.

3 things to check

Q1: Does it give a feasible solution?

A feasible solution output is a collection of subsets whose union is the ground set {1, 2, …, n}.

Q2: Does it run in polynomial time?

Q3: Does it give an approximate solution with approximation ratio at most H(n) = 1 + 1/2 + … + 1/n?

(33)

Let I* denote an optimal set cover. We plan to prove that |I| ≤ |I*| · H(n), where H(n) = 1 + 1/2 + … + 1/n ≤ ln n + 1. That is, the approximation ratio is at most ln n + 1.

(34)

For brevity, we re-index those subsets s.t. for each i, Si is the i-th set selected by GREEDY-SET-COVER

Let Ci be the set C right before the elements of Si are inserted into C

If an element j is inserted into C in the i-th iteration, the price of j is 1/|Si − Ci|

The sum of the prices of all n integers is exactly |I|

(35)

[Figure: set-cover example with element prices 1/3, 1/8, and 1/1]

(36)

For brevity, we re-index the integers s.t. they are inserted into C according to the increasing order of these integers

When j is about to be put into C, there are at least n-j+1 uncovered numbers. I* is a collection of sets that can cover these n-j+1 numbers.

There is an index t ∈ I* s.t. St can cover at least (n − j + 1)/|I*| uncovered numbers

We have |Si − Ci| ≥ |St − Ci| ≥ (n − j + 1)/|I*|, where j is inserted into C in the i-th iteration.

The price of j is 1/|Si − Ci| ≤ |I*|/(n − j + 1)

(37)

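The sum of the prices of all n integers is exactly |I|

The price of j is at most |I*|/(n − j + 1)

Therefore, we can prove that |I| ≤ |I*| · (1/n + 1/(n − 1) + … + 1/1) = |I*| · H(n) ≤ |I*| · (ln n + 1)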
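The pricing argument can be traced numerically. The sketch below reruns the greedy rule while recording each element's price 1/|Si − Ci| as an exact fraction, then checks that the prices sum to |I| and that |I| stays below |I*| · H(n). The function names and the example instance are hypothetical, and the input is assumed to cover {1, …, n}.

```python
from fractions import Fraction


def greedy_prices(n, subsets):
    """Return (chosen indices, price of each element) for the greedy run."""
    universe = set(range(1, n + 1))
    covered, chosen, price = set(), [], {}
    while covered != universe:
        i = max(range(len(subsets)), key=lambda idx: len(subsets[idx] - covered))
        new = subsets[i] - covered
        if not new:
            raise ValueError("the subsets do not cover {1, ..., n}")
        for j in new:
            price[j] = Fraction(1, len(new))  # newly covered elements share cost 1
        chosen.append(i)
        covered |= new
    return chosen, price


def harmonic(n):
    return sum(Fraction(1, j) for j in range(1, n + 1))


subsets = [{1, 2, 3}, {4, 5, 6}, {1, 2, 4, 5}]
chosen, price = greedy_prices(6, subsets)
optimal_size = 2  # {1, 2, 3} and {4, 5, 6} cover everything
print(chosen, sum(price.values()), optimal_size * harmonic(6))
```

Greedy picks three sets here while two suffice, and the total price 3 is within the bound 2 · H(6) = 4.9.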

(38)

Textbook 35.4 – Randomization and linear programming

(39)

Randomized algorithm’s behavior is determined not only by its input but also by values produced by a random-number generator

| | Exact | Approximate |
| --- | --- | --- |
| Deterministic | MST | APPROX-TSP-TOUR |
| Randomized | Quick Sort | MAX-3-CNF-SAT |

(40)

Decision problem: Satisfiability of Boolean formulas in 3-conjunctive normal form (3-CNF)

3-CNF = AND of clauses, each of which is the OR of exactly 3 distinct literals

A literal is an occurrence of a variable or its negation, e.g., x1 or ¬x1

→ satisfiable

What is the optimization version of 3-CNF-SAT?

(41)

Optimization problem: find an assignment of the variables that satisfies as many clauses as possible

Closeness to optimum is measured by the fraction of satisfied clauses

satisfies 3 clauses satisfies 2 clauses

This clause is always satisfied.

For simplicity, we assume that no clause contains both a literal and its negation.

(42)

Randomly set each variable to be 0 or 1 (flip a coin)

Then…

End

Theorem 35.6. Given an instance of MAX-3-CNF-SAT with n variables x1, x2, …, xn and m clauses, the randomized algorithm that independently sets each variable to 1 with probability 1/2 and to 0 with probability 1/2 is a randomized 8/7-approximation algorithm

(43)

Proof

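Each clause is the OR of exactly 3 distinct literals, so under a uniformly random assignment it is unsatisfied with probability (1/2)³ = 1/8 and satisfied with probability 7/8

By linearity of expectation, the expected number of satisfied clauses is 7m/8. Since at most m clauses can be satisfied, the approximation ratio is at most m/(7m/8) = 8/7

(satisfying 7/8 of the clauses in expectation)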
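The expectation can be verified on a small formula by averaging over all assignments. In this sketch a clause is a tuple of nonzero integers, where k stands for the literal xk and −k for its negation; this encoding, the function names, and the example formula are assumptions made for illustration.

```python
import random
from itertools import product


def count_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def random_assignment(num_vars, rng=random):
    """Set each variable to True or False with probability 1/2 each."""
    return {i: rng.random() < 0.5 for i in range(1, num_vars + 1)}


def expected_satisfied(clauses, num_vars):
    """Exact expectation: average over all 2^n assignments."""
    total = 0
    for bits in product([False, True], repeat=num_vars):
        total += count_satisfied(clauses, dict(enumerate(bits, start=1)))
    return total / 2 ** num_vars


clauses = [(1, -2, 3), (-1, 2, 4), (2, -3, -4)]
print(count_satisfied(clauses, random_assignment(4)))
print(expected_satisfied(clauses, 4))  # 2.625 = 7/8 * 3
```

A single random run may satisfy fewer clauses than the average, so the 8/7 guarantee is a statement about the expected value.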

(44)

Most practical optimization problems are NP-hard

It is widely believed that P ≠ NP

Thus, polynomial-time algorithms are unlikely, and we must sacrifice either optimality, efficiency, or generality

Approximation algorithms sacrifice optimality and return near-optimal answers

Maximization problem: C*/C ≤ ρ(n)

Minimization problem: C/C* ≤ ρ(n)

(45)

Course Website: http://ada.miulab.tw Email: ada-ta@csie.ntu.edu.tw

Important announcements will be sent to your @ntu.edu.tw mailbox and posted to the course website
