NP-Completeness: Concepts
• Why Study NP-Completeness?
♣ Pursuing your Ph.D.
♣ Keeping your job
Before studying NP-completeness:
“I can’t find an efficient algorithm, I guess I’m just too dumb.”
After studying NP-completeness:
“I can’t find an efficient algorithm, because no such algorithm is possible!”
“I can’t find an efficient algorithm, but neither can all these famous people.”
• Measure of Time Complexity
l: the measure used for the time complexity of an algorithm
In the discussion of NP-completeness, l is the input size, i.e., the total length of all inputs to the algorithm.
Two assumptions:
(1) all inputs are integers (a rational number can be represented by a pair of integers);
(2) each integer has a binary representation.
Ex. Sorting a1, a2, …, an.
$l = \sum_{i=1}^{n} (\log_2 a_i + 1)$.
Ex. Consider the following procedure.
input(n);
s ← 0;
for i ← 1 to n do s ← s+i;
output(s).
l = log2 n + 1.
The procedure takes O(n) = O(2^l) time.
⇒ an exponential-time algorithm!
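The same point can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function name `sum_to_n` is illustrative, and the input length l is taken to be Python's `int.bit_length()`, which is an assumption about how the binary representation is counted.

```python
def sum_to_n(n):
    """Run the loop s <- s + i for i = 1..n and count its iterations."""
    s = 0
    iterations = 0
    for i in range(1, n + 1):
        s += i
        iterations += 1
    return s, iterations


# The iteration count grows like 2^l, where l is the number of input bits.
for n in (15, 255, 4095):
    l = n.bit_length()  # assumed measure of the input length
    s, steps = sum_to_n(n)
    print(f"n={n}  l={l} bits  iterations={steps}  2^l={2 ** l}")
```

Adding four bits to the input multiplies the number of iterations by about sixteen, which is the exponential dependence on l stated above.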
• Polynomial-Time Algorithms vs. Exponential-Time Algorithms
Suppose that your computer takes 1 second to perform 10^6 operations.
The following is the time requirement for your computer to perform f(n) operations, where f(n) = n, n^2, n^3, n^5, 2^n, 3^n and n = 10, 20, 30, 40, 50, 60.
The following shows the largest value of n such that f(n) operations can be performed in 1 hour on a faster computer.
An algorithm is referred to as a polynomial-time algorithm, if its time complexity can be bounded above by a polynomial function of input size.
An algorithm is referred to as an exponential-time algorithm, if its time complexity cannot be thus bounded (even if the function is not normally regarded as an exponential one, like n^(log n)).
Usually, a problem is referred to as tractable if it can be solved with a polynomial-time algorithm, and intractable otherwise.
The two tables above give us a reason why polynomial-time algorithms are much more desirable than exponential-time algorithms.
They also motivate us to study the theory of NP-completeness.
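The running times discussed in this section can be recomputed with a short script. The sketch below assumes the stated speed of 10^6 operations per second. The helper names (`seconds_needed`, `largest_n_within`) and the `speedup` parameter are illustrative and not taken from the notes.

```python
OPS_PER_SECOND = 10**6  # machine speed stated in the text

FUNCTIONS = {
    "n": lambda n: n,
    "n^2": lambda n: n**2,
    "n^3": lambda n: n**3,
    "n^5": lambda n: n**5,
    "2^n": lambda n: 2**n,
    "3^n": lambda n: 3**n,
}


def seconds_needed(name, n):
    """Time in seconds to perform f(n) operations."""
    return FUNCTIONS[name](n) / OPS_PER_SECOND


def largest_n_within(name, seconds, speedup=1):
    """Largest n with f(n) operations finishing within the time budget.

    `speedup` (hypothetical parameter) models a machine that many times faster.
    """
    f = FUNCTIONS[name]
    budget = seconds * OPS_PER_SECOND * speedup
    hi = 1
    while f(hi) <= budget:  # double until the budget is exceeded
        hi *= 2
    lo = hi // 2
    while hi - lo > 1:  # binary search between lo (fits) and hi (does not fit)
        mid = (lo + hi) // 2
        if f(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


for name in FUNCTIONS:
    row = [f"{seconds_needed(name, n):.3g}s" for n in (10, 20, 30, 40, 50, 60)]
    print(f"{name:4s}", row, "largest n in 1 hour:", largest_n_within(name, 3600))
```

Under these assumptions, a thousandfold faster machine multiplies the feasible n for f(n) = n^2 by about 31.6, but only adds about 10 to the feasible n for f(n) = 2^n.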
• Maximal vs. Maximum
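A maximal clique cannot be enlarged by adding another vertex; a maximum clique is a clique of the largest possible size.
Ex.
maximal cliques: {1, 2, 3}, {2, 3, 4, 5}, {4, 6}
maximum clique: {2, 3, 4, 5}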
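The distinction can be checked by brute force on a small graph. The edge set below is an assumption: it is one graph on vertices 1 to 6 whose cliques match the example, not necessarily the graph drawn in the original figure. The function names are illustrative.

```python
from itertools import combinations

# Hypothetical graph chosen so that its cliques match the example above.
EDGES = {(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5), (4, 6)}
VERTICES = range(1, 7)


def adjacent(u, v):
    return (u, v) in EDGES or (v, u) in EDGES


def is_clique(vertices):
    return all(adjacent(u, v) for u, v in combinations(vertices, 2))


def maximal_cliques():
    """Cliques that are not properly contained in any other clique."""
    cliques = [set(c) for r in range(1, 7)
               for c in combinations(VERTICES, r) if is_clique(c)]
    return [c for c in cliques if not any(c < other for other in cliques)]


def maximum_cliques():
    """Maximal cliques of the largest size."""
    maximal = maximal_cliques()
    best = max(len(c) for c in maximal)
    return [c for c in maximal if len(c) == best]


print(maximal_cliques())
print(maximum_cliques())
```

Every maximum clique is maximal, but not the other way round: {4, 6} is maximal even though it has only two vertices.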
• Decision Problems vs. Optimization Problems
A decision problem asks for an answer of “yes” or “no”.
An optimization problem asks for an optimal value (a maximum or a minimum).
Ex. The maximum clique problem can be expressed as a decision problem as follows.
Instance: An undirected graph G=(V, E) and a positive integer k ≤ |V|.
Question: Does G contain a clique of size ≥ k?
It can also be expressed as an optimization problem as follows.
Instance: An undirected graph G=(V, E).
Question: What is the size of a maximum clique of G?
Ex. The traveling salesman problem can be expressed as a decision problem as follows.
Instance: A set C of m cities, distances d(i, j) > 0 for all pairs of cities i, j ∈ C, and a positive integer k.
Question: Is there a tour of length ≤ k that starts at some city, visits each of the other m−1 cities exactly once, and returns to the initial city?
It can also be expressed as an optimization problem as follows.
Instance: A set C of m cities and distances d(i, j) > 0 for all pairs of cities i, j ∈ C.
Question: What is the length of a shortest tour that starts at some city, visits each of the other m−1 cities exactly once, and returns to the initial city?
Ex. The problem of sorting a1, a2, …, an can be expressed as a decision problem as follows.
Instance: Integers a1, a2, …, an and a positive integer k.
Question: Is there a permutation of a1, a2, …, an, denoted by a’1, a’2, …, a’n, such that
|a’2 − a’1| + |a’3 − a’2| + … + |a’n − a’n−1| ≤ k?
An optimization problem is at least as “hard” as its corresponding decision problem, because the decision answer follows directly from the optimal value.
Since NP-completeness concerns whether or not a problem can be solved in polynomial time, the discussion of NP-completeness considers only decision problems.
(If a decision problem is not polynomial-time solvable, then its corresponding optimization problem is not polynomial-time solvable either.)
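For the clique problem the two versions are closely tied, which the following sketch illustrates. It is a brute-force illustration only (exponential time), the function names are hypothetical, and the small graph is an assumed example rather than one from the notes.

```python
from itertools import combinations


def has_clique(vertices, edges, k):
    """Decision version: does the graph contain a clique of size >= k?

    Checking subsets of exactly k vertices suffices, because any larger
    clique contains a clique of size k.
    """
    present = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in present for pair in combinations(subset, 2))
        for subset in combinations(vertices, k)
    )


def max_clique_size(vertices, edges):
    """Optimization version, answered with at most |V| decision queries."""
    size = 0
    for k in range(1, len(vertices) + 1):
        if not has_clique(vertices, edges, k):
            break
        size = k
    return size


# Assumed example graph (not from the notes).
V = [1, 2, 3, 4, 5, 6]
E = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5), (4, 6)]
print(max_clique_size(V, E))
```

Going the other way is a single comparison: the decision answer for k is "yes" exactly when the optimal size is at least k.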
• Problem Reduction
A problem P1 reduces to another problem P2, denoted by P1 ∝ P2, if any instance of P1 can be transformed into an instance of P2 such that the solution for P1 can be obtained from the solution for P2.
T∝: the reduction time, i.e., the time required to transform the instance of P1 into an instance of P2.
T: the time required to obtain the solution for P1 from the solution for P2.
Since NP-completeness concerns whether or not a problem can be solved in polynomial time, we consider only reductions in which both T∝ and T are polynomial.
(Thus, P2 ∈ P ⇒ P1 ∈ P or, equivalently, P1 ∉ P ⇒ P2 ∉ P.)
If P1 ∝ P2 and P2 ∝ P3, then P1 ∝ P3.
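As an illustration of a reduction, the sketch below answers Vertex Cover by transforming the instance into a Clique instance on the complement graph: G has a vertex cover of size ≤ k exactly when the complement of G has a clique of size ≥ |V| − k. This is a standard textbook transformation and not necessarily the one used in the references. The function names are hypothetical, and the clique solver is plain brute force, so only the transformation itself is polynomial.

```python
from itertools import combinations


def complement(vertices, edges):
    """Edges of the complement graph (the polynomial-time transformation)."""
    present = {frozenset(e) for e in edges}
    return [p for p in combinations(vertices, 2) if frozenset(p) not in present]


def has_clique(vertices, edges, k):
    """Brute-force Clique solver standing in for 'a solution for P2'."""
    present = {frozenset(e) for e in edges}
    return any(all(frozenset(p) in present for p in combinations(s, 2))
               for s in combinations(vertices, k))


def has_vertex_cover_via_clique(vertices, edges, k):
    """Vertex Cover (size <= k) answered through Clique (size >= |V| - k)."""
    target = len(vertices) - k
    if target <= 0:
        return True  # k >= |V|: the whole vertex set is a cover
    return has_clique(vertices, complement(vertices, edges), target)


def has_vertex_cover_direct(vertices, edges, k):
    """Direct brute-force check, used only to cross-check the reduction."""
    return any(all(u in s or v in s for u, v in edges)
               for r in range(k + 1) for s in combinations(vertices, r))


# Assumed example: the path 1 - 2 - 3 - 4.
V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4)]
print([has_vertex_cover_via_clique(V, E, k) for k in range(5)])
```

Here T∝ is the cost of building the complement graph, and T is constant because the yes/no answer is passed through unchanged.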
• P, NP, and NP-Complete
Three classes of decision problems: P, NP, and NP-complete.
P: the set of decision problems that can be solved in polynomial time by deterministic algorithms.
NP: the set of decision problems that can be solved in polynomial time by non-deterministic algorithms.
Any non-deterministic algorithm consists of two phases: guessing and checking.
For the maximum clique problem, the guessing phase returns a clique, and the checking phase decides whether or not the clique size is greater than or equal to k.
For the traveling salesman problem, the guessing phase returns a tour, and the checking phase decides whether or not the tour length is less than or equal to k.
A decision problem has an AFFIRMATIVE answer ⇔ the guessing is SUCCESSFUL.
Notice that non-deterministic algorithms are imaginary. A more detailed description of non-deterministic algorithms and more illustrative examples can be found in Ref. (2).
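Only the checking phase can be run on a real machine. The sketch below shows what polynomial-time checkers for the two examples might look like. The function names, the edge-list graph format, and the distance-matrix format are assumptions made for illustration.

```python
from itertools import combinations


def check_clique(edges, guess, k):
    """Checking phase for Clique: is `guess` a clique with at least k vertices?"""
    present = {frozenset(e) for e in edges}
    chosen = set(guess)
    return len(chosen) >= k and all(
        frozenset(pair) in present for pair in combinations(chosen, 2))


def check_tour(dist, guess, k):
    """Checking phase for TSP: does `guess` visit every city once with length <= k?

    `dist` is an m x m distance matrix and `guess` an ordering of cities 0..m-1.
    """
    m = len(dist)
    if sorted(guess) != list(range(m)):
        return False  # not a permutation of the cities
    length = sum(dist[guess[i]][guess[(i + 1) % m]] for i in range(m))
    return length <= k


# Assumed example data.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
dist = [[0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0]]
print(check_clique(edges, [1, 2, 3], 3), check_tour(dist, [0, 1, 2, 3], 4))
```

Both checkers run in time polynomial in the instance size. The difficulty lies entirely in producing a successful guess.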
Every decision problem in P is also in NP, i.e., P ⊆ NP.
An NP problem is NP-complete if every NP problem can be reduced to it in polynomial time.
⇒ If any NP-complete problem can be solved in polynomial time, then every NP problem can be solved in polynomial time (i.e., P = NP).
(Intuitively, NP-complete problems are the “hardest” problems in NP.)
Whether P ≠ NP or P = NP is one of the most famous open problems in computer science.
When P ≠ NP, P and NP-complete are disjoint subsets of NP.
(There exist problems in NP that are neither in P nor NP-complete; see Chap. 7 in Ref. (1).)
When P = NP, P = NP = NP-complete.
Most researchers believe that P ≠ NP.
A problem is NP-hard if an NP-complete problem can be reduced to it in polynomial time.
(Equivalently, a problem is NP-hard if every NP problem can be reduced to it in polynomial time.)
⇒ If any NP-hard problem can be solved in polynomial time, then all NP-complete problems can be solved in polynomial time.
(Intuitively, NP-hard problems are at least as “hard” as NP-complete problems.)
[Figure: NP-complete is the intersection of NP and NP-hard.]
The class of NP-hard problems contains both decision problems and optimization problems.
If an NP-hard problem is in NP, then it is an NP-complete problem.
(Intuitively, NP-complete problems are an “easier” subclass of NP-hard problems.)
The corresponding optimization problems of NP-complete problems are NP-hard.
The well-known halting problem (a decision problem), which is to determine whether or not an algorithm will terminate on a given input, is NP-hard but not NP-complete: it is undecidable, so it is not in NP.
• Pseudo-Polynomial Time Algorithms
Ex. Given a set S={a1, a2, …, an} of integers and an integer M>0, the sum-of-subset problem is to determine whether or not there exists a subset of S whose sum is equal to M.
This problem can be solved in O(nM) time by dynamic programming as follows.
Let t(i, j)=true, if there exists a subset of {a1, a2, …, ai} whose sum is equal to j, and false otherwise.
Then,
t(i, j) = t(i−1, j) ∨ t(i−1, j−ai), where i>1 (the second term is taken as false when j<ai).
Initially, t(1, j)=true, if j=0 or j=a1, and false otherwise.
The answer is t(n, M).
Although the time complexity is exponential with respect to the input length of M (about log2 M bits), the problem is considered polynomial-time solvable if M is bounded by a polynomial in n.
An algorithm like this is usually referred to as a pseudo-polynomial time algorithm.
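The dynamic program can be sketched in a few lines of Python. This version assumes that all ai are positive integers and keeps only one row of the table t, which is a space-saving variant rather than the exact formulation above. The function name `subset_sum` is illustrative.

```python
def subset_sum(a, M):
    """Return True if some subset of the positive integers in `a` sums to M.

    Runs in O(n * M) time: pseudo-polynomial, since M takes only about
    log2(M) bits to write down.
    """
    # reachable[j] plays the role of t(i, j) for the elements processed so far.
    reachable = [False] * (M + 1)
    reachable[0] = True  # the empty subset has sum 0
    for x in a:
        # Go downward so that each element is used at most once.
        for j in range(M, x - 1, -1):
            if reachable[j - x]:
                reachable[j] = True
    return reachable[M]


print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # 4 + 5 = 9
print(subset_sum([3, 34, 4, 12, 5, 2], 30))
```

Doubling the number of bits of M squares the table size, which is why the running time is not polynomial in the input length.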
If an NP-complete problem is NP-complete in the strong sense, then there exists no pseudo-polynomial time algorithm for solving it (unless P=NP).
Intuitively, NP-complete problems in the strong sense are “harder” NP-complete problems (refer to Ref. (1)).
• The Satisfiability Problem and Cook’s Theorem
The satisfiability problem, which was the first problem shown to be NP-complete, is defined as follows.
Instance: A set U of Boolean variables and a collection C of clauses over U.
Question: Is there a truth assignment for U that satisfies every clause in C?
Ex. When U={x1, x2, x3} and C={x1 ∨ x2 ∨ x3, ¬x1, ¬x2}, the assignment of U: x1 ← F, x2 ← F and x3 ← T, can satisfy C (i.e., (x1 ∨ x2 ∨ x3) ∧ (¬x1) ∧ (¬x2) = T).
Ex. When U={x1, x2} and C={x1 ∨ x2, x1 ∨ ¬x2, ¬x1 ∨ x2, ¬x1 ∨ ¬x2}, no assignment of U can satisfy C.
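Small instances like these can be checked by trying all 2^|U| assignments. In the sketch below the integer encoding of literals (i for xi, −i for its negation) is an assumed convention, and the two clause sets are the examples as read here: {x1 ∨ x2 ∨ x3, ¬x1, ¬x2} and the four clauses over x1, x2 with every combination of signs.

```python
from itertools import product


def satisfying_assignment(num_vars, clauses):
    """Return a tuple of truth values satisfying every clause, or None.

    A clause is a list of nonzero integers: i means x_i, -i means NOT x_i.
    The search is exhaustive, so it takes exponential time in num_vars.
    """
    for values in product([False, True], repeat=num_vars):
        if all(any(values[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return values
    return None


example_1 = [[1, 2, 3], [-1], [-2]]
example_2 = [[1, 2], [1, -2], [-1, 2], [-1, -2]]
print(satisfying_assignment(3, example_1))
print(satisfying_assignment(2, example_2))
```

Verifying one given assignment takes time linear in the total clause length. Only the search over all assignments is expensive.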
Cook’s Theorem: The satisfiability problem is NP-complete.
The proof of Cook’s Theorem, which is rather lengthy and complex, can be found in Ref.(1) and Ref.(2).
There is an informal proof of Cook’s Theorem in the textbook.
• Six Basic NP-Complete Problems
(P1) 3-Satisfiability
Instance: A set U of variables and a collection C={c1, c2, …, cm} of clauses over U, where each clause of C contains three literals.
Question: Is there a satisfying truth assignment for C?
Ex. When U = {x1, x2, x3} and C = {x1 ∨ x2 ∨ x3, x1 ∨ ¬x2 ∨ x3}, the assignment of U: x1 ← T, x2 ← F and x3 ← F, can satisfy C.
Ex. When U = {x1, x2, x3} and C = {x1 ∨ x2 ∨ x3, ¬x1 ∨ x2 ∨ x3, x1 ∨ ¬x2 ∨ x3, x1 ∨ x2 ∨ ¬x3, ¬x1 ∨ ¬x2 ∨ x3, ¬x1 ∨ x2 ∨ ¬x3, x1 ∨ ¬x2 ∨ ¬x3, ¬x1 ∨ ¬x2 ∨ ¬x3}, no assignment of U can satisfy C.
(P2) Vertex Cover
Instance: An undirected graph G=(V, E) and a positive integer k ≤ |V|.
Question: Does G contain a vertex cover of size at most k, i.e., a subset V’ ⊆ V such that |V’| ≤ k and for each (u, v) ∈ E, at least one of u and v belongs to V’?
Ex.
|V’| = 4, 5 ⇒ V’ is a vertex cover;
|V’| = 3: {1, 2, 3}, {1, 3, 4}, {1, 3, 5}, {2, 3, 4}, and {2, 3, 5} are vertex covers;
|V’| < 3 ⇒ V’ is not a vertex cover.
(P3) 3-Dimensional Matching
Instance: A set M ⊆ W × X × Y, where W, X and Y are three disjoint sets, each having q elements.
Question: Does M contain a matching, i.e., a subset M’ ⊆ M such that each element of W, X and Y appears in M’ exactly once (|M’| = q)?
Ex. Suppose W={a, b}, X={c, d}, and Y={e, f}.
If M={(a, c, f), (b, d, e), (a, d, f)}, then M contains a matching M’={(a, c, f), (b, d, e)}.
If M={(a, c, f), (b, c, e), (b, d, f)}, then M does not contain a matching.
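The two instances above are small enough to check exhaustively. The following sketch tries every choice of q triples. The function name `find_matching` is illustrative, and the approach is brute force rather than an efficient algorithm.

```python
from itertools import combinations


def find_matching(M, q):
    """Return q triples of M that use each element of W, X and Y exactly once, or None."""
    for candidate in combinations(M, q):
        # q triples form a matching iff each coordinate shows q distinct elements.
        if all(len({triple[pos] for triple in candidate}) == q for pos in range(3)):
            return list(candidate)
    return None


M1 = [("a", "c", "f"), ("b", "d", "e"), ("a", "d", "f")]
M2 = [("a", "c", "f"), ("b", "c", "e"), ("b", "d", "f")]
print(find_matching(M1, 2))
print(find_matching(M2, 2))
```

In M2 every pair of triples shares an element (c, f, or b), so no two of them can be chosen together.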
(P4) Clique
Instance: An undirected graph G=(V, E) and a positive integer k ≤ |V|.
Question: Does G contain a clique of size at least k, i.e., a subset V’ ⊆ V such that |V’| ≥ k and every two vertices of V’ are adjacent in G?
Ex.
|V’| = 4, 5 ⇒ V’ is not a clique;
|V’| = 3: {1, 2, 3} is a clique;
|V’| = 2: {1, 2}, {1, 3}, {2, 3}, {3, 4} and {3, 5} are cliques.
(P5) Hamiltonian Cycle
Instance: An undirected graph G=(V, E).
Question: Does G contain a Hamiltonian cycle, i.e., an ordering (v1, v2, …, v|V|) of the vertices of G such that (v1, v|V|) ∈ E and (vi, vi+1) ∈ E for all 1 ≤ i < |V|?
Ex.
The left graph has a Hamiltonian cycle, but the right graph does not.
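A brute-force test for Hamiltonian cycles can be sketched as follows. The two graphs used here (a 4-cycle and a star) are assumed examples, not the graphs of the original figure, and the function name is hypothetical.

```python
from itertools import permutations


def hamiltonian_cycle(vertices, edges):
    """Return a vertex ordering forming a Hamiltonian cycle, or None.

    Tries all (|V| - 1)! orderings that start at the first vertex.
    """
    if len(vertices) < 3:
        return None
    present = {frozenset(e) for e in edges}
    first, rest = vertices[0], vertices[1:]
    for order in permutations(rest):
        cycle = (first,) + order
        n = len(cycle)
        if all(frozenset((cycle[i], cycle[(i + 1) % n])) in present for i in range(n)):
            return cycle
    return None


square = [(1, 2), (2, 3), (3, 4), (4, 1)]   # a 4-cycle
star = [(1, 2), (1, 3), (1, 4)]             # center 1 with three leaves
print(hamiltonian_cycle([1, 2, 3, 4], square))
print(hamiltonian_cycle([1, 2, 3, 4], star))
```

The star has no Hamiltonian cycle because each leaf has degree 1, while every vertex on a cycle needs two incident cycle edges.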
(P6) Partition
Instance: A multiset A={a1, a2, …, a|A|} of positive integers.
Question: Does there exist A’ ⊆ A such that $\sum_{a_i \in A'} a_i = \sum_{a_j \in A - A'} a_j$?
Ex. The multiset {2, 2, 4, 4, 8} can be divided into {2, 4, 4} and {2, 8} whose sums are equal.
On the other hand, {2, 2, 4, 4, 7} cannot be divided similarly.
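Both examples can be verified with a short reachable-sums computation. This is a sketch under the assumption that all elements are positive integers. The function name `can_partition` is illustrative, and the method is the pseudo-polynomial subset-sum idea applied to half of the total.

```python
def can_partition(a):
    """Return True if the multiset `a` splits into two parts with equal sums."""
    total = sum(a)
    if total % 2:
        return False  # an odd total can never be split evenly
    half = total // 2
    reachable = {0}  # subset sums reachable so far, capped at `half`
    for x in a:
        reachable |= {s + x for s in reachable if s + x <= half}
    return half in reachable


print(can_partition([2, 2, 4, 4, 8]))
print(can_partition([2, 2, 4, 4, 7]))
```

The second multiset has total 19, so the parity test alone already rules out an even split.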
The six problems above were shown to be NP-complete in Ref. (1) by the following reductions, where each “→” represents a reduction “∝” (for example, Vertex Cover ∝ Clique).
Satisfiability → 3-Satisfiability
3-Satisfiability → 3-Dimensional Matching → Partition
3-Satisfiability → Vertex Cover
Vertex Cover → Clique
Vertex Cover → Hamiltonian Cycle
It is also possible to prove these problems (and others) NP-complete in a different way, i.e., by reducing from different known NP-complete problems.
A list of NP-complete problems can be found in the Appendix of Ref. (1).
• Two-Sided Analysis of Problems
If some restrictions are imposed on a problem Π, then a restricted subproblem Π’ of Π results.
Suppose Π, Π’ ∈ NP and P ≠ NP.
Π’ is NP-complete ⇒ Π is NP-complete.
Π is NP-complete ⇒ Π’ is in P or NP-complete or neither.
[Figure: a hierarchy of subproblems of Π, where “→” means “a subproblem of”.]
The frontier between the subproblems known to be in P and those known to be NP-complete is narrowed down whenever an open subproblem is shown to be in P or NP-complete.
Ex. Let d be the maximum vertex degree in G.
Both Vertex Cover and Hamiltonian Cycle are in P if d ≤ 2, and NP-complete if d ≥ 3.
Ex. Graph 3-Colorability
Instance: An undirected graph G=(V, E).
Question: Is G 3-colorable, i.e., does there exist a function f: V → {1, 2, 3} such that f(u) ≠ f(v) for all edges (u, v) ∈ E?
Graph 3-Colorability is in P if d ≤ 3, and NP-complete if d ≥ 4 or G is planar.
Ex. Precedence Constrained Scheduling
Instance: A set T of “tasks”, each of “length” 1, a partial order ≺ on T, a “deadline” d, and m “processors”.
Question: Is there a “schedule” f: T → {0, 1, …, d} such that f(t) < f(t’) if t ≺ t’, and for each i ∈ {0, 1, …, d}, |{t ∈ T: f(t) = i}| ≤ m?
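The conditions in the question translate directly into a checker for a proposed schedule. In the sketch below the data format is an assumption (a dictionary from task to time slot, and a list of pairs (t, t2) meaning that t must precede t2), and the function name is hypothetical.

```python
from collections import Counter


def is_valid_schedule(f, precedes, d, m):
    """Check a schedule f against the deadline, precedence and processor limits.

    f: dict mapping each task to a slot in {0, ..., d}
    precedes: list of (t, t2) pairs, meaning t must run strictly before t2
    m: number of processors, i.e., the maximum number of tasks per slot
    """
    if any(not 0 <= slot <= d for slot in f.values()):
        return False
    if any(f[t] >= f[t2] for t, t2 in precedes):
        return False
    return all(count <= m for count in Counter(f.values()).values())


precedes = [("a", "c")]
print(is_valid_schedule({"a": 0, "b": 0, "c": 1}, precedes, d=1, m=2))
print(is_valid_schedule({"a": 0, "b": 0, "c": 1}, precedes, d=1, m=1))
```

Since a proposed schedule can be checked this quickly, the problem is in NP. Finding a schedule is the hard part.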
• Coping with NP-Hard Problems
Approach                                                           Optimal solution?           Polynomial time?
greedy (heuristic) algorithms                                      not guaranteed              yes
dynamic programming & branch-and-bound algorithms                  yes                         experimentally efficient
genetic algorithms & ant algorithms                                not guaranteed              experimentally efficient
approximation algorithms (exclusive of approximation schemes)      a guaranteed error bound    yes
randomized algorithms                                              a high probability          yes
                                                                   or yes                      or a high probability
average polynomial time algorithm                                  yes                         in average case
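As one concrete instance of an approximation algorithm with a guaranteed error bound, the sketch below gives the well-known matching-based heuristic for Vertex Cover, whose output is at most twice the optimal size. It is an illustration chosen here, not an algorithm taken from the notes, and the function name is hypothetical.

```python
def approx_vertex_cover(edges):
    """Return a vertex cover of size at most twice the minimum.

    Scans the edges once. Whenever an edge is still uncovered, both of its
    endpoints are added. The chosen edges share no endpoints, and any cover
    must contain at least one endpoint of each, which gives the factor 2.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Assumed example: the path 1 - 2 - 3 - 4, whose minimum cover {2, 3} has size 2.
path = [(1, 2), (2, 3), (3, 4)]
cover = approx_vertex_cover(path)
print(cover)
```

The algorithm runs in time linear in the number of edges, trading optimality for speed while keeping a provable bound on the error.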