Lower Bounds
• Best-, Average-, and Worst-Case Time Complexity
Ex. Insertion sort of x1, x2, …, xn.
For i = 2, 3, …, n, insert xi into x1, x2, …, xi−1 such that these i data items are sorted.
input : 7, 5, 1, 4, 3, 2, 6
i = 2 : 5, 7
i = 3 : 1, 5, 7
i = 4 : 1, 4, 5, 7
i = 5 : 1, 3, 4, 5, 7
i = 6 : 1, 2, 3, 4, 5, 7
i = 7 : 1, 2, 3, 4, 5, 6, 7
T(n) : the number of comparisons made
best case : T(n) = O(n)
worst case : T(n) = O(n^2)
average case : T(n) = O(n^2)
Consider the insertion of xi.
P(k) : the probability of making k comparisons
⇒ P(1) = P(2) = … = P(i−2) = 1/i, P(i−1) = 2/i
⇒ the average number of comparisons for xi
= (1 + 2 + … + (i−2)) × (1/i) + (i−1) × (2/i)
= (i+1)/2 − 1/i
The average number of comparisons for x2, x3, …, xn is equal to
∑_{i=2}^{n} ((i+1)/2 − 1/i) = O(n^2).
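The average above can be checked empirically. The following is a minimal sketch (not from the lecture; the function name, trial count, and seed are my own choices): it counts the comparisons made when each xi is inserted by scanning the sorted prefix from the right, which matches the distribution P(k) above, and compares the empirical average over random permutations with ∑ ((i+1)/2 − 1/i).

```python
import random

def insertion_sort_comparisons(a):
    """Sort a copy of a by straight insertion; return the number of
    key comparisons (each sorted prefix is scanned from the right)."""
    a = list(a)
    count = 0
    for i in range(1, len(a)):
        x, j = a[i], i - 1
        while j >= 0:
            count += 1          # compare x with a[j]
            if a[j] <= x:
                break
            a[j + 1] = a[j]     # shift a[j] right and keep scanning
            j -= 1
        a[j + 1] = x
    return count

n, trials = 8, 20000
random.seed(1)
avg = sum(insertion_sort_comparisons(random.sample(range(n), n))
          for _ in range(trials)) / trials
formula = sum((i + 1) / 2 - 1 / i for i in range(2, n + 1))
print(avg, formula)   # the two values should be close
```

The best case (sorted input) gives n−1 comparisons and the worst case (reversed input) gives n(n−1)/2, matching the O(n) and O(n^2) bounds above.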
Ex. Binary search of a1, a2, …, an.
Assume n = 2^k − 1.
T(n) : the number of comparisons made
best case : T(n) = O(1)
worst case : T(n) = O(logn)
average case : T(n) = O(logn)
P(i) : the probability of making i comparisons for a successful search
⇒ P(i) = 2^(i−1)/n, for i = 1, 2, …, k
The average number of comparisons for a successful search is equal to
∑_{i=1}^{k} i × 2^(i−1) × (1/n) = (1/n) × (2^k (k−1) + 1) = O(logn).
(∑_{i=1}^{k} i × 2^(i−1) = 2^k (k−1) + 1 can be proved by induction on k)
There are k = O(logn) comparisons for each unsuccessful search.
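As a check (an illustrative sketch, not from the lecture; the function name is my own), the following code performs a successful binary search for every key in a sorted array of n = 2^k − 1 items, tallies the number of searches that take i comparisons, and verifies that 2^(i−1) keys take i comparisons and that the total is 2^k(k−1)+1.

```python
def comparisons_for(a, x):
    """Number of three-way comparisons binary search makes to find x in sorted a."""
    lo, hi, count = 0, len(a) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1                 # one three-way comparison with a[mid]
        if x == a[mid]:
            return count
        elif x < a[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return count                   # unsuccessful search

k = 4
n = 2**k - 1                       # n = 15
a = list(range(n))
tally = {}
for x in a:
    c = comparisons_for(a, x)
    tally[c] = tally.get(c, 0) + 1
print(tally)                               # {1: 1, 2: 2, 3: 4, 4: 8}
total = sum(i * cnt for i, cnt in tally.items())
print(total, 2**k * (k - 1) + 1)           # both 49
```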
Exercise 1. Analyze the time complexity of the quick sort in the best, average, and worst cases.
(refer to page 32 of the textbook)
• Lower Bound for a Problem
A problem has a lower bound of Ω(g(n)).
⇒ Any algorithm that can solve it takes Ω(g(n)) time.
For example, sorting n data items requires Ω(nlogn) time.
Unless stated otherwise, “lower bound” means
“lower bound in the worst case”.
For a problem, if the time complexity of an algorithm matches a lower bound, then the algorithm is time optimal and the lower bound is tight.
Otherwise (if the lower bound is lower than the time complexity of the algorithm), the lower bound or the algorithm can be improved.
• Lower Bound by Comparison Tree
The method of comparison tree is applicable to comparison-based algorithms which make comparisons among input data items.
Most sorting algorithms (excluding radix sort and bucket sort), as well as searching, selection, and merging algorithms, are comparison-based.
The execution of a comparison-based algorithm can be described by a comparison tree, and the tree depth is the greatest number of comparisons, i.e., the worst-case time complexity.
⇒ The minimal tree depth over all possible comparison trees is a lower bound.
Ex. Sequential search and binary search of
A(1), A(2), …, A(n) for x.
[Figure: two comparison trees. Sequential search compares x with A(1), A(2), …, A(n) in turn, giving a degenerate tree of depth n with a Failure leaf after each comparison. Binary search compares x with A((n+1)/2) at the root, then with A((n+1)/4) or A(3(n+1)/4), and so on, giving a balanced tree of depth O(logn).]
⇒ Searching has a lower bound of Ω(logn).
Since binary search takes O(logn) time, the lower bound is tight.
Ex. Sorting a1, a2, …, an.
Straight insertion sort of a1, a2, a3 :
Sorting 3, 1, 2:
3 → 3, 1 → 1, 3 → 1, 3, 2 → 1, 2, 3 (a1 :a2) (a2 :a3) (a1 :a2)
Sorting 2, 1, 3:
2 → 2, 1 → 1, 2 → 1, 2, 3 (a1 :a2) (a2 :a3)
Bubble sort of a1, a2, a3 :
Sorting 3, 1, 2:
3, 1, 2 → 1, 3, 2 → 1, 2, 3 (a1 :a2) (a2 :a3) (a1 :a2)
Sorting 2, 1, 3:
2, 1, 3 → 1, 2, 3 (a1 :a2) (a2 :a3)
one-to-one correspondence:
n! leaf nodes ←→ n! possible input sequences
When the comparison tree is balanced, the tree depth is
⌈log n!⌉ = Ω(nlogn) (refer to page 47 of the textbook),
which is minimum.
Heap sort takes O(nlogn) time.
⇒ Ω(nlogn) is tight for sorting.
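To get a feel for the ⌈log n!⌉ bound, the following sketch (illustrative, not part of the lecture) prints ⌈log2 n!⌉, the minimum possible depth of a comparison tree with n! leaves, next to n⌈log2 n⌉ for a few values of n; the information-theoretic minimum grows like nlogn.

```python
import math

# ceil(log2(n!)): the smallest possible depth of a binary tree with n!
# leaves, i.e. a worst-case lower bound on the number of comparisons.
for n in (4, 8, 16, 32):
    bound = math.ceil(math.log2(math.factorial(n)))
    print(n, bound, n * math.ceil(math.log2(n)))
```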
The worst-case time complexity of sorting was considered above. In what follows, the average-case time complexity of sorting is considered.
The average-case time complexity of a sorting algorithm can be estimated as L/n!, where L is the total length of all root-to-leaf paths in the comparison tree.
Let Lmin be the minimal L over all comparison trees.
⇒ Lmin/n! is an average-case lower bound for sorting.
Lmin occurs when the comparison tree is balanced.
For example,
The left tree has L=13, and the right tree has L=12 (=Lmin for n=9).
Suppose that T is a balanced comparison tree with n! leaf nodes.
Let N = n! + (n!−1) = 2(n!) − 1 be the number of nodes of T.
⇒ T is of height h = ⌊logN⌋.
Assume that there are x1 leaf nodes of depth h−1 and x2 leaf nodes of depth h.
⇒ x1 + x2 = n! (A)
x1 + (1/2)x2 = 2^(h−1) (x2 is even) (B)
(A), (B) ⇒ x1 = 2^h − n!, x2 = 2(n! − 2^(h−1))
⇒ Lmin = (2^h − n!)(h−1) + 2(n! − 2^(h−1))h
= (h+1)n! − 2^h
Since logN − 1 < h ≤ logN, we have
Lmin > (logN)n! − 2^(logN) = (logN − 2)n! + 1.
For example, n = 3, N = 11, h = 3, x1 = 2, and x2 = 4.
⇒ x1 + (1/2)x2 = 2^2.
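The closed form for Lmin can be checked for small n. The sketch below (illustrative, not the lecture's code) distributes the n! leaves of a balanced tree over depths h−1 and h as in (A) and (B), computes the total path length directly, and compares it with (h+1)n! − 2^h.

```python
import math

def lmin(n):
    """Total root-to-leaf path length of a balanced binary tree with n! leaves."""
    leaves = math.factorial(n)
    N = 2 * leaves - 1                  # total number of nodes of the tree
    h = N.bit_length() - 1              # floor(log2 N) = height of the tree
    x1 = 2**h - leaves                  # leaf nodes of depth h-1
    x2 = 2 * (leaves - 2**(h - 1))      # leaf nodes of depth h
    assert x1 + x2 == leaves            # condition (A)
    return x1 * (h - 1) + x2 * h

for n in (3, 4, 5):
    leaves = math.factorial(n)
    h = (2 * leaves - 1).bit_length() - 1
    print(n, lmin(n), (h + 1) * leaves - 2**h)   # the two values agree
```

For n = 3 this reproduces the example above: 2 leaves of depth 2 and 4 leaves of depth 3, so Lmin = 2·2 + 4·3 = 16 = 4·6 − 8.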
Quick sort takes O(nlogn) time in the average case.
⇒ Ω(nlogn) is the tight average-case lower bound for sorting.
Since the heap sort takes O(nlogn) time in the worst case, it also takes O(nlogn) time in the average case.
Ex. Selection from n data items.
Let L(n) denote a lower bound for selecting the greatest data item from a1, a2, …, an.
Any comparison tree has leaf nodes labeled with a1, a2, …, an, and each root-to-ai path represents a process to recognize that ai is the greatest element.
Since at least n−1 comparisons are required to find the greatest data item, each root-to-leaf path has length ≥ n−1.
⇒ L(n) = n−1
Exercise 2. Let Lk(n) denote a lower bound for selecting the k greatest data items from a1, a2, …, an.
Prove that for 2 ≤ k ≤ n,
Lk(n) ≥ n − k + ⌈log(n(n−1)…(n−k+2))⌉.
Ex. An n-player tournament.
An 8-player tennis tournament.
[Figure: an 8-player single-elimination bracket among players A–H: C beats A in round 1, D in round 2, and H in the final.]
C is the best player.
Consider each match a comparison.
⇒ finding the best player is equivalent to finding the greatest data item.
According to Exercise 2, finding the first two best players requires at least n−2+⌈logn⌉ matches.
There is an approach to finding the first two best players with exactly n−2+⌈logn⌉ matches.
Consider the 8-player tennis tournament again.
The best player can be found with 7 (= n−1) matches.
The second best player is one of 3 (= logn) candidates, i.e., D, A, and H.
⇒ The second best player can be found with ⌈logn⌉ − 1 matches.
Therefore, n−2+⌈logn⌉ is a tight lower bound for finding the first two best players.
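The tournament approach can be sketched as follows (an illustration, not the lecture's code; the function name is my own). A knockout tournament records whom the eventual winner has beaten; the second best is then the maximum of those ⌈log2 n⌉ candidates. For n a power of two this uses exactly n − 2 + log2 n comparisons.

```python
def top_two(items):
    """Find the largest and second-largest of items (n a power of two)
    with n - 2 + log2(n) comparisons, via a single-elimination tournament."""
    comparisons = 0
    # Each entry: (value, values it has beaten so far).
    round_ = [(x, []) for x in items]
    while len(round_) > 1:
        nxt = []
        for i in range(0, len(round_), 2):
            (a, beat_a), (b, beat_b) = round_[i], round_[i + 1]
            comparisons += 1
            if a >= b:
                nxt.append((a, beat_a + [b]))
            else:
                nxt.append((b, beat_b + [a]))
        round_ = nxt
    best, candidates = round_[0]        # candidates: players beaten by best
    second = candidates[0]
    for c in candidates[1:]:            # len(candidates) - 1 more comparisons
        comparisons += 1
        if c > second:
            second = c
    return best, second, comparisons

print(top_two([3, 1, 4, 1, 5, 9, 2, 6]))   # (9, 6, 9): n-2+log2(n) = 9 for n = 8
```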
• Lower Bound by a Particular Problem Instance
Ex. Merging two sorted sequences a1 ≤ a2 ≤ … ≤ an and b1 ≤ b2 ≤ … ≤ bn.
Consider a problem instance with a1 < b1 < a2 < b2 < … < an < bn.
When a1 < b1 < a2 < b2 < … < ai < bi is obtained, bi+1 must be compared with ai+1 and ai+2 before it is placed properly.
⇒ a lower bound of 2n−1 comparisons
The merging algorithm, which repeatedly compares the two currently smallest elements of the two sorted lists and outputs the smaller one, performs 2n−1 comparisons.
⇒ the lower bound is tight
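This counting can be sketched as follows (illustrative, not from the lecture): merge the interleaved instance a = (1, 3, 5, …), b = (2, 4, 6, …) and count comparisons; every element except the last must win a comparison, giving 2n − 1.

```python
def merge_count(a, b):
    """Merge two sorted lists; return (merged, number of comparisons)."""
    out, i, j, count = [], 0, 0, 0
    while i < len(a) and j < len(b):
        count += 1                       # compare a[i] with b[j]
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])                    # one list is exhausted:
    out.extend(b[j:])                    # append the rest without comparisons
    return out, count

n = 6
a = [2 * i + 1 for i in range(n)]        # 1, 3, 5, ...  (a1 < b1 < a2 < ...)
b = [2 * i + 2 for i in range(n)]        # 2, 4, 6, ...
merged, count = merge_count(a, b)
print(count)                             # 2n - 1 = 11
```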
Ex. Fault diameter of Hn, the n-dimensional hypercube.
[Figure: H1, H2, and H3, with nodes labeled by the binary strings 0, 1; 00, 01, 10, 11; and 000, 001, …, 111, respectively.]
Dn−1 : the (n−1)-fault diameter of Hn, which is the maximal diameter of Hn with n−1 edges removed.
Consider the following problem instance.
[Figure: of the n edges incident to 0^n, the n−1 edges other than the one leading to 10^(n−1) are removed.]
The distance between 0^n and 01^(n−1) is n+1.
⇒ Dn−1 ≥ n+1
Between any two distinct nodes of Hn, there are n node-disjoint paths, each of length at most n+1.
⇒ Dn−1 ≤ n+1
Therefore, Dn−1 = n+1.
There is an advanced technique, named oracles, for deriving lower bounds.
In fact, an oracle (e.g., a fortune-telling slip) can be considered a scenario for a particular problem instance.
You are encouraged to read Sec. 10.2.4 of Ref. (2) (or L. Hyafil, "Bounds for Selection," SIAM J. Comput., vol. 5, no. 1, 1976, pp. 109-114), where an example of selection is illustrated.
• Lower Bound by State Transition
Ex. Finding the maximum and minimum of a1, a2, …, an.
Let (k, k(+), k(−), k(±)) be a state, where
k : the number of ai's that have not been compared yet;
k(+) : the number of ai's that have won but never lost;
k(−) : the number of ai's that have lost but never won;
k(±) : the number of ai's that have both won and lost.
The problem is equivalent to the state transition from (n, 0, 0, 0) to (0, 1, 1, n−2).
Each comparison induces a state transition from (k, k(+), k(−), k(±)) to one of the following states:
(1) (k−2, k(+)+1, k(−)+1, k(±));
(2) (k−1, k(+), k(−)+1, k(±)) or (k−1, k(+)+1, k(−), k(±)) or (k−1, k(+), k(−), k(±)+1);
(3) (k, k(+)−1, k(−), k(±)+1);
(4) (k, k(+), k(−)−1, k(±)+1), where
(1) occurs when k ≥ 2 and two from k are compared;
(2) occurs when k ≥ 1 and one from k is compared with one from k(+) or k(−);
(3) occurs when k(+) ≥ 2 and two from k(+) are compared;
(4) occurs when k(−) ≥ 2 and two from k(−) are compared.
Since the elements of k(±) come from k(+) or k(−), not from k, the quickest way from (n, 0, 0, 0) to (0, 1, 1, n−2) is as follows.
Case 1. n = 2p.
(n, 0, 0, 0) →^p (0, p, p, 0) →^(2p−2) (0, 1, 1, 2p−2).
("→^p" means p state transitions.)
There are 3p−2 = (3n/2)−2 state transitions.
Case 2. n = 2p+1.
(n, 0, 0, 0) →^p (1, p, p, 0) →^1 (0, p, p, 1) →^(2p−2) (0, 1, 1, 2p−1).
There are 3p−1 = (3n/2)−(5/2) state transitions.
There is an algorithm that can find the maximum and minimum of n data items with ⌈3n/2⌉−2 comparisons. (Refer to Sec. 3.3 of Ref. (2).)
When n = 2p, (3n/2)−2 = ⌈3n/2⌉−2.
When n = 2p+1, (3n/2)−(5/2) < ⌈3n/2⌉−2.
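The ⌈3n/2⌉ − 2 algorithm can be sketched as follows (an illustration, not the lecture's code): process the items in pairs, compare the two items of a pair first, then compare the pair's winner with the current maximum and its loser with the current minimum, so each pair after initialization costs 3 comparisons.

```python
def max_min(a):
    """Return (max, min, comparisons); uses ceil(3n/2) - 2 comparisons."""
    n = len(a)
    comparisons = 0
    if n % 2 == 0:
        comparisons += 1                 # initialize from the first pair
        if a[0] >= a[1]:
            hi, lo = a[0], a[1]
        else:
            hi, lo = a[1], a[0]
        start = 2
    else:
        hi = lo = a[0]                   # odd n: initialize from one item
        start = 1
    for i in range(start, n, 2):         # 3 comparisons per remaining pair
        x, y = a[i], a[i + 1]
        comparisons += 1
        if x < y:
            x, y = y, x                  # x = pair winner, y = pair loser
        comparisons += 1
        if x > hi:
            hi = x
        comparisons += 1
        if y < lo:
            lo = y
    return hi, lo, comparisons

print(max_min([5, 2, 9, 1, 7, 3]))   # (9, 1, 7): 3*6/2 - 2 = 7 for n = 6
```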
• Lower Bound by Reduction
A problem P1 reduces to another problem P2, denoted by P1 ∝ P2, if any instance of P1 can be transformed into an instance of P2 such that the solution for P1 can be obtained from the solution for P2.
T∝ : the reduction time.
T : the time required to obtain the solution for P1 from the solution for P2.
Ex. the problem of selection ∝ the problem of sorting.
T∝ : O(1).
T : O(1).
Ex. Suppose that S1 and S2 are two sets of n elements and m elements, respectively.
P1 : the problem of determining if S1 ∩ S2 = ∅.
P2 : the problem of sorting.
T∝ : O(n+m).
T : O(n+m).
S1 ={a1, a2, …, an}, S2 ={b1, b2, …, bm}: an arbitrary instance of P1.
(a1, 1), (a2, 1), …, (an, 1), (b1, 2), (b2, 2), …, (bm, 2): an instance of P2 created from P1.
⇒ S1 ∩ S2 ≠ ∅ iff the sorted sequence contains two successive elements (ai, 1) and (bj, 2) with ai = bj.
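The reduction can be sketched as follows (illustrative; the function name is my own). Each element is tagged with its set of origin, the tagged pairs are sorted, and one linear scan looks for an (ai, 1) immediately followed by a (bj, 2) with ai = bj.

```python
def disjoint_via_sorting(s1, s2):
    """Return True iff s1 and s2 have no common element,
    by reducing set disjointness to sorting."""
    # Reduction, O(n+m): tag each element with its set of origin.
    tagged = [(a, 1) for a in s1] + [(b, 2) for b in s2]
    # Solve the sorting instance (ties broken by tag, so (x,1) precedes (x,2)).
    tagged.sort()
    # Recover the answer, O(n+m): adjacent equal values with tags 1, 2
    # witness a common element.
    for (x, tx), (y, ty) in zip(tagged, tagged[1:]):
        if x == y and tx == 1 and ty == 2:
            return False
    return True

print(disjoint_via_sorting([3, 1, 4], [5, 9, 2]))   # True
print(disjoint_via_sorting([3, 1, 4], [5, 4, 2]))   # False
```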
L1 : a lower bound of P1. L2 : a lower bound of P2.
⇒ L1 ≤ T∝ + L2 + T
When T∝ and T are of lower order than L1, this gives L2 = Ω(L1), i.e., L1 is asymptotically also a lower bound of P2.
Ex. P1 : the sorting problem.
P2 : the convex hull problem.
T∝ : O(n).
T: O(n).
x1, x2, …, xn : an arbitrary instance of P1.
(x1, x1^2), (x2, x2^2), …, (xn, xn^2): an instance of P2 created from P1.
[Figure: the points (xi, xi^2) lie on the parabola y = x^2, so all of them are hull vertices and the convex hull visits them in increasing order of x.]
The sorting problem requires Ω(nlogn) time.
⇒ The convex hull problem requires Ω(nlogn) time.
There are O(nlogn)-time algorithms for the convex hull problem.
⇒ Ω(nlogn) is tight for the convex hull problem.
Ex. P1 : the sorting problem.
P2 : the Euclidean minimum spanning tree (E-MST) problem.
T∝ : O(n).
T: O(n).
x1, x2, …, xn : an arbitrary instance of P1.
(x1, 0), (x2, 0), …, (xn, 0): an instance of P2 created from P1.
[Figure: the points (x1, 0), (x2, 0), …, (xn, 0) lie on a horizontal line; the E-MST is the path that visits them in sorted order of x.]
The E-MST problem requires Ω(nlogn) time.
The E-MST problem can be solved in O(nlogn) time.
⇒ Ω(nlogn) is tight for the E-MST problem.
Exercise 3. For the example of page 25, prove that P1 has a lower bound of Ω(nlogn) by showing P2 ∝ P1. (Refer to Sec. 10.3.2 on page 475 of Ref. (2).)
Given n data items at intervals of one time step, the on-line median finding problem is to compute the median of the first i data items at the end of the ith time step, where 1 ≤ i ≤ n.
For example, if the input sequence is (7, 15, 3, 17, 8, 11, 5),
then the output sequence is
(7, 7 or 15, 7, 7 or 15, 8, 8 or 11, 8).
Exercise 4. Prove that the on-line median finding problem has a lower bound of Ω(nlogn) by showing a reduction from the sorting problem to it. (Refer to Sec. 10.3.3 of Ref. (2).)
Project I : Lower Bounds of Some Problems
You are required to survey lower bounds for some problems. You must provide the proofs of lower bounds in your report.