Lower Bounds
• Best-, Average-, and Worst-Case Time Complexity
Ex. Insertion sort of x1, x2, …, xn.
For i = 2, 3, …, n, insert xi into x1, x2, …, xi−1 such that these i data items are sorted.
input : 7, 5, 1, 4, 3, 2, 6
i = 2 : 5, 7
i = 3 : 1, 5, 7
i = 4 : 1, 4, 5, 7
i = 5 : 1, 3, 4, 5, 7
i = 6 : 1, 2, 3, 4, 5, 7
i = 7 : 1, 2, 3, 4, 5, 6, 7
T(n) : the number of comparisons made
best case : T(n) = O(n)
worst case : T(n) = O(n^2)
average case : T(n) = O(n^2)
Consider the insertion of xi.
P(k) : the probability of making k comparisons
⇒ P(1) = P(2) = … = P(i−2) = 1/i, P(i−1) = 2/i
⇒ the average number of comparisons for xi
= (1 + 2 + … + (i−2)) × (1/i) + (i−1) × (2/i)
= (i+1)/2 − 1/i
The average number of comparisons for x2, x3, …, xn is equal to
∑_{i=2}^{n} ((i+1)/2 − 1/i) = O(n^2).
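The average above can be checked empirically. The following is a minimal sketch (not from the lecture; the function name, trial count, and seed are my own choices): it counts the comparisons made when each xi is inserted by scanning the sorted prefix from the right, which matches the distribution P(k) above, and compares the empirical average over random permutations with ∑ ((i+1)/2 − 1/i).

```python
import random

def insertion_sort_comparisons(a):
    """Sort a copy of a by straight insertion; return the number of
    key comparisons (each sorted prefix is scanned from the right)."""
    a = list(a)
    count = 0
    for i in range(1, len(a)):
        x, j = a[i], i - 1
        while j >= 0:
            count += 1          # compare x with a[j]
            if a[j] <= x:
                break
            a[j + 1] = a[j]     # shift a[j] right and keep scanning
            j -= 1
        a[j + 1] = x
    return count

n, trials = 8, 20000
random.seed(1)
avg = sum(insertion_sort_comparisons(random.sample(range(n), n))
          for _ in range(trials)) / trials
formula = sum((i + 1) / 2 - 1 / i for i in range(2, n + 1))
print(avg, formula)   # the two values should be close
```

The best case (sorted input) gives n−1 comparisons and the worst case (reversed input) gives n(n−1)/2, matching the O(n) and O(n^2) bounds above.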
Ex. Binary search of a1, a2, …, an.
Assume n = 2^k − 1.
T(n) : the number of comparisons made
best case : T(n) = O(1)
worst case : T(n) = O(logn)
average case : T(n) = O(logn)
P(i) : the probability of making i comparisons for a successful search
⇒ P(i) = 2^(i−1)/n, for i = 1, 2, …, k
The average number of comparisons for a successful search is equal to
∑_{i=1}^{k} i × 2^(i−1) × (1/n) = (1/n) × (2^k (k−1) + 1) = O(logn).
(∑_{i=1}^{k} i × 2^(i−1) = 2^k (k−1) + 1 can be proved by induction on k)
There are k = O(logn) comparisons for each unsuccessful search.
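As a check (an illustrative sketch, not from the lecture; the function name is my own), the following code performs a successful binary search for every key in a sorted array of n = 2^k − 1 items, tallies the number of searches that take i comparisons, and verifies that 2^(i−1) keys take i comparisons and that the total is 2^k(k−1)+1.

```python
def comparisons_for(a, x):
    """Number of three-way comparisons binary search makes to find x in sorted a."""
    lo, hi, count = 0, len(a) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1                 # one three-way comparison with a[mid]
        if x == a[mid]:
            return count
        elif x < a[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return count                   # unsuccessful search

k = 4
n = 2**k - 1                       # n = 15
a = list(range(n))
tally = {}
for x in a:
    c = comparisons_for(a, x)
    tally[c] = tally.get(c, 0) + 1
print(tally)                               # {1: 1, 2: 2, 3: 4, 4: 8}
total = sum(i * cnt for i, cnt in tally.items())
print(total, 2**k * (k - 1) + 1)           # both 49
```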
Exercise 1. Analyze the time complexity of the quick sort in the best, average, and worst cases.
(refer to page 32 of the textbook)
• Lower Bound for a Problem
A problem has a lower bound of Ω(g(n)).
⇒ Any algorithm that can solve it takes Ω(g(n)) time.
For example, sorting n data items requires Ω(nlogn) time.
Unless stated otherwise, “lower bound” means
“lower bound in the worst case”.
For a problem, if the time complexity of an algorithm matches a lower bound, then the algorithm is time optimal and the lower bound is tight.
Otherwise (if the lower bound is lower than the time complexity of the algorithm), the lower bound or the algorithm can be improved.
• Lower Bound by Comparison Tree
The method of comparison tree is applicable to comparison-based algorithms which make comparisons among input data items.
Most sorting algorithms (excluding radix sort and bucket sort), as well as searching, selection, and merging algorithms, are comparison-based.
The execution of a comparison-based algorithm can be described by a comparison tree, and the tree depth is the greatest number of comparisons, i.e., the worst-case time complexity.
⇒ The minimal tree depth over all possible comparison trees is a lower bound.
Ex. Sequential search and binary search of
A(1), A(2), …, A(n) for x.
[Figure: two comparison trees. Sequential search compares x with A(1), A(2), …, A(n) in turn, giving a degenerate tree of depth n with a Failure leaf after each comparison. Binary search compares x with A((n+1)/2) at the root, then with A((n+1)/4) or A(3(n+1)/4), and so on, giving a balanced tree of depth O(logn).]
⇒ Searching has a lower bound of Ω(logn).
Since binary search takes O(logn) time, the lower bound is tight.
Ex. Sorting a1, a2, …, an.
Straight insertion sort of a1, a2, a3 :
Sorting 3, 1, 2:
3 → 3, 1 → 1, 3 → 1, 3, 2 → 1, 2, 3 (a1 :a2) (a2 :a3) (a1 :a2)
Sorting 2, 1, 3:
2 → 2, 1 → 1, 2 → 1, 2, 3 (a1 :a2) (a2 :a3)
Bubble sort of a1, a2, a3 :
Sorting 3, 1, 2:
3, 1, 2 → 1, 3, 2 → 1, 2, 3 (a1 :a2) (a2 :a3) (a1 :a2)
Sorting 2, 1, 3:
2, 1, 3 → 1, 2, 3 (a1 :a2) (a2 :a3)
one-to-one correspondence:
n! leaf nodes ←→ n! possible input sequences
When the comparison tree is balanced, the tree depth is
⌈log n!⌉ = Ω(nlogn) (refer to page 47 of the textbook),
which is minimum.
Heap sort takes O(nlogn) time.
⇒ Ω(nlogn) is tight for sorting.
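To get a feel for the ⌈log n!⌉ bound, the following sketch (illustrative, not part of the lecture) prints ⌈log2 n!⌉, the minimum possible depth of a comparison tree with n! leaves, next to n⌈log2 n⌉ for a few values of n; the information-theoretic minimum grows like nlogn.

```python
import math

# ceil(log2(n!)): the smallest possible depth of a binary tree with n!
# leaves, i.e. a worst-case lower bound on the number of comparisons.
for n in (4, 8, 16, 32):
    bound = math.ceil(math.log2(math.factorial(n)))
    print(n, bound, n * math.ceil(math.log2(n)))
```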
The worst-case time complexity of sorting was considered above. In what follows, the average-case time complexity of sorting is considered.
The average-case time complexity of a sorting algorithm can be estimated as L/n!, where L is the total length of all root-to-leaf paths in the comparison tree.
Let Lmin be the minimal L over all comparison trees.
⇒ Lmin/n! is an average-case lower bound for sorting.
Lmin occurs when the comparison tree is balanced.
For example,
The left tree has L=13, and the right tree has L=12 (=Lmin for n=9).
Suppose that T is a balanced comparison tree with n! leaf nodes.
Let N = n! + (n!−1) = 2(n!) − 1 be the number of nodes of T.
⇒ T is of height h = ⌊logN⌋.
Assume that there are x1 leaf nodes of depth h−1 and x2 leaf nodes of depth h.
⇒ x1 + x2 = n! (A)
x1 + (1/2)x2 = 2^(h−1) (x2 is even) (B)
(A), (B) ⇒ x1 = 2^h − n!, x2 = 2(n! − 2^(h−1))
⇒ Lmin = (2^h − n!)(h−1) + 2(n! − 2^(h−1))h
= (h+1)n! − 2^h
Since logN − 1 < h ≤ logN, we have
Lmin > (logN)n! − 2^(logN) = (logN − 2)n! + 1.
For example, n = 3, N = 11, h = 3, x1 = 2, and x2 = 4.
⇒ x1 + (1/2)x2 = 2^2.
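The closed form for Lmin can be checked for small n. The sketch below (illustrative, not the lecture's code) distributes the n! leaves of a balanced tree over depths h−1 and h as in (A) and (B), computes the total path length directly, and compares it with (h+1)n! − 2^h.

```python
import math

def lmin(n):
    """Total root-to-leaf path length of a balanced binary tree with n! leaves."""
    leaves = math.factorial(n)
    N = 2 * leaves - 1                  # total number of nodes of the tree
    h = N.bit_length() - 1              # floor(log2 N) = height of the tree
    x1 = 2**h - leaves                  # leaf nodes of depth h-1
    x2 = 2 * (leaves - 2**(h - 1))      # leaf nodes of depth h
    assert x1 + x2 == leaves            # condition (A)
    return x1 * (h - 1) + x2 * h

for n in (3, 4, 5):
    leaves = math.factorial(n)
    h = (2 * leaves - 1).bit_length() - 1
    print(n, lmin(n), (h + 1) * leaves - 2**h)   # the two values agree
```

For n = 3 this reproduces the example above: 2 leaves of depth 2 and 4 leaves of depth 3, so Lmin = 2·2 + 4·3 = 16 = 4·6 − 8.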
Quick sort takes O(nlogn) time in the average case.
⇒ Ω(nlogn) is the tight average-case lower bound for sorting.
Since the heap sort takes O(nlogn) time in the worst case, it also takes O(nlogn) time in the average case.
Ex. Selection from n data items.
Let L(n) denote a lower bound for selecting the greatest data item from a1, a2, …, an.
Any comparison tree has leaf nodes labeled with a1, a2, …, an, and each root-to-ai path represents a process to recognize that ai is the greatest element.
Since at least n−1 comparisons are required to find the greatest data item, each root-to-leaf path has length ≥ n−1.
⇒ L(n) = n−1
Exercise 2. Let Lk(n) denote a lower bound for selecting the k greatest data items from a1, a2, …, an.
Prove that for 2 ≤ k ≤ n,
Lk(n) ≥ n − k + ⌈log(n(n−1)…(n−k+2))⌉.
Ex. An n-player tournament.
An 8-player tennis tournament.
[Figure: an 8-player single-elimination bracket among players A–H: C beats A in round 1, D in round 2, and H in the final.]
C is the best player.
Consider each match a comparison.
⇒ finding the best player is equivalent to finding the greatest data item.
According to Exercise 2, finding the first two best players requires at least n−2+⌈logn⌉ matches.
There is an approach to finding the first two best players with exactly n−2+⌈logn⌉ matches.
Consider the 8-player tennis tournament again.
The best player can be found with 7 (= n−1) matches.
The second best player is one of 3 (= logn) candidates, i.e., D, A, and H.
⇒ The second best player can be found with ⌈logn⌉ − 1 matches.
Therefore, n−2+⌈logn⌉ is a tight lower bound for finding the first two best players.
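The tournament approach can be sketched as follows (an illustration, not the lecture's code; the function name is my own). A knockout tournament records whom the eventual winner has beaten; the second best is then the maximum of those ⌈log2 n⌉ candidates. For n a power of two this uses exactly n − 2 + log2 n comparisons.

```python
def top_two(items):
    """Find the largest and second-largest of items (n a power of two)
    with n - 2 + log2(n) comparisons, via a single-elimination tournament."""
    comparisons = 0
    # Each entry: (value, values it has beaten so far).
    round_ = [(x, []) for x in items]
    while len(round_) > 1:
        nxt = []
        for i in range(0, len(round_), 2):
            (a, beat_a), (b, beat_b) = round_[i], round_[i + 1]
            comparisons += 1
            if a >= b:
                nxt.append((a, beat_a + [b]))
            else:
                nxt.append((b, beat_b + [a]))
        round_ = nxt
    best, candidates = round_[0]        # candidates: players beaten by best
    second = candidates[0]
    for c in candidates[1:]:            # len(candidates) - 1 more comparisons
        comparisons += 1
        if c > second:
            second = c
    return best, second, comparisons

print(top_two([3, 1, 4, 1, 5, 9, 2, 6]))   # (9, 6, 9): n-2+log2(n) = 9 for n = 8
```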
• Lower Bound by a Particular Problem Instance
Ex. Merging two sorted sequences a1 ≤ a2 ≤ … ≤ an and b1 ≤ b2 ≤ … ≤ bn.
Consider a problem instance with a1 < b1 < a2 < b2 < … < an < bn.
When a1 < b1 < a2 < b2 < … < ai < bi is obtained, bi+1 must be compared with ai+1 and ai+2 before it is placed properly.
⇒ a lower bound of 2n−1 comparisons
The merging algorithm, which repeatedly compares the two currently smallest elements of the two sorted lists and outputs the smaller one, performs 2n−1 comparisons.
⇒ the lower bound is tight
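This counting can be sketched as follows (illustrative, not from the lecture): merge the interleaved instance a = (1, 3, 5, …), b = (2, 4, 6, …) and count comparisons; every element except the last must win a comparison, giving 2n − 1.

```python
def merge_count(a, b):
    """Merge two sorted lists; return (merged, number of comparisons)."""
    out, i, j, count = [], 0, 0, 0
    while i < len(a) and j < len(b):
        count += 1                       # compare a[i] with b[j]
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])                    # one list is exhausted:
    out.extend(b[j:])                    # append the rest without comparisons
    return out, count

n = 6
a = [2 * i + 1 for i in range(n)]        # 1, 3, 5, ...  (a1 < b1 < a2 < ...)
b = [2 * i + 2 for i in range(n)]        # 2, 4, 6, ...
merged, count = merge_count(a, b)
print(count)                             # 2n - 1 = 11
```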
Ex. Fault diameter of Hn, the n-dimensional hypercube.
[Figure: H1, H2, and H3, with nodes labeled by the binary strings 0, 1; 00, 01, 10, 11; and 000, 001, …, 111, respectively.]
Dn−1 : the (n−1)-fault diameter of Hn, which is the maximal diameter of Hn with n−1 edges removed.
Consider the following problem instance.
[Figure: of the n edges incident to 0^n, the n−1 edges other than the one leading to 10^(n−1) are removed.]
The distance between 0^n and 01^(n−1) is n+1.
⇒ Dn−1 ≥ n+1
Between any two distinct nodes of Hn, there are n node-disjoint paths, each of length at most n+1.
⇒ Dn−1 ≤ n+1
Therefore, Dn−1 = n+1.
There is an advanced technique, named oracles, for deriving lower bounds.
In fact, an oracle (e.g., a fortune-telling slip) can be considered a scenario for a particular problem instance.
You are encouraged to read Sec. 10.2.4 of Ref. (2) (or L. Hyafil, "Bounds for Selection," SIAM J. Comput., vol. 5, no. 1, 1976, pp. 109-114), where an example of selection is illustrated.
• Lower Bound by State Transition
Ex. Finding the maximum and minimum of a1, a2, …, an.
Let (k, k(+), k(−), k(±)) be a state, where
k : the number of ai's that have not been compared yet;
k(+) : the number of ai's that have won but never lost;
k(−) : the number of ai's that have lost but never won;
k(±) : the number of ai's that have both won and lost.
The problem is equivalent to the state transition from (n, 0, 0, 0) to (0, 1, 1, n−2).
Each comparison induces a state transition from (k, k(+), k(−), k(±)) to one of the following states:
(1) (k−2, k(+)+1, k(−)+1, k(±));
(2) (k−1, k(+), k(−)+1, k(±)) or (k−1, k(+)+1, k(−), k(±)) or (k−1, k(+), k(−), k(±)+1);
(3) (k, k(+)−1, k(−), k(±)+1);
(4) (k, k(+), k(−)−1, k(±)+1), where
(1) occurs when k ≥ 2 and two from k are compared;
(2) occurs when k ≥ 1 and one from k is compared with one from k(+) or k(−);
(3) occurs when k(+) ≥ 2 and two from k(+) are compared;
(4) occurs when k(−) ≥ 2 and two from k(−) are compared.
Since the elements of k(±) come from k(+) or k(−), not from k, the quickest way from (n, 0, 0, 0) to (0, 1, 1, n−2) is as follows.
Case 1. n = 2p.
(n, 0, 0, 0) →^p (0, p, p, 0) →^(2p−2) (0, 1, 1, 2p−2).
("→^p" means p state transitions.)
There are 3p−2 = (3n/2)−2 state transitions.
Case 2. n = 2p+1.
(n, 0, 0, 0) →^p (1, p, p, 0) →^1 (0, p, p, 1) →^(2p−2) (0, 1, 1, 2p−1).
There are 3p−1 = (3n/2)−(5/2) state transitions.
There is an algorithm that can find the maximum and minimum of n data items with ⌈3n/2⌉−2 comparisons. (Refer to Sec. 3.3 of Ref. (2).)
When n = 2p, (3n/2)−2 = ⌈3n/2⌉−2.
When n = 2p+1, (3n/2)−(5/2) < ⌈3n/2⌉−2.
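The ⌈3n/2⌉ − 2 algorithm can be sketched as follows (an illustration, not the lecture's code): process the items in pairs, compare the two items of a pair first, then compare the pair's winner with the current maximum and its loser with the current minimum, so each pair after initialization costs 3 comparisons.

```python
def max_min(a):
    """Return (max, min, comparisons); uses ceil(3n/2) - 2 comparisons."""
    n = len(a)
    comparisons = 0
    if n % 2 == 0:
        comparisons += 1                 # initialize from the first pair
        if a[0] >= a[1]:
            hi, lo = a[0], a[1]
        else:
            hi, lo = a[1], a[0]
        start = 2
    else:
        hi = lo = a[0]                   # odd n: initialize from one item
        start = 1
    for i in range(start, n, 2):         # 3 comparisons per remaining pair
        x, y = a[i], a[i + 1]
        comparisons += 1
        if x < y:
            x, y = y, x                  # x = pair winner, y = pair loser
        comparisons += 1
        if x > hi:
            hi = x
        comparisons += 1
        if y < lo:
            lo = y
    return hi, lo, comparisons

print(max_min([5, 2, 9, 1, 7, 3]))   # (9, 1, 7): 3*6/2 - 2 = 7 for n = 6
```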
• Lower Bound by Reduction
A problem P1 reduces to another problem P2, denoted by P1 ∝ P2, if any instance of P1 can be transformed into an instance of P2 such that the solution for P1 can be obtained from the solution for P2.
T∝ : the reduction time.
T : the time required to obtain the solution for P1 from the solution for P2.
Ex. the problem of selection ∝ the problem of sorting.
T∝ : O(1).
T : O(1).
Ex. Suppose that S1 and S2 are two sets of n elements and m elements, respectively.
P1 : the problem of determining if S1 ∩ S2 = ∅.
P2 : the problem of sorting.
T∝ : O(n+m).
T : O(n+m).
S1 ={a1, a2, …, an}, S2 ={b1, b2, …, bm}: an arbitrary instance of P1.
(a1, 1), (a2, 1), …, (an, 1), (b1, 2), (b2, 2), …, (bm, 2): an instance of P2 created from P1.
⇒ S1 ∩ S2 ≠ ∅ iff the sorted sequence contains two successive elements (ai, 1) and (bj, 2) with ai = bj.
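The reduction can be sketched as follows (illustrative; the function name is my own). Each element is tagged with its set of origin, the tagged pairs are sorted, and one linear scan looks for an (ai, 1) immediately followed by a (bj, 2) with ai = bj.

```python
def disjoint_via_sorting(s1, s2):
    """Return True iff s1 and s2 have no common element,
    by reducing set disjointness to sorting."""
    # Reduction, O(n+m): tag each element with its set of origin.
    tagged = [(a, 1) for a in s1] + [(b, 2) for b in s2]
    # Solve the sorting instance (ties broken by tag, so (x,1) precedes (x,2)).
    tagged.sort()
    # Recover the answer, O(n+m): adjacent equal values with tags 1, 2
    # witness a common element.
    for (x, tx), (y, ty) in zip(tagged, tagged[1:]):
        if x == y and tx == 1 and ty == 2:
            return False
    return True

print(disjoint_via_sorting([3, 1, 4], [5, 9, 2]))   # True
print(disjoint_via_sorting([3, 1, 4], [5, 4, 2]))   # False
```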
L1 : a lower bound of P1. L2 : a lower bound of P2.
⇒ L1 ≤ T∝ + L2 + T
When T∝ and T are of lower order than L1, this gives L2 = Ω(L1), i.e., L1 is asymptotically also a lower bound of P2.
Ex. P1 : the sorting problem.
P2 : the convex hull problem.
T∝ : O(n).
T: O(n).
x1, x2, …, xn : an arbitrary instance of P1.
(x1, x1^2), (x2, x2^2), …, (xn, xn^2): an instance of P2 created from P1.
[Figure: the points (xi, xi^2) lie on the parabola y = x^2, so all of them are hull vertices and the convex hull visits them in increasing order of x.]
The sorting problem requires Ω(nlogn) time.
⇒ The convex hull problem requires Ω(nlogn) time.
There are O(nlogn)-time algorithms for the convex hull problem.
⇒ Ω(nlogn) is tight for the convex hull problem.
Ex. P1 : the sorting problem.
P2 : the Euclidean minimum spanning tree (E-MST) problem.
T∝ : O(n).
T: O(n).
x1, x2, …, xn : an arbitrary instance of P1.
(x1, 0), (x2, 0), …, (xn, 0): an instance of P2 created from P1.
[Figure: the points (x1, 0), (x2, 0), …, (xn, 0) lie on a horizontal line; the E-MST is the path that visits them in sorted order of x.]
The E-MST problem requires Ω(nlogn) time.
The E-MST problem can be solved in O(nlogn) time.
⇒ Ω(nlogn) is tight for the E-MST problem.
Exercise 3. For the example of page 25, prove that P1 has a lower bound of Ω(nlogn) by showing P2 ∝ P1. (Refer to Sec. 10.3.2 on page 475 of Ref. (2).)
Given n data items at intervals of one time step, the on-line median finding problem is to compute the median of the first i data items at the end of the ith time step, where 1 ≤ i ≤ n.
For example, if the input sequence is (7, 15, 3, 17, 8, 11, 5),
then the output sequence is
(7, 7 or 15, 7, 7 or 15, 8, 8 or 11, 8).
Exercise 4. Prove that the on-line median finding problem has a lower bound of Ω(nlogn) by showing a reduction from the sorting problem to it. (Refer to Sec. 10.3.3 of Ref. (2).)
Project I : Lower Bounds of Some Problems
You are required to survey lower bounds for some problems. You must provide the proofs of lower bounds in your report.