
2.3 Designing algorithms

2.3.2 Analyzing divide-and-conquer algorithms

When an algorithm contains a recursive call to itself, we can often describe its running time by a recurrence equation or recurrence, which describes the overall running time on a problem of size n in terms of the running time on smaller inputs.

We can then use mathematical tools to solve the recurrence and provide bounds on the performance of the algorithm.

⁷ We shall see in Chapter 3 how to formally interpret equations containing Θ-notation.

⁸ The expression ⌈x⌉ denotes the least integer greater than or equal to x, and ⌊x⌋ denotes the greatest integer less than or equal to x. These notations are defined in Chapter 3. The easiest way to verify that setting q to ⌊(p + r)/2⌋ yields subarrays A[p..q] and A[q+1..r] of sizes ⌈n/2⌉ and ⌊n/2⌋, respectively, is to examine the four cases that arise depending on whether each of p and r is odd or even.

[Figure 2.4 here. From bottom to top: initial sequence 5 2 4 7 1 3 2 6; after merging pairs 2 5 | 4 7 | 1 3 | 2 6; after merging fours 2 4 5 7 | 1 2 3 6; sorted sequence 1 2 2 3 4 5 6 7.]

Figure 2.4 The operation of merge sort on the array A = ⟨5, 2, 4, 7, 1, 3, 2, 6⟩. The lengths of the sorted sequences being merged increase as the algorithm progresses from bottom to top.

A recurrence for the running time of a divide-and-conquer algorithm falls out from the three steps of the basic paradigm. As before, we let T(n) be the running time on a problem of size n. If the problem size is small enough, say n ≤ c for some constant c, the straightforward solution takes constant time, which we write as Θ(1). Suppose that our division of the problem yields a subproblems, each of which is 1/b the size of the original. (For merge sort, both a and b are 2, but we shall see many divide-and-conquer algorithms in which a ≠ b.) It takes time T(n/b) to solve one subproblem of size n/b, and so it takes time aT(n/b) to solve a of them. If we take D(n) time to divide the problem into subproblems and C(n) time to combine the solutions to the subproblems into the solution to the original problem, we get the recurrence

    T(n) = Θ(1)                       if n ≤ c ,
    T(n) = aT(n/b) + D(n) + C(n)      otherwise .

In Chapter 4, we shall see how to solve common recurrences of this form.
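To see the shape of this recurrence concretely, here is a minimal Python sketch that evaluates T(n) numerically, assuming for illustration a = b = 2 with constant divide cost and linear combine cost (the merge sort values); none of these parameter choices comes from the text.

    # Evaluate T(n) = a*T(n/b) + D(n) + C(n), with T(n) constant for n <= c.
    # The defaults (a = b = 2, D constant, C linear) are illustrative
    # assumptions that mirror merge sort.
    def T(n, a=2, b=2, c=1, D=lambda n: 1, C=lambda n: n):
        if n <= c:
            return 1                       # Theta(1) base-case cost
        return a * T(n // b, a, b, c, D, C) + D(n) + C(n)

    print(T(1024))    # grows like n lg n for these parameter choices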

Analysis of merge sort

Although the pseudocode for MERGE-SORT works correctly when the number of elements is not even, our recurrence-based analysis is simplified if we assume that the original problem size is a power of 2. Each divide step then yields two subsequences of size exactly n/2. In Chapter 4, we shall see that this assumption does not affect the order of growth of the solution to the recurrence.
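For readers who want to execute the algorithm while following the analysis, here is one possible Python rendering of MERGE-SORT together with the sentinel-based MERGE of Section 2.3.1 (a sketch: 0-based indexing, with float('inf') standing in for the ∞ sentinel cards).

    def merge(A, p, q, r):
        # Copy the sorted runs A[p..q] and A[q+1..r], appending an infinity
        # sentinel to each so that neither run is ever exhausted.
        L = A[p:q + 1] + [float('inf')]
        R = A[q + 1:r + 1] + [float('inf')]
        i = j = 0
        for k in range(p, r + 1):          # place the smaller head into A[k]
            if L[i] <= R[j]:
                A[k] = L[i]; i += 1
            else:
                A[k] = R[j]; j += 1

    def merge_sort(A, p, r):
        if p < r:                          # at least two elements
            q = (p + r) // 2               # divide: constant time
            merge_sort(A, p, q)            # conquer: sort the left half
            merge_sort(A, q + 1, r)        # conquer: sort the right half
            merge(A, p, q, r)              # combine: Theta(n) merge

    A = [5, 2, 4, 7, 1, 3, 2, 6]           # the array of Figure 2.4
    merge_sort(A, 0, len(A) - 1)
    print(A)                               # [1, 2, 2, 3, 4, 5, 6, 7]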

We reason as follows to set up the recurrence for T(n), the worst-case running time of merge sort on n numbers. Merge sort on just one element takes constant time. When we have n > 1 elements, we break down the running time as follows.

Divide: The divide step just computes the middle of the subarray, which takes constant time. Thus, D(n) = Θ(1).

Conquer: We recursively solve two subproblems, each of size n/2, which contributes 2T(n/2) to the running time.

Combine: We have already noted that the MERGE procedure on an n-element subarray takes time Θ(n), and so C(n) = Θ(n).

When we add the functions D(n) and C(n) for the merge sort analysis, we are adding a function that is Θ(n) and a function that is Θ(1). This sum is a linear function of n, that is, Θ(n). Adding it to the 2T(n/2) term from the “conquer” step gives the recurrence for the worst-case running time T(n) of merge sort:

    T(n) = Θ(1)              if n = 1 ,
    T(n) = 2T(n/2) + Θ(n)    if n > 1 .      (2.1)

In Chapter 4, we shall see the “master theorem,” which we can use to show that T(n) is Θ(n lg n), where lg n stands for log₂ n. Because the logarithm function grows more slowly than any linear function, for large enough inputs, merge sort, with its Θ(n lg n) running time, outperforms insertion sort, whose running time is Θ(n²), in the worst case.

We do not need the master theorem to understand intuitively why the solution to recurrence (2.1) is T(n) = Θ(n lg n). Let us rewrite recurrence (2.1) as

    T(n) = c               if n = 1 ,
    T(n) = 2T(n/2) + cn    if n > 1 ,      (2.2)

where the constant c represents the time required to solve problems of size 1 as well as the time per array element of the divide and combine steps.⁹

⁹ It is unlikely that the same constant exactly represents both the time to solve problems of size 1 and the time per array element of the divide and combine steps. We can get around this problem by letting c be the larger of these times and understanding that our recurrence gives an upper bound on the running time, or by letting c be the lesser of these times and understanding that our recurrence gives a lower bound on the running time. Both bounds are on the order of n lg n and, taken together, give a Θ(n lg n) running time.

Figure 2.5 shows how we can solve recurrence (2.2). For convenience, we assume that n is an exact power of 2. Part (a) of the figure shows T(n), which we expand in part (b) into an equivalent tree representing the recurrence. The cn term is the root (the cost incurred at the top level of recursion), and the two subtrees of the root are the two smaller recurrences T(n/2). Part (c) shows this process carried one step further by expanding T(n/2). The cost incurred at each of the two subnodes at the second level of recursion is cn/2. We continue expanding each node in the tree by breaking it into its constituent parts as determined by the recurrence, until the problem sizes get down to 1, each with a cost of c. Part (d) shows the resulting recursion tree.

Next, we add the costs across each level of the tree. The top level has total cost cn, the next level down has total cost c(n/2) + c(n/2) = cn, the level after that has total cost c(n/4) + c(n/4) + c(n/4) + c(n/4) = cn, and so on. In general, level i below the top has 2ⁱ nodes, each contributing a cost of c(n/2ⁱ), so that the ith level below the top has total cost 2ⁱ · c(n/2ⁱ) = cn. The bottom level has n nodes, each contributing a cost of c, for a total cost of cn.

The total number of levels of the recursion tree in Figure 2.5 is lg n + 1, where n is the number of leaves, corresponding to the input size. An informal inductive argument justifies this claim. The base case occurs when n = 1, in which case the tree has only one level. Since lg 1 = 0, we have that lg n + 1 gives the correct number of levels. Now assume as an inductive hypothesis that the number of levels of a recursion tree with 2ⁱ leaves is lg 2ⁱ + 1 = i + 1 (since for any value of i, we have that lg 2ⁱ = i). Because we are assuming that the input size is a power of 2, the next input size to consider is 2ⁱ⁺¹. A tree with n = 2ⁱ⁺¹ leaves has one more level than a tree with 2ⁱ leaves, and so the total number of levels is (i + 1) + 1 = lg 2ⁱ⁺¹ + 1.

To compute the total cost represented by the recurrence (2.2), we simply add up the costs of all the levels. The recursion tree has lg n + 1 levels, each costing cn, for a total cost of cn(lg n + 1) = cn lg n + cn. Ignoring the low-order term and the constant c gives the desired result of Θ(n lg n).
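As a quick sanity check, one can evaluate recurrence (2.2) directly in a few lines of Python and compare it with the closed form cn lg n + cn; taking c = 1 for concreteness (an assumption made here only to keep the arithmetic exact), the two agree on every power of 2.

    # T(n) = 2T(n/2) + c*n with T(1) = c, evaluated for c = 1.
    def T(n):
        return 1 if n == 1 else 2 * T(n // 2) + n

    for k in range(11):
        n = 2 ** k
        # closed form: c*n*lg n + c*n = n*k + n when c = 1 and n = 2**k
        assert T(n) == n * k + n
    print("recurrence (2.2) matches n lg n + n for n = 1, 2, ..., 1024")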

Exercises

2.3-1

Using Figure 2.4 as a model, illustrate the operation of merge sort on the array A = ⟨3, 41, 52, 26, 38, 57, 9, 49⟩.

2.3-2

Rewrite the MERGE procedure so that it does not use sentinels, instead stopping once either array L or R has had all its elements copied back to A and then copying the remainder of the other array back into A.
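One possible shape of such a rewrite, offered only as a sketch of the procedure the exercise describes (0-based indexing; not the book's solution):

    def merge_no_sentinels(A, p, q, r):
        L = A[p:q + 1]
        R = A[q + 1:r + 1]
        i = j = 0
        k = p
        while i < len(L) and j < len(R):   # stop when either run empties
            if L[i] <= R[j]:
                A[k] = L[i]; i += 1
            else:
                A[k] = R[j]; j += 1
            k += 1
        # Copy the remainder of whichever run is not yet exhausted.
        A[k:r + 1] = L[i:] if i < len(L) else R[j:]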

[Figure 2.5 here. Panels (a)–(d) show T(n) expanding into a recursion tree: the root costs cn, its two children cost cn/2 each, their children cn/4 each, down to n leaves of cost c; every level totals cn.]

Figure 2.5 How to construct a recursion tree for the recurrence T(n) = 2T(n/2) + cn. Part (a) shows T(n), which progressively expands in (b)–(d) to form the recursion tree. The fully expanded tree in part (d) has lg n + 1 levels (i.e., it has height lg n, as indicated), and each level contributes a total cost of cn. The total cost, therefore, is cn lg n + cn, which is Θ(n lg n).

2.3-3

Use mathematical induction to show that when n is an exact power of 2, the solution of the recurrence

    T(n) = 2              if n = 2 ,
    T(n) = 2T(n/2) + n    if n = 2ᵏ, for k > 1 ,

is T(n) = n lg n.
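As a hint toward the inductive step (a sketch of one possible argument, not a full proof): assuming T(2ᵏ⁻¹) = 2ᵏ⁻¹(k − 1) for some k > 1, the recurrence gives

    T(2ᵏ) = 2T(2ᵏ⁻¹) + 2ᵏ = 2ᵏ(k − 1) + 2ᵏ = 2ᵏk ,

which equals n lg n for n = 2ᵏ; the base case T(2) = 2 = 2 lg 2 remains to be checked.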

2.3-4

We can express insertion sort as a recursive procedure as follows. In order to sort A[1..n], we recursively sort A[1..n−1] and then insert A[n] into the sorted array A[1..n−1]. Write a recurrence for the running time of this recursive version of insertion sort.
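A minimal Python sketch of the procedure just described (0-based indexing; an illustration, not a prescribed implementation):

    def recursive_insertion_sort(A, n):
        # Sort A[0..n-1] by sorting A[0..n-2], then inserting A[n-1].
        if n > 1:
            recursive_insertion_sort(A, n - 1)
            key = A[n - 1]
            i = n - 2
            while i >= 0 and A[i] > key:   # shift larger elements right
                A[i + 1] = A[i]
                i -= 1
            A[i + 1] = key

    A = [5, 2, 4, 6, 1, 3]
    recursive_insertion_sort(A, len(A))
    print(A)                               # [1, 2, 3, 4, 5, 6]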

2.3-5

Referring back to the searching problem (see Exercise 2.1-3), observe that if the sequence A is sorted, we can check the midpoint of the sequence against v and eliminate half of the sequence from further consideration. The binary search algorithm repeats this procedure, halving the size of the remaining portion of the sequence each time. Write pseudocode, either iterative or recursive, for binary search. Argue that the worst-case running time of binary search is Θ(lg n).
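One possible iterative rendering in Python (a sketch; returning an index or None is a convention chosen here for illustration):

    def binary_search(A, v):
        # A must be sorted. Each iteration halves the remaining range,
        # so the loop runs at most about lg n times.
        lo, hi = 0, len(A) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if A[mid] == v:
                return mid
            elif A[mid] < v:
                lo = mid + 1               # v can only be in the right half
            else:
                hi = mid - 1               # v can only be in the left half
        return None                        # v is not present

    print(binary_search([1, 2, 3, 5, 8, 13], 5))    # 3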

2.3-6

Observe that the while loop of lines 5–7 of the INSERTION-SORT procedure in Section 2.1 uses a linear search to scan (backward) through the sorted subarray A[1..j−1]. Can we use a binary search (see Exercise 2.3-5) instead to improve the overall worst-case running time of insertion sort to Θ(n lg n)?

2.3-7 ★

Describe a Θ(n lg n)-time algorithm that, given a set S of n integers and another integer x, determines whether or not there exist two elements in S whose sum is exactly x.
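One well-known approach (a sketch, not necessarily the intended solution): sort S in Θ(n lg n) time, then walk two pointers inward in Θ(n) time.

    def has_pair_with_sum(S, x):
        A = sorted(S)                      # Theta(n lg n); merge sort would do
        i, j = 0, len(A) - 1
        while i < j:
            s = A[i] + A[j]
            if s == x:
                return True
            elif s < x:
                i += 1                     # need a larger sum
            else:
                j -= 1                     # need a smaller sum
        return False

    print(has_pair_with_sum({1, 4, 7, 10}, 11))     # True: 4 + 7 = 11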

Problems

2-1 Insertion sort on small arrays in merge sort

Although merge sort runs in Θ(n lg n) worst-case time and insertion sort runs in Θ(n²) worst-case time, the constant factors in insertion sort can make it faster in practice for small problem sizes on many machines. Thus, it makes sense to coarsen the leaves of the recursion by using insertion sort within merge sort when subproblems become sufficiently small. Consider a modification to merge sort in which n/k sublists of length k are sorted using insertion sort and then merged using the standard merging mechanism, where k is a value to be determined.

a. Show that insertion sort can sort the n/k sublists, each of length k, in Θ(nk) worst-case time.

b. Show how to merge the sublists in Θ(n lg(n/k)) worst-case time.

c. Given that the modified algorithm runs in Θ(nk + n lg(n/k)) worst-case time, what is the largest value of k as a function of n for which the modified algorithm has the same running time as standard merge sort, in terms of Θ-notation?

d. How should we choose k in practice?
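To make the modification concrete, here is a hedged Python sketch of the coarsened algorithm, reusing the merge procedure from the merge sort sketch earlier in this section; the cutoff k = 16 is an arbitrary illustrative choice, not a recommendation.

    def insertion_sort_range(A, p, r):
        # Standard insertion sort restricted to the subarray A[p..r].
        for j in range(p + 1, r + 1):
            key = A[j]
            i = j - 1
            while i >= p and A[i] > key:
                A[i + 1] = A[i]
                i -= 1
            A[i + 1] = key

    def hybrid_merge_sort(A, p, r, k=16):
        if r - p + 1 <= k:
            insertion_sort_range(A, p, r)  # coarsened leaf of the recursion
        else:
            q = (p + r) // 2
            hybrid_merge_sort(A, p, q, k)
            hybrid_merge_sort(A, q + 1, r, k)
            merge(A, p, q, r)              # merge() from the earlier sketch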

2-2 Correctness of bubblesort

Bubblesort is a popular, but inefficient, sorting algorithm. It works by repeatedly swapping adjacent elements that are out of order.

BUBBLESORT(A)

1  for i = 1 to A.length − 1
2      for j = A.length downto i + 1
3          if A[j] < A[j − 1]
4              exchange A[j] with A[j − 1]
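For experimentation, the pseudocode translates directly into Python (a sketch with 0-based indexing):

    def bubblesort(A):
        n = len(A)
        for i in range(n - 1):             # line 1: i = 1 to n - 1
            for j in range(n - 1, i, -1):  # line 2: j = n downto i + 1
                if A[j] < A[j - 1]:        # line 3
                    A[j], A[j - 1] = A[j - 1], A[j]   # line 4: exchange

    A = [5, 1, 4, 2, 8]
    bubblesort(A)
    print(A)                               # [1, 2, 4, 5, 8]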

a. Let A′ denote the output of BUBBLESORT(A). To prove that BUBBLESORT is correct, we need to prove that it terminates and that

    A′[1] ≤ A′[2] ≤ ⋯ ≤ A′[n] ,      (2.3)

where n = A.length. In order to show that BUBBLESORT actually sorts, what else do we need to prove?

The next two parts will prove inequality (2.3).

b. State precisely a loop invariant for the for loop in lines 2–4, and prove that this loop invariant holds. Your proof should use the structure of the loop invariant proof presented in this chapter.

c. Using the termination condition of the loop invariant proved in part (b), state a loop invariant for the for loop in lines 1–4 that will allow you to prove inequality (2.3). Your proof should use the structure of the loop invariant proof presented in this chapter.

d. What is the worst-case running time of bubblesort? How does it compare to the running time of insertion sort?

2-3 Correctness of Horner’s rule

The following code fragment implements Horner’s rule for evaluating a polynomial

    P(x) = Σₖ₌₀ⁿ aₖxᵏ = a₀ + x(a₁ + x(a₂ + ⋯ + x(aₙ₋₁ + xaₙ) ⋯ )) ,

given the coefficients a₀, a₁, …, aₙ and a value for x:

1  y = 0
2  for i = n downto 0
3      y = aᵢ + x · y
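In Python the fragment reads as follows (a sketch; representing the coefficients as a list a with a[i] holding aᵢ is an assumption made here for illustration):

    def horner(a, x):
        # Evaluate a[0] + a[1]*x + ... + a[n]*x**n with n multiplications.
        y = 0
        for i in range(len(a) - 1, -1, -1):    # i = n downto 0
            y = a[i] + x * y
        return y

    print(horner([1, 2, 3], 2))    # 1 + 2*2 + 3*4 = 17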

a. In terms of Θ-notation, what is the running time of this code fragment for Horner’s rule?

b. Write pseudocode to implement the naive polynomial-evaluation algorithm that computes each term of the polynomial from scratch. What is the running time of this algorithm? How does it compare to Horner’s rule?

c. Consider the following loop invariant:

At the start of each iteration of the for loop of lines 2–3,

    y = Σₖ₌₀ⁿ⁻⁽ⁱ⁺¹⁾ aₖ₊ᵢ₊₁ xᵏ .

Interpret a summation with no terms as equaling 0. Following the structure of the loop invariant proof presented in this chapter, use this loop invariant to show that, at termination,

    y = Σₖ₌₀ⁿ aₖxᵏ .

d. Conclude by arguing that the given code fragment correctly evaluates a polynomial characterized by the coefficients a₀, a₁, …, aₙ.

2-4 Inversions

Let A[1..n] be an array of n distinct numbers. If i < j and A[i] > A[j], then the pair (i, j) is called an inversion of A.

a. List the five inversions of the array ⟨2, 3, 8, 6, 1⟩.

b. What array with elements from the set {1, 2, …, n} has the most inversions? How many does it have?

c. What is the relationship between the running time of insertion sort and the number of inversions in the input array? Justify your answer.

d. Give an algorithm that determines the number of inversions in any permutation on n elements in Θ(n lg n) worst-case time. (Hint: Modify merge sort.)
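Following the hint, one common realization (a sketch, not the only solution) piggybacks on merge sort: whenever a right-run element is copied before the left run is empty, it is smaller than everything remaining in the left run, so those inversions can be counted in one step.

    def count_inversions(A):
        # Returns (sorted copy of A, number of inversions), in Theta(n lg n).
        if len(A) <= 1:
            return A, 0
        mid = len(A) // 2
        L, left_inv = count_inversions(A[:mid])
        R, right_inv = count_inversions(A[mid:])
        merged, i, j, cross = [], 0, 0, 0
        while i < len(L) and j < len(R):
            if L[i] <= R[j]:
                merged.append(L[i]); i += 1
            else:
                cross += len(L) - i    # R[j] is smaller than all of L[i:]
                merged.append(R[j]); j += 1
        merged += L[i:] + R[j:]        # one of these is empty
        return merged, left_inv + right_inv + cross

    print(count_inversions([2, 3, 8, 6, 1])[1])    # 5 (compare part (a))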

Chapter notes

In 1968, Knuth published the first of three volumes with the general title The Art of Computer Programming [209, 210, 211]. The first volume ushered in the modern study of computer algorithms with a focus on the analysis of running time, and the full series remains an engaging and worthwhile reference for many of the topics presented here. According to Knuth, the word “algorithm” is derived from the name “al-Khowârizmî,” a ninth-century Persian mathematician.

Aho, Hopcroft, and Ullman [5] advocated the asymptotic analysis of algorithms—using notations that Chapter 3 introduces, including Θ-notation—as a means of comparing relative performance. They also popularized the use of recurrence relations to describe the running times of recursive algorithms.

Knuth [211] provides an encyclopedic treatment of many sorting algorithms. His comparison of sorting algorithms (page 381) includes exact step-counting analyses, like the one we performed here for insertion sort. Knuth’s discussion of insertion sort encompasses several variations of the algorithm. The most important of these is Shell’s sort, introduced by D. L. Shell, which uses insertion sort on periodic subsequences of the input to produce a faster sorting algorithm.

Merge sort is also described by Knuth. He mentions that a mechanical collator capable of merging two decks of punched cards in a single pass was invented in 1938. J. von Neumann, one of the pioneers of computer science, apparently wrote a program for merge sort on the EDVAC computer in 1945.

The early history of proving programs correct is described by Gries [153], who credits P. Naur with the first article in this field. Gries attributes loop invariants to R. W. Floyd. The textbook by Mitchell [256] describes more recent progress in proving programs correct.

3 Growth of Functions

The order of growth of the running time of an algorithm, defined in Chapter 2, gives a simple characterization of the algorithm’s efficiency and also allows us to compare the relative performance of alternative algorithms. Once the input size n becomes large enough, merge sort, with its Θ(n lg n) worst-case running time, beats insertion sort, whose worst-case running time is Θ(n²). Although we can sometimes determine the exact running time of an algorithm, as we did for insertion sort in Chapter 2, the extra precision is not usually worth the effort of computing it. For large enough inputs, the multiplicative constants and lower-order terms of an exact running time are dominated by the effects of the input size itself.

When we look at input sizes large enough to make only the order of growth of the running time relevant, we are studying the asymptotic efficiency of algorithms.

That is, we are concerned with how the running time of an algorithm increases with the size of the input in the limit, as the size of the input increases without bound.

Usually, an algorithm that is asymptotically more efficient will be the best choice for all but very small inputs.

This chapter gives several standard methods for simplifying the asymptotic analysis of algorithms. The next section begins by defining several types of “asymptotic notation,” of which we have already seen an example in Θ-notation. We then present several notational conventions used throughout this book, and finally we review the behavior of functions that commonly arise in the analysis of algorithms.
