
The control of a large force is the same principle as the control of a few men:

it is merely a question of dividing up their numbers.

— Sun Zi, The Art of War (c. 400 C.E.), translated by Lionel Giles (1910)

My lover gave me nine linked rings.
With my two hands I could not untangle them, I could not untangle them.
My lover, please untangle my nine linked rings, nine linked rings.
I will marry you and you will be my man.

— “Nine Linked Rings”, Chinese folk song (before 1800)

Our life is frittered away by detail. . . . Simplify, simplify.

— Henry David Thoreau, Walden (1854)

Nothing is particularly hard if you divide it into small jobs.

— Henry Ford

Do the hard jobs first. The easy jobs will take care of themselves.

— Dale Carnegie

CHAPTER 1

Recursion

Status: Beta. Explain domain transformations?

1.1 Reductions

Reduction is the single most common technique used in designing algorithms. Reducing one problem X to another problem Y means to write an algorithm for X that uses an algorithm for Y as a black box or subroutine. Crucially, the correctness of the resulting algorithm cannot depend in any way on how the algorithm for Y works. The only thing we can assume is that the black box solves Y correctly. The inner workings of the black box are simply none of our business; they’re somebody else’s problem. It’s often best to literally think of the black box as functioning purely by magic.

For example, the peasant multiplication algorithm described in the previous chapter reduces the problem of multiplying two arbitrary positive integers to three simpler problems: addition, mediation (halving), and parity-checking. The algorithm relies on an abstract “positive integer” data type that supports those three operations, but the correctness of the multiplication algorithm does not depend on the precise data

© Copyright 2018 Jeff Erickson.

This work is licensed under a Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/4.0/).


representation (tally marks, clay tokens, Babylonian hexagesimal, quipu, counting rods, Roman numerals, abacus beads, finger positions, Arabic numerals, binary, negabinary, Gray code, balanced ternary, Fibonacci coding, . . . ), or on the precise implementations of those operations. Of course, the running time of the multiplication algorithm depends on the running time of the addition, mediation, and parity operations, but that’s a separate issue from correctness. Most importantly, we can create a more efficient multiplication algorithm just by switching to a more efficient number representation (from Roman numerals to Arabic numerals, for example).

Similarly, the Huntington-Hill algorithm reduces the problem of apportioning Congress to the problem of maintaining a priority queue that supports the operations Insert and ExtractMax. The abstract data type “priority queue” is a black box; the correctness of the apportionment algorithm does not depend on any specific priority queue data structure. Of course, the running time of the apportionment algorithm depends on the running time of the Insert and ExtractMax algorithms, but that’s a separate issue from the correctness of the algorithm. The beauty of the reduction is that we can create a more efficient apportionment algorithm by simply swapping in a new priority queue data structure. Moreover, the designer of that data structure does not need to know or care that it will be used to apportion Congress.

When we design algorithms, we may not know exactly how the basic building blocks we use are implemented, or how our algorithms might be used as building blocks to solve even bigger problems. That ignorance is uncomfortable for many beginners, but it is both unavoidable and extremely useful. Even when you do know precisely how your components work, it is often extremely helpful to pretend that you don’t.

1.2 Simplify and Delegate

Recursion is a particularly powerful kind of reduction, which can be described loosely as follows:

• If the given instance of the problem can be solved directly, just solve it directly.

• Otherwise, reduce the instance to one or more simpler instances of the same problem.

If this self-reference is confusing, it’s helpful to imagine that someone else is going to solve the simpler problems, just as you would assume for other types of reductions. I like to call that someone else the Recursion Fairy. Your only task is to simplify the original problem, or to solve it directly when simplification is either unnecessary or impossible;

the Recursion Fairy will magically take care of all the simpler subproblems for you, using Methods That Are None Of Your Business So Butt Out.1 Mathematically sophisticated

1 When I was a student, I used to attribute recursion to “elves” instead of the Recursion Fairy, referring to the Brothers Grimm story about an old shoemaker who leaves his work unfinished when he goes to bed, only to discover upon waking that elves (“Wichtelmänner”) have finished everything overnight. Someone more entheogenically experienced than I might recognize them as Terence McKenna’s “self-transforming machine elves”.


readers might recognize the Recursion Fairy by its more formal name: the Induction Hypothesis.

There is one mild technical condition that must be satisfied in order for any recursive method to work correctly: There must be no infinite sequence of reductions to simpler and simpler instances. Eventually, the recursive reductions must lead to an elementary base case that can be solved by some other method; otherwise, the recursive algorithm will loop forever. The most common way to satisfy this condition is to reduce to one or more smaller instances of the same problem. For example, if the original input is a skreeble with n glurps, the input to each recursive call should be a skreeble with strictly less than n glurps. Of course this is impossible if the skreeble has no glurps at all—You can’t have negative glurps; that would be silly!—so in that case we must grindlebloff the skreeble using some other method.

We’ve already seen one instance of this pattern in the peasant multiplication algorithm, which is based directly on the following identity.

x · y =
  0                        if x = 0
  ⌊x/2⌋ · (y + y)          if x is even
  ⌊x/2⌋ · (y + y) + y      if x is odd

The same identity can be expressed algorithmically as follows:

Multiply(x, y):
  if x = 0
    return 0
  else
    x′ ← ⌊x/2⌋
    y′ ← y + y
    prod ← Multiply(x′, y′)   〈〈Recurse!〉〉
    if x is odd
      prod ← prod + y
    return prod

A lazy Egyptian scribe could execute this algorithm by computing x′ and y′, asking a more junior scribe to multiply x′ and y′, and then possibly adding y to the junior scribe’s response. The junior scribe’s problem is simpler because x′ < x, and repeatedly reducing a positive integer eventually leads to 0. How the junior scribe actually computes x′ · y′ is none of the senior scribe’s business (and it’s none of your business, either).
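For readers who want to experiment, here is a minimal Python transcription of this pseudocode; the function name multiply and the final sanity check are mine, not part of the text.

    def multiply(x, y):
        """Peasant multiplication: reduce x*y to halving, doubling,
        and parity checks, plus addition."""
        if x == 0:
            return 0
        prod = multiply(x // 2, y + y)   # recurse on a strictly smaller x
        if x % 2 == 1:                   # if x is odd, add back one copy of y
            prod = prod + y
        return prod

    assert multiply(42, 37) == 42 * 37   # sanity check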

1.3 Tower of Hanoi

The Tower of Hanoi puzzle was first published—as an actual physical puzzle!—by the French recreational mathematician Édouard Lucas in 1883, under the pseudonym



“N. Claus (de Siam)” (an anagram of “Lucas d’Amiens”).2 The following year, Henri de Parville described the puzzle with the following remarkable story:3

In the great temple at Benares beneath the dome which marks the centre of the world, rests a brass plate in which are fixed three diamond needles, each a cubit high and as thick as the body of a bee. On one of these needles, at the creation, God placed sixty-four discs of pure gold, the largest disc resting on the brass plate, and the others getting smaller and smaller up to the top one. This is the Tower of Bramah. Day and night unceasingly the priests transfer the discs from one diamond needle to another according to the fixed and immutable laws of Bramah, which require that the priest on duty must not move more than one disc at a time and that he must place this disc on a needle so that there is no smaller disc below it. When the sixty-four discs shall have been thus transferred from the needle on which at the creation God placed them to one of the other needles, tower, temple, and Brahmins alike will crumble into dust, and with a thunderclap the world will vanish.

Figure 1.1. The Tower of Hanoi puzzle

Of course, as good computer scientists, our first instinct on reading this story is to substitute the variable n for the hardwired constant 64. And following standard practice (since most physical instances of the puzzle are made of wood instead of diamonds and gold), we will refer to the three possible locations for the disks as “pegs” instead of “needles”. How can we move a tower of n disks from one peg to another, using a third peg as an occasional placeholder, without ever placing a disk on top of a smaller disk?

As Claus de Siam pointed out in the pamphlet included with his puzzle, the secret to solving this puzzle is to think recursively. Instead of trying to solve the entire puzzle all at once, let’s concentrate on moving just the largest disk. We can’t move it at the beginning, because all the other disks are covering it; we have to move those n − 1 disks to the third peg before we can move the largest disk. And then after we move the largest disk, we have to move those n − 1 disks back on top of it.

So now all we have to figure out is how to—

2 Lucas later claimed to have invented the puzzle in 1876.

3 This English translation is from W. W. Rouse Ball and H. S. M. Coxeter’s book Mathematical Recreations and Essays.


Figure 1.2. The Tower of Hanoi algorithm; ignore everything but the bottom disk.

STOP!! That’s it! We’re done! We’ve successfully reduced the n-disk Tower of Hanoi problem to two instances of the (n − 1)-disk Tower of Hanoi problem, which we can gleefully hand off to the Recursion Fairy—or to carry the original metaphor further, to the junior monks at the temple.

Our reduction does make one subtle but extremely important assumption: There is a largest disk. In other words, our recursive algorithm works for any n ≥ 1, but it breaks down when n = 0. We must handle that base case using a different method. Fortunately, the monks at Benares, being good Buddhists, are quite adept at moving zero disks from one peg to another in no time at all, by doing nothing.

Figure 1.3. The vacuous base case for the Tower of Hanoi algorithm. There is no spoon.

While it’s tempting to think about how all those smaller disks move around—or more generally,what happens when the recursion is unrolled—it’s completely unnecessary. For even slightly more complicated algorithms, unrolling the recursion is far more confusing than illuminating. Our only task is to reduce the problem instance we’re given to one or more simpler instances, or to solve the problem directly if such a reduction is impossible.

Our algorithm is trivially correct when n = 0. For any n ≥ 1, the Recursion Fairy correctly moves the top n − 1 disks (more formally, the Induction Hypothesis implies that our recursive algorithm correctly moves the top n − 1 disks), so our algorithm is correct.

The recursive Hanoi algorithm is expressed in pseudocode in Figure 1.4. The algorithm moves a stack of n disks from a source peg (src) to a destination peg (dst) using a third temporary peg (tmp) as a placeholder. Notice that the algorithm correctly does nothing at all when n = 0.

Hanoi(n, src, dst, tmp):
  if n > 0
    Hanoi(n − 1, src, tmp, dst)   〈〈Recurse!〉〉
    move disk n from src to dst
    Hanoi(n − 1, tmp, dst, src)   〈〈Recurse!〉〉

Figure 1.4. A recursive algorithm to solve the Tower of Hanoi

Let T(n) denote the number of moves required to transfer n disks—the running time of our algorithm. Our vacuous base case implies that T(0) = 0, and the more general recursive algorithm implies that T(n) = 2T(n − 1) + 1 for any n ≥ 1. By writing out the first several values of T(n), we can easily guess that T(n) = 2^n − 1; a straightforward

induction proof implies that this guess is correct. In particular, moving a tower of 64 disks requires 2^64 − 1 = 18,446,744,073,709,551,615 individual moves. Thus, even at the impressive rate of one move per second, the monks at Benares will be at work for approximately 585 billion years (“plus de cinq milliards de siècles”) before tower, temple, and Brahmins alike will crumble into dust, and with a thunderclap the world will vanish.
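A Python transcription of the algorithm in Figure 1.4 makes the move count easy to check empirically; the list-of-moves representation is my own choice, not the book’s.

    def hanoi(n, src=0, dst=2, tmp=1, moves=None):
        """Move n disks from peg src to peg dst via peg tmp.
        Returns the list of (disk, from_peg, to_peg) moves."""
        if moves is None:
            moves = []
        if n > 0:
            hanoi(n - 1, src, tmp, dst, moves)   # recurse!
            moves.append((n, src, dst))          # move disk n from src to dst
            hanoi(n - 1, tmp, dst, src, moves)   # recurse!
        return moves

    # T(n) = 2^n - 1 moves, as derived above:
    for n in range(11):
        assert len(hanoi(n)) == 2**n - 1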

1.4 Mergesort

Mergesort is one of the earliest algorithms proposed for sorting. According to Donald Knuth, it was proposed by John von Neumann as early as 1945.

1. Divide the input array into two subarrays of roughly equal size.

2. Recursively mergesort each of the subarrays.

3. Merge the newly-sorted subarrays into a single sorted array.

Input:    S O R T I N G E X A M P L
Divide:   S O R T I N | G E X A M P L
Recurse:  I N O R S T | A E G L M P X
Merge:    A E G I L M N O P R S T X

Figure 1.5. A mergesort example.

The first step is completely trivial—we only need to compute the median array index—and we can delegate the second step to the Recursion Fairy. All the real work is done in the final step; the two sorted subarrays can be merged using a simple linear-time algorithm. A complete description of the algorithm is given in Figure 1.6; to keep the recursive structure clear, I’ve extracted the merge step into an independent subroutine.

MergeSort(A[1 .. n]):
  if n > 1
    m ← ⌊n/2⌋
    MergeSort(A[1 .. m])
    MergeSort(A[m + 1 .. n])
    Merge(A[1 .. n], m)

Merge(A[1 .. n], m):
  i ← 1; j ← m + 1
  for k ← 1 to n
    if j > n
      B[k] ← A[i]; i ← i + 1
    else if i > m
      B[k] ← A[j]; j ← j + 1
    else if A[i] < A[j]
      B[k] ← A[i]; i ← i + 1
    else
      B[k] ← A[j]; j ← j + 1
  for k ← 1 to n
    A[k] ← B[k]

Figure 1.6. Mergesort

Correctness

To prove that this algorithm is correct, we apply our old friend induction twice, first to the Merge subroutine and then to the top-level MergeSort algorithm.

Lemma 1.1. Merge correctly merges the subarrays A[1 .. m] and A[m + 1 .. n], assuming those subarrays are sorted in the input.

Proof: Let A[1 .. n] be any array and m any integer such that the subarrays A[1 .. m] and A[m + 1 .. n] are sorted. We prove that for all k from 0 to n + 1, the last n − k + 1 iterations

of the main loop correctly merge A[i .. m] and A[j .. n] into B[k .. n]. The proof proceeds by induction on n − k + 1, the number of elements remaining to be merged.

If k > n, the algorithm correctly merges the two empty subarrays by doing absolutely nothing. (This is the base case of the inductive proof.) Otherwise, there are four cases to consider for the kth iteration of the main loop.

• If j > n, subarray A[j .. n] is empty, so min(A[i .. m] ∪ A[j .. n]) = A[i].

• Otherwise, if i > m, subarray A[i .. m] is empty, so min(A[i .. m] ∪ A[j .. n]) = A[j].

• Otherwise, if A[i] < A[j], then min(A[i .. m] ∪ A[j .. n]) = A[i].

• Otherwise, we must have A[i] ≥ A[j], and thus min(A[i .. m] ∪ A[j .. n]) = A[j].

In all four cases, B[k] is correctly assigned the smallest element of A[i .. m] ∪ A[j .. n]. In the two cases with the assignment B[k] ← A[i], the Recursion Fairy correctly merges—sorry, I mean the Induction Hypothesis implies that the last n − k iterations of the main loop correctly merge A[i + 1 .. m] and A[j .. n] into B[k + 1 .. n]. Similarly, in the other two cases, the Recursion Fairy correctly merges the rest of the subarrays. □

Theorem 1.2. MergeSort correctly sorts any input array A[1 .. n].

Proof: We prove the theorem by induction on n. If n ≤ 1, the algorithm correctly does nothing. Otherwise, the Recursion Fairy correctly sorts—sorry, I mean the induction hypothesis implies that our algorithm correctly sorts—the two smaller subarrays A[1 .. m] and A[m + 1 .. n], after which they are correctly Merged into a single sorted array (by Lemma 1.1). □

Analysis

What’s the running time? Because the MergeSort algorithm is recursive, its running time is easily expressed by a recurrence. Merge clearly takes linear time, because it’s a


simple for-loop with constant work per iteration. We immediately obtain the following recurrence for MergeSort:

T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + O(n).

As in most divide-and-conquer recurrences, we can safely strip out the floors and ceilings using a domain transformation 〈〈explain?〉〉, giving us the simpler recurrence


T(n) = 2T(n/2) + O(n).

The “all levels equal” case of the recursion tree method (described later in this chapter) immediately implies the closed-form solution T(n) = O(n log n). Even if you are not (yet) familiar with recursion trees, you can verify the solution T(n) = O(n log n) by induction.
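For concreteness, here is a Python transcription of Figure 1.6, using 0-indexed half-open ranges instead of the book’s 1-indexed arrays; the function names are mine.

    def merge_sort(A, lo=0, hi=None):
        """Sort A[lo:hi] in place by divide and conquer."""
        if hi is None:
            hi = len(A)
        if hi - lo > 1:
            mid = (lo + hi) // 2
            merge_sort(A, lo, mid)    # recursively sort the left half
            merge_sort(A, mid, hi)    # recursively sort the right half
            merge(A, lo, mid, hi)     # merge the two sorted halves

    def merge(A, lo, mid, hi):
        """Merge the sorted subarrays A[lo:mid] and A[mid:hi]."""
        B = []
        i, j = lo, mid
        for _ in range(lo, hi):
            if j >= hi:                 # right subarray exhausted
                B.append(A[i]); i += 1
            elif i >= mid:              # left subarray exhausted
                B.append(A[j]); j += 1
            elif A[i] < A[j]:
                B.append(A[i]); i += 1
            else:
                B.append(A[j]); j += 1
        A[lo:hi] = B

    data = list("SORTINGEXAMPL")
    merge_sort(data)
    assert data == sorted("SORTINGEXAMPL")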

1.5 Quicksort

Quicksort is another recursive sorting algorithm, discovered by Tony Hoare in 1962. In this algorithm, the hard work is splitting the array into subsets so that merging the final result is trivial.

1. Choose a pivot element from the array.

2. Partition the array into three subarrays containing the elements smaller than the pivot, the pivot element itself, and the elements larger than the pivot.

3. Recursively quicksort the first and last subarray.

Input:          S O R T I N G E X A M P L
Choose a pivot: S O R T I N G E X A M P L
Partition:      A G O E I N L M P T X S R
Recurse:        A E G I L M N O P R S T X

Figure 1.7. A quicksort example.

A more detailed description of the algorithm is given in Figure 1.8. In the separate Partition subroutine, the input parameter p is the index of the pivot element in the unsorted array; the subroutine partitions the array and returns the new index of the pivot.

Correctness

Just like mergesort, proving QuickSort is correct requires two separate induction proofs:

one to prove that Partition correctly partitions the array, and the other to prove that QuickSort correctly sorts assuming Partition is correct. I’ll leave the tedious details as an exercise for the reader.


QuickSort(A[1 .. n]):
  if (n > 1)
    Choose a pivot element A[p]
    r ← Partition(A, p)
    QuickSort(A[1 .. r − 1])
    QuickSort(A[r + 1 .. n])

Partition(A[1 .. n], p):
  swap A[p] ↔ A[n]
  i ← 0
  j ← n
  while (i < j)
    repeat i ← i + 1 until (i ≥ j or A[i] ≥ A[n])
    repeat j ← j − 1 until (i ≥ j or A[j] ≤ A[n])
    if (i < j)
      swap A[i] ↔ A[j]
  swap A[i] ↔ A[n]
  return i

Figure 1.8. Quicksort
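Here is a Python sketch of quicksort. The pseudocode above leaves the pivot choice open, so this sketch pivots on the last element, and it uses the simpler one-sided (Lomuto-style) partition loop rather than the two-pointer scan in Figure 1.8; both return the pivot’s final index.

    def quick_sort(A, lo=0, hi=None):
        """Sort A[lo:hi] in place. Pivot choice: last element."""
        if hi is None:
            hi = len(A)
        if hi - lo > 1:
            r = partition(A, lo, hi)
            quick_sort(A, lo, r)        # elements smaller than the pivot
            quick_sort(A, r + 1, hi)    # elements larger than the pivot

    def partition(A, lo, hi):
        """Partition A[lo:hi] around the pivot A[hi-1];
        return the pivot's final index."""
        pivot = A[hi - 1]
        i = lo
        for j in range(lo, hi - 1):
            if A[j] < pivot:
                A[i], A[j] = A[j], A[i]
                i += 1
        A[i], A[hi - 1] = A[hi - 1], A[i]
        return i

    data = list("SORTINGEXAMPL")
    quick_sort(data)
    assert data == sorted("SORTINGEXAMPL")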

Analysis

The analysis is also similar to mergesort. Partition runs in O(n) time: j − i = n at the beginning, j − i = 0 at the end, and we do a constant amount of work each time we increment i or decrement j. For QuickSort, we get a recurrence that depends on r, the rank of the chosen pivot element:

T(n) = T(r − 1) + T(n − r) + O(n)

If we could somehow choose the pivot to be the median element of the array A, we would have r = ⌈n/2⌉, the two subproblems would be as close to the same size as possible, and the recurrence would become

T(n) = T(⌈n/2⌉ − 1) + T(⌊n/2⌋) + O(n) ≤ 2T(n/2) + O(n),

and we’d have T(n) = O(n log n) by the recursion tree method.

In fact, as we will see shortly, we can locate the median element in an unsorted array in linear time. However, the algorithm is fairly complicated, and the hidden constant in the O(·) notation is large enough to make the resulting sorting algorithm impractical. In practice, most programmers settle for something simple, like choosing the first or last element of the array. In this case, r can take any value between 1 and n, so we have

T(n) = max_{1 ≤ r ≤ n} (T(r − 1) + T(n − r) + O(n)).

In the worst case, the two subproblems are completely unbalanced—either r = 1 or r = n—and the recurrence becomes T(n) ≤ T(n − 1) + O(n). The solution is T(n) = O(n²).

Another common heuristic is called “median of three”—choose three elements (usually at the beginning, middle, and end of the array), and take the median of those three elements as the pivot. Although this heuristic is somewhat more efficient in practice than just choosing one element, especially when the array is already (nearly) sorted, we can still have r = 2 or r = n − 1 in the worst case. With the median-of-three heuristic, the recurrence becomes T(n) ≤ T(1) + T(n − 2) + O(n), whose solution is still T(n) = O(n²).


Intuitively, the pivot element should "usually" fall somewhere in the middle of the array, say between n/10 and 9n/10. This observation suggests that the average-case running time should be O(n log n). Although this intuition is actually correct (at least under the right formal assumptions), we are still far from a proof that quicksort is usually efficient. We will formalize this intuition about average-case behavior in a later chapter.

1.6 The Pattern

Both mergesort and quicksort follow a general three-step pattern shared by all divide-and-conquer algorithms:

1. Divide the given instance of the problem into several independent smaller instances.

2. Delegate each smaller instance to the Recursion Fairy.

3. Combine the solutions for the smaller instances into the final solution for the given instance.

If the size of any subproblem falls below some constant threshold, the recursion bottoms out. Hopefully, at that point, the problem is trivial, but if not, we switch to a different algorithm instead.

Proving a divide-and-conquer algorithm correct almost always requires induction.

Analyzing the running time requires setting up and solving a recurrence, which usually (but unfortunately not always!) can be solved using recursion trees, perhaps after a simple domain transformation.

1.7 Recursion Trees

So what are these recursion trees I keep talking about? Imagine a divide-and-conquer algorithm that spends O(f(n)) time on non-recursive work, and then makes r recursive calls, each on a problem of size n/c. The running time of such an algorithm would be governed by the recurrence

T(n) = r T(n/c) + f(n)

Recursion trees are a simple, general, pictorial method for solving recurrences in this form, and in even more general forms. The root of the recursion tree is a box containing the value f(n); the root has r children, each of which is the root of a (recursively defined) recursion tree for the function T(n/c). Equivalently, a recursion tree is a complete r-ary tree where each node at depth i contains the value f(n/c^i). In practice, I recommend only drawing out the first two or three levels of the tree.

The recursion stops when we get to the base case(s) of the recurrence. Because we’re only looking for asymptotic bounds, the precise base case doesn’t matter; we can safely assume that T(1) = O(1), or even that T(n) = O(1) for all n ≤ 10^100. I’ll also assume for simplicity that n is an integral power of c (although this really doesn’t matter).


Figure 1.9. A recursion tree for the recurrence T(n) = r T(n/c) + f(n)

Now T(n) is just the sum of all values stored in the recursion tree; we can evaluate this sum by considering the tree level-by-level. For each i, the ith level of the tree contains r^i nodes, each with value f(n/c^i). Thus,

T(n) = ∑_{i=0}^{L} r^i · f(n/c^i)    (Σ)

where L is the depth of the recursion tree. We easily see that L = log_c n, because n/c^L = 1. The base case f(1) = Θ(1) implies that the last non-zero term in the summation is O(r^L) = O(r^{log_c n}) = O(n^{log_c r}).

There are three common cases where the level-by-level sum (Σ) is easy to evaluate:

• Decreasing: If the sum is a decreasing geometric series—every term is a constant factor smaller than the previous term—then the sum is dominated by the value at the root of the recursion tree: T(n) = O(f(n)).

• Equal: If all terms in the sum are equal, we immediately have T(n) = O(f(n) · L) = O(f(n) log n).

• Increasing: If the sum is an increasing geometric series—every term is a constant factor larger than the previous term—then the sum is dominated by the number of leaves in the recursion tree: T(n) = O(n^{log_c r}).

In the first and third cases, only the largest term in the geometric series matters; all other terms are swallowed up by the O(·) notation. In the decreasing case, we don’t even have to compute L; the asymptotic bound would still hold if the recursion tree were infinite!

For example, if we draw out the first few levels of the recursion tree for our mergesort recurrence T(n) = 2T(n/2) + O(n), we discover that all levels are equal, which implies T(n) = O(n log n).
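If the three cases feel abstract, it is easy to evaluate the level-by-level sum (Σ) numerically. This Python sketch (names mine) tabulates ∑ r^i · f(n/c^i) for the three cases with f(n) = n and c = 2; it assumes n is a power of c.

    import math

    def level_sum(n, r, c, f):
        """Evaluate sum_{i=0}^{L} r^i * f(n / c^i), where L = log_c n."""
        L = round(math.log(n, c))                 # depth of the recursion tree
        return sum(r**i * f(n / c**i) for i in range(L + 1))

    n = 2**20
    print(level_sum(n, 1, 2, lambda m: m))   # decreasing: ~2n, so O(f(n)) = O(n)
    print(level_sum(n, 2, 2, lambda m: m))   # equal: n(L+1), so O(n log n)
    print(level_sum(n, 4, 2, lambda m: m))   # increasing: ~2n^2, so O(n^(log_2 4))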


The recursion tree technique can also be used for algorithms where the recursive subproblems are not the same size. For example, the worst-case recurrence for quicksort T(n) = T(n − 1) + T(1) + O(n) gives us a completely unbalanced recursion tree, where one child of each internal node is a leaf. The level-by-level sum doesn’t fall into any of our three default categories, but we can still derive the solution T(n) = O(n²) by observing that every level value is at most n and there are at most n levels. (Moreover, since n/2 levels each have value at least n/2, this analysis can be improved by at most a constant factor, which for our purposes means not at all.)

Figure 1.10. The recursion trees for mergesort and quicksort

⋆1.8 Selection

So how do we find the median element of an array in linear time? The following algorithm was discovered by Manuel Blum, Bob Floyd, Vaughan Pratt, Ron Rivest, and Bob Tarjan in the early 1970s. Their algorithm actually solves the more general problem of selecting the kth largest element in an n-element array, given the array and the integer k as input, using a variant of an algorithm called either “quickselect” or “one-armed quicksort”.

The basic quickselect algorithm chooses a pivot element, partitions the array using the Partition subroutine from QuickSort, and then recursively searches only one of the two subarrays.

QuickSelect(A[1 .. n], k):
  if n = 1
    return A[1]
  else
    Choose a pivot element A[p]
    r ← Partition(A[1 .. n], p)
    if k < r
      return QuickSelect(A[1 .. r − 1], k)
    else if k > r
      return QuickSelect(A[r + 1 .. n], k − r)
    else
      return A[r]
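A Python sketch of quickselect, reusing the partition function from the quicksort sketch above. It keeps global indices and loops instead of recursing, which is equivalent to the one-armed tail recursion in the pseudocode; the uniformly random pivot is a simple rule of my choosing, not something the pseudocode commits to.

    import random

    def quick_select(A, k):
        """Return the k-th smallest element of A (1-based rank)."""
        lo, hi = 0, len(A)
        while True:
            if hi - lo == 1:
                return A[lo]
            p = random.randrange(lo, hi)        # simple pivot rule: uniform random
            A[p], A[hi - 1] = A[hi - 1], A[p]   # move pivot where partition expects it
            r = partition(A, lo, hi)            # from the quicksort sketch above
            if k - 1 < r:
                hi = r           # one-armed recursion on the left subarray
            elif k - 1 > r:
                lo = r + 1       # one-armed recursion on the right subarray
            else:
                return A[r]

    data = list("SORTINGEXAMPL")
    assert quick_select(data, 7) == sorted("SORTINGEXAMPL")[6]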


The worst-case running time of QuickSelect obeys a recurrence similar to the QuickSort recurrence. We don’t know the value of r or which subarray we’ll recursively search, so we’ll just assume the worst.

T(n) ≤ max_{1 ≤ r ≤ n} (max {T(r − 1), T(n − r)} + O(n))

We can simplify the recurrence by using ℓ to denote the length of the recursive subproblem:

T(n) ≤ max_{0 ≤ ℓ ≤ n − 1} T(ℓ) + O(n)

If the chosen pivot element is always either the smallest or largest element in the array, the recurrence simplifies to T(n) = T(n − 1) + O(n), which implies T(n) = O(n²). (The recursion tree for this recurrence is just a simple path.)

We could avoid this quadratic worst-case behavior if we could somehow magically choose a good pivot, meaning ℓ ≤ αn for some constant α < 1. In this case, the recurrence would simplify to

T(n) ≤ T(αn) + O(n).

This recurrence expands into a descending geometric series, which is dominated by its largest term, so T(n) = O(n). (Again, the recursion tree is just a simple path. The constant in the O(n) running time depends on the constant α.)

The Blum-Floyd-Pratt-Rivest-Tarjan algorithm chooses a good pivot for one-armed quicksort by recursively computing the median of a carefully-selected subset of the input array. Specifically, we divide the input array into ⌈n/5⌉ blocks, each containing exactly 5 elements, except possibly the last. (If the last block isn’t full, just throw in a few ∞s.) We compute the median of each block by brute force, collect those medians into a new array M[1 .. ⌈n/5⌉], and then recursively compute the median of this new array. Finally we use the median of medians (called “mom” in the following pseudocode) as the pivot in one-armed quicksort.

MomSelect(A[1 .. n], k):
  if n ≤ 25   〈〈or whatever〉〉
    use brute force
  else
    m ← ⌈n/5⌉
    for i ← 1 to m
      M[i] ← MedianOfFive(A[5i − 4 .. 5i])   〈〈Brute force!〉〉
    mom ← MomSelect(M[1 .. m], ⌊m/2⌋)   〈〈Recursion!〉〉
    r ← Partition(A[1 .. n], mom)
    if k < r
      return MomSelect(A[1 .. r − 1], k)   〈〈Recursion!〉〉
    else if k > r
      return MomSelect(A[r + 1 .. n], k − r)   〈〈Recursion!〉〉
    else
      return mom
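A Python sketch of MomSelect. It deviates from the pseudocode in two small ways, both deliberate: a partial final block contributes its own median rather than being padded with ∞s, and the partitioning is done with list comprehensions rather than the in-place Partition, trading memory for clarity.

    def mom_select(A, k):
        """Return the k-th smallest element of A (1-based rank),
        choosing the pivot by median of medians."""
        if len(A) <= 25:                          # or whatever
            return sorted(A)[k - 1]               # brute force
        blocks = [A[i:i + 5] for i in range(0, len(A), 5)]
        M = [sorted(b)[len(b) // 2] for b in blocks]   # block medians, brute force
        mom = mom_select(M, (len(M) + 1) // 2)         # recursion!
        smaller = [x for x in A if x < mom]
        larger = [x for x in A if x > mom]
        if k <= len(smaller):
            return mom_select(smaller, k)                           # recursion!
        elif k > len(A) - len(larger):
            return mom_select(larger, k - (len(A) - len(larger)))   # recursion!
        else:
            return mom

    import random
    data = list(range(1000))
    random.shuffle(data)
    assert mom_select(data, 500) == 499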


The first key insight is that the median of medians is in fact a good pivot. The median of medians is larger than ⌈n/5⌉/2 − 1 ≈ n/10 block medians, and each block median is larger than two other elements in its block. Thus, mom is larger than at least 3n/10 elements in the input array; symmetrically, mom is smaller than at least 3n/10 input elements. Thus, in the worst case, the last recursive call searches an array of size at most 7n/10.

We can visualize the algorithm’s behavior by drawing the input array as a 5 × ⌈n/5⌉ grid, in which each column represents five consecutive elements. For purposes of illustration, imagine that we sort every column from top down, and then we sort the columns by their middle element. (Let me emphasize that the algorithm does not actually do this!) In this arrangement, the median-of-medians is the element closest to the center of the grid.

Figure 1.11. Visualizing the median of medians

The left half of the first three rows of the grid contains 3n/10 elements, each of which is smaller than the median-of-medians. If the element we’re looking for is larger than the median-of-medians, our algorithm will throw away everything smaller than the median-of-medians, including those 3n/10 elements, before recursing. Thus, the input to the recursive subproblem contains at most 7n/10 elements. A symmetric argument applies when our target element is smaller than the median-of-medians.

Figure 1.12. Discarding approximately 3/10 of the array

Okay, so mom is a good pivot, but now the algorithm is making two recursive calls instead of just one; how do we know the resulting running time is still linear? The second key insight is that the total size of the two recursive subproblems is a constant factor smaller than the size of the original input array. The worst-case running time of the algorithm obeys the recurrence

T(n) ≤ O(n) + T(n/5) + T(7n/10).


If we draw out the recursion tree for this recursion, we observe that the total work at each level of the recursion tree is at most 9/10 the total work at the previous level. Thus, the level-by-level sum is a descending geometric series, giving us the solution T(n) = O(n).

Figure 1.13. The recursion trees for MomSelect and for a similar selection algorithm with blocks of size 3

Where did the magic constant 5 come from? That’s the smallest odd block size that gives us a descending geometric series in the running time! (Even block sizes introduce additional complications.) If we had used blocks of 3 elements instead, the running-time recurrence would have been

T(n) ≤ O(n) + T(n/3) + T(2n/3).

In this case, every level of the recursion tree has the same value n. The leaves of the recursion tree are not all at the same level, but for purposes of deriving an upper bound, it suffices to observe that the deepest leaf is at level log_{3/2} n, so the total work in the tree is at most O(n log_{3/2} n) = O(n log n). So this algorithm is no faster than sorting!

Finer analysis reveals that the constant hidden by the O(·) notation is quite large, even if we count only comparisons. Selecting the median of 5 elements requires at most 6 comparisons, so we need at most 6n/5 comparisons to set up the recursive subproblem.

Naïvely partitioning the array after the recursive call would require n − 1 comparisons, but we already know 3n/10 elements larger than the pivot and 3n/10 elements smaller than the pivot, so partitioning actually requires only 2n/5 additional comparisons. Thus, a more precise recurrence for the worst-case number of comparisons is

T(n) ≤ 8n/5 + T(n/5) + T(7n/10).

The recursion tree method implies the upper bound

T(n) ≤ (8n/5) · ∑_{i≥0} (9/10)^i = (8n/5) · 10 = 16n.

In practice, this algorithm isn’t as horrible as this worst-case analysis predicts—getting a worst-case partition at every level of recursion is incredibly unlikely—but it is slower than sorting for even moderately large arrays.


1.9 Multiplication

In Chapter 0, we saw two different algorithms for multiplying two n-digit numbers in O(n²) time: the grade-school lattice algorithm and the Egyptian peasant algorithm.

Perhaps we can get a more efficient algorithm by splitting the numbers in half, and exploiting the following identity:

(10^m a + b)(10^m c + d) = 10^{2m} ac + 10^m (bc + ad) + bd

Here is a divide-and-conquer algorithm that computes the product of two n-digit numbers x and y, based on this formula. Each of the four sub-products e, f, g, h is computed recursively. The last line does not involve any multiplications, however; to multiply by a power of ten, we just shift the digits and fill in the right number of zeros.

Multiply(x, y, n):
  if n = 1
    return x · y
  else
    m ← ⌈n/2⌉
    a ← ⌊x/10^m⌋; b ← x mod 10^m   〈〈x = 10^m a + b〉〉
    c ← ⌊y/10^m⌋; d ← y mod 10^m   〈〈y = 10^m c + d〉〉
    e ← Multiply(a, c, m)
    f ← Multiply(b, d, m)
    g ← Multiply(b, c, m)
    h ← Multiply(a, d, m)
    return 10^{2m} e + 10^m (g + h) + f

Correctness of this algorithm follows easily by induction. The running time for this algorithm is given by the recurrence

T(n) = 4T(⌈n/2⌉) + O(n).

(We can safely ignore the ceiling in the recursive argument.) The recursion tree method transforms this recurrence into an increasing geometric series, which implies T(n) = O(n^{log_2 4}) = O(n²). Hmm. . . I guess this didn’t help after all.

Figure 1.14. The recursion tree for naive divide-and-conquer multiplication

In the mid-1950s, Andrei Kolmogorov, one of the giants of 20th century mathematics, publicly conjectured that there is no algorithm to multiply two n-digit numbers in o(n²)


time. Kolmogorov organized a seminar at Moscow University in 1960, where he restated his “n² conjecture” and posed several related problems that he planned to discuss at future meetings. Almost exactly one week later, 23-year-old student Anatoliĭ Karatsuba presented Kolmogorov with a remarkable counterexample. According to Karatsuba himself,

After the seminar I told Kolmogorov about the new algorithm and about the disproof of the n² conjecture. Kolmogorov was very agitated because this contradicted his very plausible conjecture. At the next meeting of the seminar, Kolmogorov himself told the participants about my method, and at that point the seminar was terminated.

Karatsuba observed that the middle coefficient bc + ad can be computed from the other two coefficients ac and bd using only one more recursive multiplication, via the following algebraic identity:

ac + bd − (a − b)(c − d) = bc + ad

This trick lets us replace the four recursive calls in the previous algorithm with just three recursive calls, as shown below:

FastMultiply(x, y, n):
  if n = 1
    return x · y
  else
    m ← ⌈n/2⌉
    a ← ⌊x/10^m⌋; b ← x mod 10^m   〈〈x = 10^m a + b〉〉
    c ← ⌊y/10^m⌋; d ← y mod 10^m   〈〈y = 10^m c + d〉〉
    e ← FastMultiply(a, c, m)
    f ← FastMultiply(b, d, m)
    g ← FastMultiply(a − b, c − d, m)
    return 10^{2m} e + 10^m (e + f − g) + f
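Here is a Python sketch of Karatsuba multiplication for non-negative integers, working in base 10 as in the text. The sign bookkeeping for (a − b)(c − d) and the recomputation of the digit count at each level are my own implementation choices, not part of the pseudocode.

    def fast_multiply(x, y):
        """Karatsuba multiplication of non-negative integers, base 10."""
        if x < 10 or y < 10:
            return x * y                       # one-digit factor: multiply directly
        n = max(len(str(x)), len(str(y)))
        m = (n + 1) // 2                       # split position: ceil(n/2)
        a, b = divmod(x, 10**m)                # x = 10^m * a + b
        c, d = divmod(y, 10**m)                # y = 10^m * c + d
        e = fast_multiply(a, c)
        f = fast_multiply(b, d)
        g = fast_multiply(abs(a - b), abs(c - d))
        sign = 1 if (a >= b) == (c >= d) else -1   # sign of (a - b)(c - d)
        return 10**(2*m) * e + 10**m * (e + f - sign * g) + f

    assert fast_multiply(31415926, 27182818) == 31415926 * 27182818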

The running time of Karatsuba’s FastMultiply algorithm is given by the recurrence

T(n) ≤ 3T(⌈n/2⌉) + O(n)

Again, we can safely ignore the ceiling in the recursive argument, and the recursion tree method transforms the recurrence into an increasing geometric series, but the new solution is only T(n) = O(n^{log_2 3}) = O(n^{1.58496}), a significant improvement over our earlier quadratic-time algorithm.4 Karatsuba’s algorithm arguably launched the design and analysis of algorithms as a formal field of study.

Of course, in practice, all this is done in binary (or perhaps base 256 or 65536) instead of decimal.

4 My presentation simplifies the actual history slightly. In fact, Karatsuba proposed an algorithm based on the formula (a + b)(c + d) − ac − bd = bc + ad. This algorithm also runs in O(n^{lg 3}) time, but the actual recurrence is slightly messier: a − b and c − d are still m-digit numbers, but a + b and c + d might each have m + 1 digits. The simplification presented here is due to Donald Knuth.


We can take this idea even further, splitting the numbers into more pieces and combining them in more complicated ways, to obtain even faster multiplication algorithms.

Andrei Toom and Stephen Cook discovered an infinite family of algorithms that split any integer into k parts, each with n/k digits, and then compute the product using only 2k − 1 recursive multiplications. For any fixed k, the resulting algorithm runs in O(n^{1+1/(lg k)}) time, where the hidden constant in the O(·) notation depends on k.

Ultimately, this divide-and-conquer strategy led Gauss (yes, really) to the discovery of the Fast Fourier transform, which is described in detail in a later chapter. The fastest multiplication algorithm known, published by Martin Fürer in 2007 and based on FFTs, runs in O(n log n · 2^{O(lg* n)}) time. Here, lg* n denotes the slowly growing iterated logarithm of n, which is the number of times one must take the logarithm of n before the value is less than 1:

lg* n =
  1               if n ≤ 2
  1 + lg*(lg n)   otherwise

For all practical purposes, lg* n ≤ 6. It is widely conjectured that the best possible algorithm for multiplying two n-digit numbers runs in Θ(n log n) time.
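The iterated logarithm is easy to compute directly from this definition; here is a tiny Python sketch (the name lg_star is mine):

    import math

    def lg_star(n):
        """Iterated base-2 logarithm, following the recursive definition above."""
        if n <= 2:
            return 1
        return 1 + lg_star(math.log2(n))

    assert [lg_star(2), lg_star(4), lg_star(16), lg_star(65536)] == [1, 2, 3, 4]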

1.10 Exponentiation

Given a number a and a positive integer n, suppose we want to compute an. The standard naïve method is a simple for-loop that does n − 1 multiplications by a:

SlowPower(a, n):
  x ← a
  for i ← 2 to n
    x ← x · a
  return x

This iterative algorithm requires n − 1 multiplications.

Notice that the input a could be an integer, or a rational, or a floating point number. In fact, it doesn’t need to be a number at all, as long as it’s something that we know how to multiply. For example, the same algorithm can be used to compute powers modulo some finite number (an operation commonly used in cryptography algorithms) or to compute powers of matrices (an operation used to evaluate recurrences and to compute shortest paths in graphs). Since we don’t know what kind of things we’re multiplying, we can’t know how long a multiplication takes, so we’re forced to analyze the running time in terms of the number of multiplications.

There is a much faster divide-and-conquer method, using the following simple recursive formula:

a^n = a^⌊n/2⌋ · a^⌈n/2⌉.

What makes this approach more efficient is that once we compute the first factor a^⌊n/2⌋, we can compute the second factor a^⌈n/2⌉ using at most one more multiplication.


FastPower(a, n):
  if n = 1
    return a
  else
    x ← FastPower(a, ⌊n/2⌋)
    if n is even
      return x · x
    else
      return x · x · a
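In Python, with an optional modulus to illustrate the cryptographic use mentioned above (the built-in pow(a, n, m) already does the same job); the argument handling is mine.

    def fast_power(a, n, mod=None):
        """Compute a**n (n >= 1) with O(log n) multiplications.
        If mod is given, all arithmetic is reduced modulo mod."""
        if n == 1:
            return a if mod is None else a % mod
        x = fast_power(a, n // 2, mod)     # recurse on floor(n/2)
        result = x * x                     # x^2 = a^(2*floor(n/2))
        if n % 2 == 1:
            result = result * a            # odd exponent: one extra multiplication
        return result if mod is None else result % mod

    assert fast_power(3, 13) == 3**13
    assert fast_power(3, 13, 7) == pow(3, 13, 7)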

The total number of multiplications satisfies the recurrence T(n) ≤ T(n/2) + 2. The recursion-tree method immediately gives us the solution T(n) = O(log n).

Incidentally, this algorithm is asymptotically optimal—any algorithm for computing a^n must perform at least Ω(log n) multiplications. In fact, when n is a power of two, this algorithm is exactly optimal. However, there are slightly faster methods for other values of n. For example, our divide-and-conquer algorithm computes a^15 in six multiplications (a^15 = a^7 · a^7 · a; a^7 = a^3 · a^3 · a; a^3 = a · a · a), but only five multiplications are necessary (a → a^2 → a^3 → a^5 → a^10 → a^15). It is an open question whether the absolute minimum number of multiplications for a given exponent n can be computed efficiently.

Exercises

Tower of Hanoi

1. Prove that the original recursive Tower of Hanoi algorithm performs exactly the same sequence of moves—the same disks, to and from the same pegs, in the same order—as each of the following non-recursive algorithms. The pegs are labeled 0, 1, and 2, and our problem is to move a stack of n disks from peg 0 to peg 2 (as shown on page 4).

(a) [Homework] If n is even, swap pegs 1 and 2. At the ith step, make the only legal move that avoids peg i mod 3. If there is no legal move, then all disks are on peg i mod 3, and the puzzle is solved.

(b) [Homework] For the first move, move disk 1 to peg 1 if n is even and to peg 2 if n is odd. Then repeatedly make the only legal move that does not undo the previous move. If no such move exists, the puzzle is solved.

(c) [Homework] Pretend that disks n + 1, n + 2, and n + 3 are at the bottom of pegs 0, 1, and 2, respectively. Repeatedly make the only legal move that satisfies the following constraints, until no such move is possible.

• Do not place an odd disk directly on top of another odd disk.

• Do not place an even disk directly on top of another even disk.

• Do not undo the previous move.

(d) [Homework] Let ρ(n) denote the smallest integer k such that n/2^k is not an integer.


Hanoi(n):
  i ← 1
  while ρ(i) ≤ n
    if n − i is even
      move disk ρ(i) forward   〈〈0 → 1 → 2 → 0〉〉
    else
      move disk ρ(i) backward   〈〈0 → 2 → 1 → 0〉〉
    i ← i + 1

For example, ρ(42) = 2, because 42/2^1 is an integer but 42/2^2 is not. (Equivalently, ρ(n) is one more than the position of the least significant 1 in the binary representation of n.) Because its behavior resembles the marks on a ruler, ρ(n) is sometimes called the ruler function.

2. [Homework] The Tower of Hanoi is a relatively recent descendant of a much older mechanical puzzle known as the Chinese linked rings, Baguenaudier (a French word meaning “to wander about aimlessly”), Meleda, Patience, Tiring Irons, Prisoner’s Lock, Spin-Out, and many other names. This puzzle was already well known in both China and Europe by the 16th century. The Italian mathematician Luca Pacioli described the 7-ring puzzle and its solution in his unpublished treatise De Viribus Quantitatis, written between 1498 and 1506;5 only a few years later, the Ming-dynasty poet Yang Shen described the 9-ring puzzle as “a toy for women and children”. The puzzle is apocryphally attributed to a 2nd-century Chinese general, who gave the puzzle to his wife to occupy her time while he was away at war.

Figure 1.15. The 7-ring Baguenaudier, from Récréations Mathématiques by Édouard Lucas (1891)

The Baguenaudier puzzle has many physical forms, but it typically consists of a long metal loop and several rings, which are connected to a solid base by movable rods. The loop is initially threaded through the rings as shown in the figure above;

the goal of the puzzle is to remove the loop.

More abstractly, we can model the puzzle as a sequence of bits, one for each ring, where the ith bit is 1 if the loop passes through the ith ring and 0 otherwise. (Here

5 De Viribus Quantitatis [On the Powers of Numbers] is an important early work on recreational mathematics and perhaps the oldest surviving treatise on magic. Pacioli is better known for Summa de Arithmetica, a near-complete encyclopedia of late 15th-century mathematics, which included the first description of double-entry bookkeeping.


we index the rings from right to left, as shown in the figure.) The puzzle allows two legal moves:

• You can always flip the 1st (= rightmost) bit.

• If the bit string ends with exactly i 0s, you can flip the (i + 2)th bit.

The goal of the puzzle is to transform a string of n 1s into a string of n 0s. For example, the following sequence of 21 moves solves the 5-ring puzzle:

11111 →(1) 11110 →(3) 11010 →(1) 11011 →(2) 11001 →(1) 11000 →(5) 01000 →(1) 01001 →(2) 01011 →(1) 01010 →(3) 01110 →(1) 01111 →(2) 01101 →(1) 01100 →(4) 00100 →(1) 00101 →(2) 00111 →(1) 00110 →(3) 00010 →(1) 00011 →(2) 00001 →(1) 00000

(Each arrow is labeled with the bit being flipped.)

(a) Call a sequence of moves reduced if no move is the inverse of the previous move. Prove that for any non-negative integer n, there is exactly one reduced sequence of moves that solves the n-ring Baguenaudier puzzle. [Hint: This problem is much easier if you’re already familiar with graphs.]

(b) Describe an algorithm to solve the Baguenaudier puzzle. Your input is the number of rings n; your algorithm should print a reduced sequence of moves that solves the puzzle. For example, given the integer 5 as input, your algorithm should print the sequence 1, 3, 1, 2, 1, 5, 1, 2, 1, 3, 1, 2, 1, 4, 1, 2, 1, 3, 1, 2, 1.

(c) Exactly how many moves does your algorithm perform, as a function of n? Prove your answer is correct.

3. [Homework] A less familiar chapter in the Tower of Hanoi’s history is its brief relocation of the temple from Benares to Pisa in the early 13th century. The relocation was organized by the wealthy merchant-mathematician Leonardo Fibonacci, at the request of the Holy Roman Emperor Frederick II, who had heard reports of the temple from soldiers returning from the Crusades. The Towers of Pisa and their attendant monks became famous, helping to establish Pisa as a dominant trading center on the Italian peninsula.

Unfortunately, almost as soon as the temple was moved, one of the diamond needles began to lean to one side. To avoid the possibility of the leaning tower falling over from too much use, Fibonacci convinced the priests to adopt a more relaxed rule: Any number of disks on the leaning needle can be moved together to another needle in a single move. It was still forbidden to place a larger disk on top of a smaller disk, and disks had to be moved one at a time onto the leaning needle or between the two vertical needles.

Thanks to Fibonacci’s new rule, the priests could bring about the end of the universe somewhat faster from Pisa than they could from Benares.

Fortunately, the temple was moved from Pisa back to Benares after the newly crowned Pope Gregory IX excommunicated Frederick II, making the local priests less sympathetic to hosting foreign heretics with strange mathematical habits. Soon afterward, a bell tower was erected on the spot where the temple once stood; it too began to lean almost immediately.

Figure 1.16. The Towers of Pisa. In the fifth move, two disks are taken off the leaning needle.

Describe an algorithm to transfer a stack of n disks from one vertical needle to the other vertical needle, using the smallest possible number of moves. Exactly how many moves does your algorithm perform?

4. Consider the following restricted variants of the Tower of Hanoi puzzle. In each problem, the pegs are numbered 0, 1, and 2, as in problem 1, and your task is to move a stack of n disks from peg 1 to peg 2.

(a) [Exam] Suppose you are forbidden to move any disk directly between peg 1 and peg 2; every move must involve peg 0. Describe an algorithm to solve this version of the puzzle in as few moves as possible. Exactly how many moves does your algorithm make?

⋆(b) [Homework] Suppose you are only allowed to move disks from peg 0 to peg 2, from peg 2 to peg 1, or from peg 1 to peg 0. Equivalently, suppose the pegs are arranged in a circle and numbered in clockwise order, and you are only allowed to move disks counterclockwise. Describe an algorithm to solve this version of the puzzle in as few moves as possible. How many moves does your algorithm make? [Hint: See the chapter on solving recurrences in the appendix.]

⋆(c) [Fun Homework] Finally, suppose your only restriction is that you may never move a disk directly from peg 1 to peg 2. Describe an algorithm to solve this version of the puzzle in as few moves as possible. How many moves does your algorithm make? [Hint: This variant is considerably harder to analyze than the other two.]

5. A German mathematician developed a new variant of the Towers of Hanoi puzzle, known in the US literature as the “Liberty Towers” puzzle.6 In this variant, there is a row of k pegs, numbered from 1 to k. In a single turn, you are allowed to move the

6 No it isn’t.
