
The on-line hiring problem

From the document INTRODUCTION TO ALGORITHMS (pp. 160–175)

Third Edition


5.4.4 The on-line hiring problem

As a final example, we consider a variant of the hiring problem. Suppose now that we do not wish to interview all the candidates in order to find the best one. We also do not wish to hire and fire as we find better and better applicants. Instead, we are willing to settle for a candidate who is close to the best, in exchange for hiring exactly once. We must obey one company requirement: after each interview we must either immediately offer the position to the applicant or immediately reject the applicant. What is the trade-off between minimizing the amount of interviewing and maximizing the quality of the candidate hired?

We can model this problem in the following way. After meeting an applicant, we are able to give each one a score; let score(i) denote the score we give to the ith applicant, and assume that no two applicants receive the same score. After we have seen j applicants, we know which of the j has the highest score, but we do not know whether any of the remaining n − j applicants will receive a higher score. We decide to adopt the strategy of selecting a positive integer k < n, interviewing and then rejecting the first k applicants, and hiring the first applicant thereafter who has a higher score than all preceding applicants. If it turns out that the best-qualified applicant was among the first k interviewed, then we hire the nth applicant. We formalize this strategy in the procedure ON-LINE-MAXIMUM(k, n), which returns the index of the candidate we wish to hire.

ON-LINE-MAXIMUM(k, n)
1  bestscore = −∞
2  for i = 1 to k
3      if score(i) > bestscore
4          bestscore = score(i)
5  for i = k + 1 to n
6      if score(i) > bestscore
7          return i
8  return n
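The ON-LINE-MAXIMUM pseudocode above translates almost line for line into Python. The list name `scores` and the convention that score(i) is stored at `scores[i - 1]` are choices of this sketch, not part of the text:

```python
def online_maximum(scores, k):
    """Interview and reject the first k applicants, then hire the first
    applicant whose score beats every earlier one; if the best applicant
    was among the first k, we end up hiring applicant n.
    Applicants are 1-indexed: applicant i has score scores[i - 1]."""
    n = len(scores)
    bestscore = float("-inf")
    for i in range(1, k + 1):          # lines 2-4: scan the first k
        if scores[i - 1] > bestscore:
            bestscore = scores[i - 1]
    for i in range(k + 1, n + 1):      # lines 5-7: hire the first to beat them
        if scores[i - 1] > bestscore:
            return i
    return n                           # line 8: best was among the first k
```

For example, with scores (3, 1, 4, 1, 5, 9, 2, 6) and k = 2, the best of the first two applicants scores 3, and the first later applicant to beat that is applicant 3.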

We wish to determine, for each possible value of k, the probability that we hire the most qualified applicant. We then choose the best possible k, and implement the strategy with that value. For the moment, assume that k is fixed. Let M(j) = max_{1 ≤ i ≤ j} {score(i)} denote the maximum score among applicants 1 through j. Let S be the event that we succeed in choosing the best-qualified applicant, and let S_i be the event that we succeed when the best-qualified applicant is the ith one interviewed. Since the various S_i are disjoint, we have that Pr{S} = Σ_{i=1}^n Pr{S_i}. Noting that we never succeed when the best-qualified applicant is one of the first k, we have that Pr{S_i} = 0 for i = 1, 2, …, k. Thus, we obtain

Pr{S} = Σ_{i=k+1}^n Pr{S_i} .    (5.12)

We now compute Pr{S_i}. In order to succeed when the best-qualified applicant is the ith one, two things must happen. First, the best-qualified applicant must be in position i, an event which we denote by B_i. Second, the algorithm must not select any of the applicants in positions k + 1 through i − 1, which happens only if, for each j such that k + 1 ≤ j ≤ i − 1, we find that score(j) < bestscore in line 6. (Because scores are unique, we can ignore the possibility of score(j) = bestscore.) In other words, all of the values score(k + 1) through score(i − 1) must be less than M(k); if any are greater than M(k), we instead return the index of the first one that is greater. We use O_i to denote the event that none of the applicants in positions k + 1 through i − 1 are chosen. Fortunately, the two events B_i and O_i are independent. The event O_i depends only on the relative ordering of the values in positions 1 through i − 1, whereas B_i depends only on whether the value in position i is greater than the values in all other positions. The ordering of the values in positions 1 through i − 1 does not affect whether the value in position i is greater than all of them, and the value in position i does not affect the ordering of the values in positions 1 through i − 1. Thus we can apply equation (C.15) to obtain

Pr{S_i} = Pr{B_i ∩ O_i} = Pr{B_i} Pr{O_i} .

The probability Pr{B_i} is clearly 1/n, since the maximum is equally likely to be in any one of the n positions. For event O_i to occur, the maximum value in positions 1 through i − 1, which is equally likely to be in any of these i − 1 positions, must be in one of the first k positions. Consequently, Pr{O_i} = k/(i − 1) and

Pr{S_i} = k / (n(i − 1)) .

Using equation (5.12), we have

Pr{S} = Σ_{i=k+1}^n k/(n(i − 1)) = (k/n) Σ_{i=k+1}^n 1/(i − 1) = (k/n) Σ_{i=k}^{n−1} 1/i .

We approximate this summation by integrals to bound it from above and below. By the inequalities (A.12), we have

(k/n) ∫_k^n (1/x) dx ≤ Pr{S} ≤ (k/n) ∫_{k−1}^{n−1} (1/x) dx .

Evaluating these definite integrals gives us the bounds

(k/n)(ln n − ln k) ≤ Pr{S} ≤ (k/n)(ln(n − 1) − ln(k − 1)) ,

which provide a rather tight bound for Pr{S}. Because we wish to maximize our probability of success, let us focus on choosing the value of k that maximizes the lower bound on Pr{S}. (Besides, the lower-bound expression is easier to maximize than the upper-bound expression.) Differentiating the expression (k/n)(ln n − ln k) with respect to k, we obtain

(1/n)(ln n − ln k − 1) .

Setting this derivative equal to 0, we see that we maximize the lower bound on the probability when ln k = ln n − 1 = ln(n/e) or, equivalently, when k = n/e. Thus, if we implement our strategy with k = n/e, we succeed in hiring our best-qualified applicant with probability at least 1/e.
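The 1/e conclusion can be checked empirically. The sketch below simulates the strategy with k rounded from n/e; the function name, score model, and trial count are choices of this illustration, not the text's:

```python
import math
import random

def simulate(n, k, trials=20000):
    """Empirically estimate Pr{S}: the probability that rejecting the
    first k of n applicants, then hiring the first one who beats them
    all, ends up hiring the single best applicant."""
    wins = 0
    for _ in range(trials):
        scores = random.sample(range(10 * n), n)   # n distinct scores
        best_of_first_k = max(scores[:k])
        hired = n - 1                              # default: last applicant
        for i in range(k, n):
            if scores[i] > best_of_first_k:
                hired = i
                break
        if scores[hired] == max(scores):
            wins += 1
    return wins / trials

n = 100
k = round(n / math.e)        # k = 37 for n = 100
# simulate(n, k) is typically close to 1/e ≈ 0.368
```

The estimate should land near the analytic lower bound (k/n)(ln n − ln k) ≈ 0.368 for n = 100.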

Exercises

5.4-1

How many people must there be in a room before the probability that someone has the same birthday as you do is at least 1/2? How many people must there be before the probability that at least two people have a birthday on July 4 is greater than 1/2?

5.4-2

Suppose that we toss balls into b bins until some bin contains two balls. Each toss is independent, and each ball is equally likely to end up in any bin. What is the expected number of ball tosses?

5.4-3 ?

For the analysis of the birthday paradox, is it important that the birthdays be mutually independent, or is pairwise independence sufficient? Justify your answer.

5.4-4 ?

How many people should be invited to a party in order to make it likely that there are three people with the same birthday?

5.4-5 ?

What is the probability that a k-string over a set of size n forms a k-permutation?

How does this question relate to the birthday paradox?

5.4-6 ?

Suppose that n balls are tossed into n bins, where each toss is independent and the ball is equally likely to end up in any bin. What is the expected number of empty bins? What is the expected number of bins with exactly one ball?

5.4-7 ?

Sharpen the lower bound on streak length by showing that in n flips of a fair coin, the probability is less than 1/n that no streak longer than lg n − 2 lg lg n consecutive heads occurs.

Problems

5-1 Probabilistic counting

With a b-bit counter, we can ordinarily only count up to 2^b − 1. With R. Morris’s probabilistic counting, we can count up to a much larger value at the expense of some loss of precision.

We let a counter value of i represent a count of n_i for i = 0, 1, …, 2^b − 1, where the n_i form an increasing sequence of nonnegative values. We assume that the initial value of the counter is 0, representing a count of n_0 = 0. The INCREMENT operation works on a counter containing the value i in a probabilistic manner. If i = 2^b − 1, then the operation reports an overflow error. Otherwise, the INCREMENT operation increases the counter by 1 with probability 1/(n_{i+1} − n_i), and it leaves the counter unchanged with probability 1 − 1/(n_{i+1} − n_i).

If we select n_i = i for all i ≥ 0, then the counter is an ordinary one. More interesting situations arise if we select, say, n_i = 2^{i−1} for i > 0 or n_i = F_i (the ith Fibonacci number; see Section 3.2).

For this problem, assume that n_{2^b − 1} is large enough that the probability of an overflow error is negligible.

a. Show that the expected value represented by the counter after n INCREMENT operations have been performed is exactly n.

b. The analysis of the variance of the count represented by the counter depends on the sequence of the n_i. Let us consider a simple case: n_i = 100i for all i ≥ 0. Estimate the variance in the value represented by the register after n INCREMENT operations have been performed.
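A small simulation of the probabilistic INCREMENT may help build intuition for parts (a) and (b). The function names below are invented for this sketch, and the default sequence n_i = 100i is the one from part (b):

```python
import random

def probabilistic_increment(counter, n):
    """One probabilistic INCREMENT on a counter whose value i represents
    a count of n(i): bump the counter with probability
    1 / (n(i+1) - n(i)), otherwise leave it unchanged."""
    if random.random() < 1.0 / (n(counter + 1) - n(counter)):
        return counter + 1
    return counter

def represented_count(events, n=lambda i: 100 * i):
    """Apply `events` probabilistic INCREMENTs (n_i = 100*i by default)
    and return the count that the final counter value represents."""
    c = 0
    for _ in range(events):
        c = probabilistic_increment(c, n)
    return n(c)
```

Averaged over many runs, `represented_count(1000)` should hover around 1000, consistent with part (a), while individual runs scatter with the variance asked about in part (b).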

5-2 Searching an unsorted array

This problem examines three algorithms for searching for a value x in an unsorted array A consisting of n elements.

Consider the following randomized strategy: pick a random index i into A. If A[i] = x, then we terminate; otherwise, we continue the search by picking a new random index into A. We continue picking random indices into A until we find an index j such that A[j] = x or until we have checked every element of A. Note that we pick from the whole set of indices each time, so that we may examine a given element more than once.
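Part (a) below asks for pseudocode; as a non-authoritative sketch of the same strategy (the function name and the choice to return `None` on failure are assumptions of this illustration), one Python rendering is:

```python
import random

def random_search(A, x):
    """Repeatedly pick a uniformly random index into A until A[j] == x
    or every index has been picked at least once. Picks may repeat;
    the `picked` set only guarantees termination."""
    n = len(A)
    picked = set()
    while len(picked) < n:
        j = random.randrange(n)
        picked.add(j)
        if A[j] == x:
            return j
    return None          # every index was picked and x was never found
```

Tracking the set of picked indices is what makes the procedure terminate once all indices have been seen, as part (a) requires.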

a. Write pseudocode for a procedure RANDOM-SEARCH to implement the strategy above. Be sure that your algorithm terminates when all indices into A have been picked.

b. Suppose that there is exactly one index i such that A[i] = x. What is the expected number of indices into A that we must pick before we find x and RANDOM-SEARCH terminates?

c. Generalizing your solution to part (b), suppose that there are k ≥ 1 indices i such that A[i] = x. What is the expected number of indices into A that we must pick before we find x and RANDOM-SEARCH terminates? Your answer should be a function of n and k.

d. Suppose that there are no indices i such that A[i] = x. What is the expected number of indices into A that we must pick before we have checked all elements of A and RANDOM-SEARCH terminates?

Now consider a deterministic linear search algorithm, which we refer to as DETERMINISTIC-SEARCH. Specifically, the algorithm searches A for x in order, considering A[1], A[2], A[3], …, A[n] until either it finds A[i] = x or it reaches the end of the array. Assume that all possible permutations of the input array are equally likely.

e. Suppose that there is exactly one index i such that A[i] = x. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH?

f. Generalizing your solution to part (e), suppose that there are k ≥ 1 indices i such that A[i] = x. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH? Your answer should be a function of n and k.

g. Suppose that there are no indices i such that A[i] = x. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH?

Finally, consider a randomized algorithm SCRAMBLE-SEARCH that works by first randomly permuting the input array and then running the deterministic linear search given above on the resulting permuted array.
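SCRAMBLE-SEARCH can be sketched as follows. The text gives no implementation; returning a boolean rather than an index is a choice of this sketch, since positions in the shuffled copy no longer correspond to positions in A:

```python
import random

def scramble_search(A, x):
    """Randomly permute a copy of A, then run a deterministic linear
    scan over the permuted copy looking for x."""
    B = list(A)
    random.shuffle(B)          # the random permutation step
    for v in B:                # the deterministic linear search step
        if v == x:
            return True
    return False
```

Permuting up front buys the same expected behavior as RANDOM-SEARCH's repeated random picks, but each element is then examined at most once.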

h. Letting k be the number of indices i such that A[i] = x, give the worst-case and expected running times of SCRAMBLE-SEARCH for the cases in which k = 0 and k = 1. Generalize your solution to handle the case in which k ≥ 1.

i. Which of the three searching algorithms would you use? Explain your answer.

Chapter notes

Bollobás [53], Hofri [174], and Spencer [321] contain a wealth of advanced probabilistic techniques. The advantages of randomized algorithms are discussed and surveyed by Karp [200] and Rabin [288]. The textbook by Motwani and Raghavan [262] gives an extensive treatment of randomized algorithms.

Several variants of the hiring problem have been widely studied. These problems are more commonly referred to as “secretary problems.” An example of work in this area is the paper by Ajtai, Megiddo, and Waarts [11].

This part presents several algorithms that solve the following sorting problem:

Input: A sequence of n numbers ⟨a_1, a_2, …, a_n⟩.

Output: A permutation (reordering) ⟨a′_1, a′_2, …, a′_n⟩ of the input sequence such that a′_1 ≤ a′_2 ≤ ⋯ ≤ a′_n.

The input sequence is usually an n-element array, although it may be represented in some other fashion, such as a linked list.

The structure of the data

In practice, the numbers to be sorted are rarely isolated values. Each is usually part of a collection of data called a record. Each record contains a key, which is the value to be sorted. The remainder of the record consists of satellite data, which are usually carried around with the key. In practice, when a sorting algorithm permutes the keys, it must permute the satellite data as well. If each record includes a large amount of satellite data, we often permute an array of pointers to the records rather than the records themselves in order to minimize data movement.
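In a language with reference semantics, the pointer-permutation idea above is nearly free. A small Python illustration (the record layout here is invented for the example):

```python
# Each record pairs a key with its satellite data. A Python list holds
# references, so sorting permutes references to the records rather than
# copying the (possibly large) satellite payloads themselves.
records = [(5, "payload-a"), (2, "payload-b"), (9, "payload-c")]
records.sort(key=lambda rec: rec[0])   # order by the key field only
```

After the sort, each key still travels with its own satellite data, and no payload was moved in memory.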

In a sense, it is these implementation details that distinguish an algorithm from a full-blown program. A sorting algorithm describes the method by which we determine the sorted order, regardless of whether we are sorting individual numbers or large records containing many bytes of satellite data. Thus, when focusing on the problem of sorting, we typically assume that the input consists only of numbers.

Translating an algorithm for sorting numbers into a program for sorting records is conceptually straightforward, although in a given engineering situation other subtleties may make the actual programming task a challenge.

Why sorting?

Many computer scientists consider sorting to be the most fundamental problem in the study of algorithms. There are several reasons:

• Sometimes an application inherently needs to sort information. For example, in order to prepare customer statements, banks need to sort checks by check number.

• Algorithms often use sorting as a key subroutine. For example, a program that renders graphical objects which are layered on top of each other might have to sort the objects according to an “above” relation so that it can draw these objects from bottom to top. We shall see numerous algorithms in this text that use sorting as a subroutine.

• We can draw from among a wide variety of sorting algorithms, and they employ a rich set of techniques. In fact, many important techniques used throughout algorithm design appear in the body of sorting algorithms that have been developed over the years. In this way, sorting is also a problem of historical interest.

• We can prove a nontrivial lower bound for sorting (as we shall do in Chapter 8). Our best upper bounds match the lower bound asymptotically, and so we know that our sorting algorithms are asymptotically optimal. Moreover, we can use the lower bound for sorting to prove lower bounds for certain other problems.

• Many engineering issues come to the fore when implementing sorting algorithms. The fastest sorting program for a particular situation may depend on many factors, such as prior knowledge about the keys and satellite data, the memory hierarchy (caches and virtual memory) of the host computer, and the software environment. Many of these issues are best dealt with at the algorithmic level, rather than by “tweaking” the code.

Sorting algorithms

We introduced two algorithms that sort n real numbers in Chapter 2. Insertion sort takes Θ(n²) time in the worst case. Because its inner loops are tight, however, it is a fast in-place sorting algorithm for small input sizes. (Recall that a sorting algorithm sorts in place if only a constant number of elements of the input array are ever stored outside the array.) Merge sort has a better asymptotic running time, Θ(n lg n), but the MERGE procedure it uses does not operate in place.

In this part, we shall introduce two more algorithms that sort arbitrary real numbers. Heapsort, presented in Chapter 6, sorts n numbers in place in O(n lg n) time. It uses an important data structure, called a heap, with which we can also implement a priority queue.

Quicksort, in Chapter 7, also sorts n numbers in place, but its worst-case running time is Θ(n²). Its expected running time is Θ(n lg n), however, and it generally outperforms heapsort in practice. Like insertion sort, quicksort has tight code, and so the hidden constant factor in its running time is small. It is a popular algorithm for sorting large input arrays.

Insertion sort, merge sort, heapsort, and quicksort are all comparison sorts: they determine the sorted order of an input array by comparing elements. Chapter 8 begins by introducing the decision-tree model in order to study the performance limitations of comparison sorts. Using this model, we prove a lower bound of Ω(n lg n) on the worst-case running time of any comparison sort on n inputs, thus showing that heapsort and merge sort are asymptotically optimal comparison sorts.

Chapter 8 then goes on to show that we can beat this lower bound of Ω(n lg n) if we can gather information about the sorted order of the input by means other than comparing elements. The counting sort algorithm, for example, assumes that the input numbers are in the set {0, 1, …, k}. By using array indexing as a tool for determining relative order, counting sort can sort n numbers in Θ(k + n) time. Thus, when k = O(n), counting sort runs in time that is linear in the size of the input array. A related algorithm, radix sort, can be used to extend the range of counting sort. If there are n integers to sort, each integer has d digits, and each digit can take on up to k possible values, then radix sort can sort the numbers in Θ(d(n + k)) time. When d is a constant and k is O(n), radix sort runs in linear time. A third algorithm, bucket sort, requires knowledge of the probabilistic distribution of numbers in the input array. It can sort n real numbers uniformly distributed in the half-open interval [0, 1) in average-case O(n) time.
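The array-indexing idea behind counting sort can be sketched as follows. This tally variant is an assumption of the sketch: it shows the Θ(k + n) structure but drops the stability that the full prefix-sum formulation in Chapter 8 provides:

```python
def counting_sort(A, k):
    """Sort a list of integers drawn from {0, 1, ..., k} in Theta(k + n)
    time: tally how many times each value occurs, then emit each value
    that many times. (Not stable; satellite data would need the
    prefix-sum version.)"""
    count = [0] * (k + 1)
    for v in A:                        # Theta(n): tally occurrences
        count[v] += 1
    out = []
    for value, c in enumerate(count):  # Theta(k + n): emit in order
        out.extend([value] * c)
    return out
```

No element is ever compared with another; the value itself serves as an index into the `count` array, which is how the comparison lower bound is sidestepped.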

The following table summarizes the running times of the sorting algorithms from Chapters 2 and 6–8. As usual, n denotes the number of items to sort. For counting sort, the items to sort are integers in the set {0, 1, …, k}. For radix sort, each item is a d-digit number, where each digit takes on k possible values. For bucket sort, we assume that the keys are real numbers uniformly distributed in the half-open interval [0, 1). The rightmost column gives the average-case or expected running time, indicating which it gives when it differs from the worst-case running time.

We omit the average-case running time of heapsort because we do not analyze it in this book.

Algorithm        Worst-case running time   Average-case/expected running time
Insertion sort   Θ(n²)                     Θ(n²)
Merge sort       Θ(n lg n)                 Θ(n lg n)
Heapsort         O(n lg n)                 —
Quicksort        Θ(n²)                     Θ(n lg n) (expected)
Counting sort    Θ(k + n)                  Θ(k + n)
Radix sort       Θ(d(n + k))               Θ(d(n + k))
Bucket sort      Θ(n²)                     Θ(n) (average-case)

Order statistics

The ith order statistic of a set of n numbers is the ith smallest number in the set.

We can, of course, select the ith order statistic by sorting the input and indexing the ith element of the output. With no assumptions about the input distribution, this method runs in Ω(n lg n) time, as the lower bound proved in Chapter 8 shows.

In Chapter 9, we show that we can find the ith smallest element in O(n) time, even when the elements are arbitrary real numbers. We present a randomized algorithm with tight pseudocode that runs in Θ(n²) time in the worst case, but whose expected running time is O(n). We also give a more complicated algorithm that runs in O(n) worst-case time.

Background

Although most of this part does not rely on difficult mathematics, some sections do require mathematical sophistication. In particular, analyses of quicksort, bucket sort, and the order-statistic algorithm use probability, which is reviewed in Appendix C, and the material on probabilistic analysis and randomized algorithms in Chapter 5. The analysis of the worst-case linear-time algorithm for order statistics involves somewhat more sophisticated mathematics than the other worst-case analyses in this part.

In this chapter, we introduce another sorting algorithm: heapsort. Like merge sort,

