The Secretary Problem: Two-Player Extensions and Going Back


Academic year: 2022


陳伯恩

國立科學工業園區實驗高級中學國小部

Abstract

The Secretary Problem is a thoroughly-studied optimal-stopping problem in which a person must try to select the best applicant from a given number of them. The problem is that he must interview them, must decide whether to accept the applicant just after the interview, and cannot return to previous applicants. The 1/e optimal strategy for this is well known, and many extensions of it have been studied.

Here we will study some more generalizations:

Firstly, a fairly simple generalization is presented that allows the person to return to previous applicants, with a fixed probability of success. Through this problem we will return to the classic problem and demonstrate the "sum the odds to one and stop" rule, and from that the 1/e optimal strategy.

Next, the main course is two generalizations of the problem involving two players each. In the first one, one player is more powerful than the other, and when both players want a certain secretary the first player will receive her. We will find the optimal strategy and probability of success for this, then go on to a scenario where the weaker player can choose between helping the stronger player and receiving the privilege of a coin flip, or not helping and remaining weak. On the way, we will derive a formula for determining the probability of a set number of candidates (secretaries who are better than previous ones) occurring in a number of applicants.

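Abstract (translated from the Chinese): The secretary problem is a widely discussed problem about how to find the best secretary among n applicants. The interviewer must meet each applicant in person, one at a time, decide immediately after each interview whether to hire, and cannot change his mind or go back to hire someone interviewed earlier. The optimal strategy succeeds with probability 1/e, and works as follows: take the best of the first n/e applicants as a benchmark, and hire the first later applicant who exceeds it. This paper generalizes the problem.

First, suppose we allow the interviewer to go back and choose an earlier interviewee, with a certain probability of success; we find that the original solution, the 1/e rule, is still optimal.

Two further cases follow. In the first, there are two interviewers, one of whom is more powerful; when both want to hire the same applicant, only the more powerful one gets that secretary. We find that the strategy and success probability in this case are the same as in the original secretary problem. In addition, the weaker player has two ways to respond: first, to help the stronger player and obtain the chance to contest ownership through a coin flip; second, to stay in the weak position and do nothing. Along the way, we derive a formula for the probability that a given number of candidates (applicants better than those before them) occurs among a given number of interviewees.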

1 Introduction

The Secretary Problem is an optimal-stopping problem where we are asked to determine the optimal strategy for a person looking for the best secretary, the classic form of which is roughly as follows:

A person is looking for a secretary. He knows that there are exactly n applicants, from which he must choose precisely one secretary. He chooses by interviewing the applicants one by one. After each interview he can rank the applicant, but only relative to the other applicants. He cannot think of the applicant as better or worse than average or expected, because there is no such thing as an average or expected value. Immediately after he has finished interviewing an applicant, he must decide to accept or reject the applicant. He can accept only one applicant out of the n applicants, so once he accepts his game is over and he wins iff his applicant is the best applicant out of the n applicants. Once he has rejected an applicant he cannot return to accept him again, if he ever wishes to do so. (This restriction is partially removed in our first generalization.) The goal is to maximize the probability of getting the best applicant out of the n applicants.

This problem has already been thoroughly studied, along with many extensions and generalizations. In this paper we will be studying some two-player extensions. Various two-player extensions have already been studied, and the author thinks the simple two-player subordinate/dominant problem has probably already been studied, but is not yet sure because he cannot find any of the articles, although there are many similar ones. See [1] for an example of a two-player extension and a more complete discussion of previous works on two-player secretary problems. See [2] for a general discussion of many types of extensions of the secretary problem.

2 Preliminaries

First, some definitions. An applicant is one of the secretaries to be interviewed. We generally denote the number of applicants (which is already known) by n. A candidate is an applicant that is better than all the applicants before him. There may and probably will be multiple candidates; if all the applicants are ordered from worst to best, all the applicants will be candidates. The first applicant is always a candidate. Furthermore, the best applicant (which we will, for simplicity, denote #1) is always the last candidate. This is easy to see. No candidates can come after #1, because, by definition, nobody can be better than #1. And #1 is always a candidate because he is better than all the people before him; in fact, better than all other applicants.

All the optimal strategies involved in this problem will be of this form: let the person reject the first d applicants, and after that, accept the first candidate. We will call the time after the dth person has been rejected and before the next person is interviewed the rejection point. As a number, the rejection point is the position of the last person to be unconditionally rejected. More complex strategies, for example requiring the player’s randomization, or analyzing an opponent’s possibly suboptimal strategy, will not be considered.

[Figure: example ordering of nine applicants, numbered 1 to 9 and drawn as rectangles of varying height.]

As examples to demonstrate our results and logic, we will use diagrams like this one to denote the ordering of the applicants, where each applicant is a rectangle and their quality is indicated by their height. The candidates have their numbers in bold; note that they form a strictly increasing sequence. #1 has a big number. The bold line is the rejection point.

The optimal strategy for the classic problem is as follows. Reject the first d applicants, and accept the first applicant who is better than all of the first d applicants. d is a function of n. As an example, here is the probability $P_n(d)$ of succeeding for every d with n = 10.

 d    P_n(d)    Out of 25200
 0    10%        2520
 1    28.29%     7129
 2    36.58%     9218
 3    39.87%    10047
 4    39.83%    10036
 5    37.28%     9395
 6    32.74%     8250
 7    26.53%     6685
 8    18.89%     4760
 9    10%        2520
10     0%           0

Obviously the optimal strategy in this case is to reject the first 3 applicants. To show how we found these values, we first derive a simple formula for the probability of obtaining the best applicant. Let us consider the probabilities. First, if #1 is among the first d applicants to be interviewed, he will be rejected no matter what, and the employer loses. If #1 is the (d+1)th applicant, then the employer will always accept him, because he must be better than the first d applicants. More generally, if #1 is the (d+x)th applicant, then we will accept him only if there are no candidates in the previous (x−1) applicants. In other words, we get #1 if and only if the best applicant among all the previous people was unconditionally rejected, because he would be the last candidate until #1.

As a result, when #1 is the (d+x)th applicant, we get him if the best of the applicants before him was among the first d applicants, which occurs with probability $\frac{d}{d+x-1}$. Since #1 is equally likely to be in each of the n positions, the total probability is therefore

$$P_n(d) = \frac{1}{n}\left(\frac{d}{d} + \frac{d}{d+1} + \cdots + \frac{d}{n-1}\right). \tag{1}$$

A good solution and introduction can be found online at http://www.math.uah.edu/stat/urn/Secretary.xhtml

Also, this almost-obvious result about the natural log (actually, any log) will be needed:

$$\lim_{n\to\infty}\bigl(\ln n - \ln(n-1)\bigr) = \lim_{n\to\infty}\ln\frac{n}{n-1} = \ln 1 = 0.$$
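As a quick check of formula (1), the following Python sketch recomputes the n = 10 table with exact fractions. The function name `classic_success` and the special case for d = 0 (accept the very first applicant) are illustrative choices made here, not part of the original text.

```python
from fractions import Fraction


def classic_success(n: int, d: int) -> Fraction:
    """Exact success probability when the first d of n applicants are rejected."""
    if d == 0:
        # Nobody is rejected, so the first applicant is accepted;
        # he is #1 with probability 1/n.
        return Fraction(1, n)
    # Formula (1): (1/n) * (d/d + d/(d+1) + ... + d/(n-1)).
    return Fraction(d, n) * sum(Fraction(1, k) for k in range(d, n))


n = 10
table = {d: classic_success(n, d) for d in range(n + 1)}
best_d = max(table, key=table.get)

for d, p in table.items():
    # p * 25200 reproduces the "Out of 25200" column of the table.
    print(f"d={d:2d}  P={float(p):.4f}  out of 25200: {p * 25200}")
print("best rejection point:", best_d)
```

Working with `Fraction` rather than floats makes the integer column of the table directly comparable, and it confirms that d = 3 is the best rejection point for n = 10.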

At last, we can begin our first (easy) generalization:

3 Going Back to Rejected Candidates

In the original problem, the interviewer was not allowed to return to previous applicants; once he rejected an applicant, the applicant had no hope of getting chosen. Let us consider a less harsh (and possibly more realistic) version, where the employer can go back to a rejected person but cannot guarantee that he will accept this time; he accepts with a fixed probability.

This generalization is not hard compared to the two-player problems, so we use it as a warm-up.

We assume that every person has a fixed probability $P_A \ge 0$ of accepting when offered the job a second time, and that any person who rejected the second offer cannot be asked again. In this case we have a little lemma.

Lemma 1. The only time the interviewer should go back is after all the applicants have been interviewed.

Before we prove it, here are some examples to make the lemma clear.

Suppose the interviewer has interviewed the first 5 applicants already and finds that the 3rd applicant is the best so far, and wishes to go back to ask him to accept the job. There are basically two scenarios:

[Figure: seven applicants; the 3rd applicant is #1, and no later applicant is better.]

In the scenario above, the interviewer would gain nothing (but lose nothing either) if he interviews to the end. He still would want to go back to the 3rd applicant, and would succeed with the same probability.

[Figure: seven applicants; the 3rd applicant is the best of the first five, and the 6th applicant is #1.]

In the scenario above, the interviewer would meet the 6th applicant and accept him. Using the original strategy he would get to meet the 6th applicant only if he was rejected by the 3rd applicant. Whenever $P_A > 0$, there is some chance he would get accepted by the 3rd applicant and fail in this case. So, at least in this case, interviewing all the applicants is better than interviewing only part of them before going back.

Note, however, that there is still one other case we did not treat. This is when two or more candidates have not yet been interviewed.

Here, if the interviewer interviewed until the end without accepting, he would win; but if he interviewed until the first candidate found, he would lose. Still, if he had decided to go back to the 3rd applicant he would lose no matter what as well.

[Figure: seven applicants, with two candidates still to come after the first five.]

Of course, we still must prove the lemma.

Proof. Consider when the interviewer decides to go back after interviewing k < n applicants. There are two different possibilities:

1. #1 is in those rejected k applicants.

2. #1 has not yet been interviewed.

Arbitrarily we can let the probability of case 1 be P1. What is the probability of success? In case 1, the interviewer wins if #1 accepts the second offer of the job, with probability PA. In case 2, we cannot deduce the probability of winning, so let us just assume it is P2,win.

As a result, the total probability of winning is $P_1 \cdot P_A + (1-P_1)\cdot P_{2,\mathrm{win}}$, or

$$P_A + (1-P_1)\cdot(P_{2,\mathrm{win}} - P_A). \tag{2}$$

Compare this to the result when the interviewer goes back after interviewing all the applicants. In this case $P_1 = 1$, so the probability of winning is simply $P_A$. Also consider the strategy in which the interviewer just keeps interviewing. When $P_{2,\mathrm{win}} < P_A$, going back after all the applicants is better; when $P_{2,\mathrm{win}} > P_A$, not going back at all is better; when $P_{2,\mathrm{win}} = P_A$, all three strategies work equally well. All this is obvious from plugging the values into (2). Thus we discover that there is always one strategy that works as well as or better than going back before interviewing all the applicants.

With this lemma, we can easily discover the optimal strategy. Let the rejection point be d. Then, in the original problem, we have three possible outcomes:

1. The interviewer rejects #1 and thus reaches the end of the applicants without selecting anybody. This happens with probability $\frac{d}{n}$.

2. The interviewer successfully selects #1. The probability is given by (1).

3. The interviewer selects the wrong applicant. The exact probability of this can be easily computed by taking the probabilities of the above two cases from 1, but we will not need it.

In which of these outcomes will the interviewer succeed? In case 1, the interviewer will decide to go back; he already knows who the best applicant is, so he succeeds in case 1 with probability PA. In case 2, the interviewer has already succeeded. The total probability is thus

$$P_G(d) = \frac{d}{n}\cdot P_A + \frac{1}{n}\left(\frac{d}{d} + \frac{d}{d+1} + \frac{d}{d+2} + \cdots + \frac{d}{n-1}\right). \tag{3}$$

Just by looking at (3), it isn’t easy to figure out what the optimal value for d is. We can do so, however, by finding the difference between successive terms of $P_G$ (that is, the difference in the probability when we add 1 to d):

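$$\begin{aligned}
\Delta P_G(d) &= P_G(d+1) - P_G(d) \\
&= \frac{d+1}{n}P_A - \frac{d}{n}P_A + \frac{1}{n}\left(\frac{d+1}{d+1} + \frac{d+1}{d+2} + \cdots + \frac{d+1}{n-1}\right) - \frac{1}{n}\left(\frac{d}{d} + \frac{d}{d+1} + \frac{d}{d+2} + \cdots + \frac{d}{n-1}\right) \\
&= \frac{P_A}{n} + \frac{1}{n}\left(-\frac{d}{d} + \left(\frac{d+1}{d+1} - \frac{d}{d+1}\right) + \cdots + \left(\frac{d+1}{n-1} - \frac{d}{n-1}\right)\right) \\
&= \frac{P_A}{n} + \frac{1}{n}\left(\frac{1}{d+1} + \frac{1}{d+2} + \cdots + \frac{1}{n-1} - 1\right)
\end{aligned} \tag{4}$$

$$P_A = n\,\Delta P_G(d) + \left(1 - \frac{1}{d+1} - \frac{1}{d+2} - \cdots - \frac{1}{n-1}\right) \tag{5}$$

Clearly, $P_G(d)$ has a local maximum when $\Delta P_G(d)$ changes from positive to negative. Since $\Delta P_G(d)$ is strictly decreasing (when d increases by 1, the term $\frac{1}{d+1}$ disappears and the overall value decreases), the single place where the sign change occurs is the absolute maximum.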

Though we now can use (5) to find what values for d maximize the probability of success, it is still cumbersome. For each value of d, we can determine what value of $P_A$ will let $\Delta P_G(d)$ be zero. That point is when the strategies to reject d and d+1 applicants are equally likely to succeed. When $P_A$ is greater, rejecting d+1 applicants becomes better; when less, d is better. We construct a table of these values for n = 10. The Critical $P_A$ in row d is the value at which rejecting d−1 and d applicants are equally good, namely $1 - \frac{1}{d} - \frac{1}{d+1} - \cdots - \frac{1}{n-1}$.

 d    P_n(d)    Out of 25200    Critical P_A    Out of 2520
 0    10%        2520           (Negative)
 1    28.29%     7129           (Negative)
 2    36.58%     9218           (Negative)
 3    39.87%    10047           (Negative)
 4    39.83%    10036            0.44%            11
 5    37.28%     9395           25.44%           641
 6    32.74%     8250           45.44%          1145
 7    26.53%     6685           62.10%          1565
 8    18.89%     4760           76.39%          1925
 9    10%        2520           88.89%          2240
10    0%/10%        0           100%            2520

So, for arbitrary $P_A$ in the case of n = 10, we can use the table as follows: among the rows whose Critical $P_A$ is less than the given $P_A$, take the one with the greatest Critical $P_A$; that row contains the optimal d. For example, if the probability of getting accepted the second time is 50%, we would use d = 6, because its Critical $P_A$ is 45.44%, which is the greatest value that is still less than 50%.
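The table lookup can be cross-checked by evaluating formula (3) directly. The sketch below is only an illustration: the helper names `tail`, `go_back_success` and `critical_p_a` are invented here, and d = 0 is treated as "accept the first applicant", which is an assumption about how the degenerate d/d term should be read.

```python
from fractions import Fraction


def tail(n, d):
    """1/d + 1/(d+1) + ... + 1/(n-1) for d >= 1 (an empty sum when d == n)."""
    return sum((Fraction(1, k) for k in range(d, n)), Fraction(0))


def go_back_success(n, d, p_a):
    """Formula (3): success probability with rejection point d and second-offer chance p_a."""
    if d == 0:
        # Assumption: with d = 0 the first applicant is accepted at once,
        # so there is never an occasion to go back.
        return Fraction(1, n)
    return Fraction(d, n) * (p_a + tail(n, d))


def critical_p_a(n, d):
    """P_A at which rejecting d-1 and d applicants are equally good (table row d)."""
    return 1 - tail(n, d)


n, p_a = 10, Fraction(1, 2)
best_d = max(range(n + 1), key=lambda d: go_back_success(n, d, p_a))
print("optimal d for P_A = 1/2:", best_d)
print("critical P_A in that row:", float(critical_p_a(n, best_d)))
```

For $P_A$ = 50% the direct maximization agrees with the table lookup, and setting `p_a` to 0 recovers the classic rejection point d = 3.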

Now, we consider what occurs when n tends to infinity. We can then approximate the expression

$$\frac{1}{d+1} + \frac{1}{d+2} + \cdots + \frac{1}{n-1}$$

with an integral, because as n goes to infinity, the number of terms goes to infinity, and this series is a Riemann sum of the integral

$$\int_{d}^{n-1}\frac{1}{p}\,dp = \ln(n-1) - \ln d \approx \ln n - \ln d = \ln\frac{n}{d}.$$

Remember we can replace ln(n−1) by ln n because n is very large, and we proved that when n goes to infinity, the difference goes to zero.

So, plugging back into (4), we get

$$n\,\Delta P_G = P_A + \ln\frac{n}{d} - 1. \tag{6}$$

As n increases, the changes in $\Delta P_G$ when we vary d get smaller and smaller. So the maximum of $P_G$ when n is large should be where $\Delta P_G \approx 0$.

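Solving the previous equation will enable us to find the optimal strategy.

$$\begin{aligned}
0 &= P_A + \ln n - \ln d - 1 \\
1 - P_A &= \ln n - \ln d \\
e^{1-P_A} &= e^{\ln n - \ln d} \\
e^{1-P_A} &= \frac{n}{d} \\
d &= \frac{n}{e^{1-P_A}}.
\end{aligned}$$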

Note that the classic problem occurs when PA =0, and then d = n/e, the correct result.

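Now we find the actual probability of success, by plugging into (3):

$$\begin{aligned}
P_G(d) &= \frac{d}{n}\cdot P_A + \frac{1}{n}\left(\frac{d}{d} + \frac{d}{d+1} + \frac{d}{d+2} + \cdots + \frac{d}{n-1}\right) \\
&\approx \frac{d}{n}\cdot P_A + \frac{d}{n}\int_{d}^{n}\frac{1}{p}\,dp \\
&= \frac{d}{n}\left(P_A + \int_{d}^{n}\frac{1}{p}\,dp\right) \\
&= \frac{d}{n}\left(P_A + \ln n - \ln d\right) \\
&= \frac{d}{n}\left(P_A + \ln\frac{n}{d}\right) \\
&= \frac{1}{e^{1-P_A}}\left(P_A + \ln e^{1-P_A}\right) \\
&= \frac{1}{e^{1-P_A}}\left(P_A + 1 - P_A\right) \\
&= \frac{1}{e^{1-P_A}}.
\end{aligned}$$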

Note that the probability is the same as the rejection point, out of n.
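The large-n result can be compared with a direct scan of formula (3) for a moderately large n. This is a rough numerical sketch rather than a proof; n = 2000 and the sample values of $P_A$ are arbitrary choices, and `best_rejection_point` is a name introduced here for illustration.

```python
import math


def best_rejection_point(n, p_a):
    """Scan every d >= 1 and return (d, P_G(d)) maximising formula (3)."""
    tail = 0.0                   # running value of 1/d + ... + 1/(n-1)
    best_d, best_p = n, p_a      # d = n: reject everyone, then go back to #1
    for d in range(n - 1, 0, -1):
        tail += 1.0 / d
        p = d / n * (p_a + tail)
        if p > best_p:
            best_d, best_p = d, p
    return best_d, best_p


n = 2000
results = {}
for p_a in (0.0, 0.25, 0.5):
    d, p = best_rejection_point(n, p_a)
    results[p_a] = (d, p)
    # Columns: P_A, optimal d/n, success probability, predicted e^(P_A - 1).
    print(p_a, d / n, p, math.exp(p_a - 1))
```

In each row, both the ratio d/n and the success probability should sit close to $e^{P_A - 1}$, with the gap shrinking as n grows.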

4 The Two-Player Secretary Problem, Dominant vs. Subordinate

After a long period of work, the old secretary decided to retire. The person was left without a good secretary, and again he started looking for a new one.

However, another intelligent businessman has come into the town, and he is looking for a secretary too! Furthermore, that businessman pays much more money than our poor man, so that if both decide on the same secretary, he will certainly go to the other businessman.

As in the basic problem, the same assumptions are made: there are n secretaries, after a secretary is rejected that’s it for him, and previous secretaries will not help the person to predict anything about the following secretaries. There are no ties, and our person and the rival businessman interview their secretaries in the same order simultaneously, and they rate the secretaries identically. So, #1 for our person is #1 for the rival.

Let us first consider the strategy for the rival. We will call him the dominant player, and our person the subordinate player. If the dominant player wants a certain secretary, he will get it; the subordinate player can in no way interfere with whatever strategy the dominant player uses. Thus, the dominant player can ignore the subordinate player in planning his strategy, so his strategy is the same as our man’s strategy in the classic, one-player problem. Let us call the number of people he rejects unconditionally $D_D$, for Dominant.

The problem is, what is the subordinate player’s optimal strategy?

This is the only remaining problem.

First, to get some grasp of the problem, let us consider a trivial example: n = 3, with three potential secretaries. The dominant player’s strategy is of course to reject the first person and take the second person if he is better than the rejected person. This gives him a winning probability of $\frac{1}{2}$. In other words, $D_D = 1$. Let the number of people the subordinate player rejects as a sample be $S_D$. Let’s make a table for each arrangement of secretaries, where the best person is 1, the second-best is 2, and the worst is 3.

Arrangement   Dominant player                S_D = 0   S_D = 1   S_D = 2
1 2 3         Loses (forced to take last)    Wins      Loses     Loses
1 3 2         Loses (forced to take last)    Wins      Loses     Loses
2 1 3         Wins                           Loses     Loses     Loses
2 3 1         Wins                           Loses     Loses     Loses
3 1 2         Wins                           Loses     Loses     Loses
3 2 1         Loses (#2, suboptimal)         Loses     Wins      Wins
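The table above can be reproduced by brute force over all arrangements. The sketch below encodes one reading of the rules, stated here as assumptions: each player, once past his own rejection point, wants the first candidate he sees; when both want the same candidate the dominant player gets him; and a player who has hired leaves the game. The names `play` and `count_wins` are illustrative only.

```python
from itertools import permutations


def play(order, dd, sd):
    """Return (dominant_wins, subordinate_wins) for one interview order.

    order[i] is the rank of the i-th applicant (1 = best);
    dd and sd are the two players' rejection points.
    """
    best_so_far = None               # best (smallest) rank seen so far
    dom_done = sub_done = False
    dom_wins = sub_wins = False
    for pos, rank in enumerate(order):
        if best_so_far is not None and rank > best_so_far:
            continue                 # not a candidate, nobody wants him
        best_so_far = rank
        if not dom_done and pos >= dd:
            dom_done = True          # the dominant player has priority
            dom_wins = rank == 1
        elif not sub_done and pos >= sd:
            sub_done = True          # the subordinate takes what is left
            sub_wins = rank == 1
    return dom_wins, sub_wins


def count_wins(n, dd, sd):
    """Count winning arrangements (dominant, subordinate) over all n! orders."""
    dom = sub = 0
    for order in permutations(range(1, n + 1)):
        d, s = play(order, dd, sd)
        dom += d
        sub += s
    return dom, sub


for sd in range(3):
    print("S_D =", sd, "-> (dominant wins, subordinate wins) =", count_wins(3, 1, sd))
```

The counts per column match the table: the dominant player wins 3 of the 6 arrangements, and the subordinate player wins 2, 1 and 1 arrangements for $S_D$ = 0, 1 and 2.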

In this case, the optimal strategy for the subordinate player is clearly to take the first person he can get; any more and the dominant player completely dominates him, so to speak. In fact, we can state this as a little lemma:

Lemma 2. The subordinate player’s rejection point comes before the dominant player’s rejection point.

Before we prove this, we must derive a few important formulas concerning the probabilities of the subordinate player getting #1.

First, if the subordinate player decides just to reject the same number of people as the dominant player, what happens? The subordinate player wins if there are exactly two candidates after their rejection-points.

How can we find the probability of this?

Let’s look at a more general problem: assume that d people have been rejected and α people remain. What is the probability that there are exactly k candidates among the remaining α applicants?

We can easily start at k = 0. The probability is simply the probability that #1 has been rejected, or $\frac{d}{d+\alpha}$. Let $n = d+\alpha$, which is the total number of people.

Let’s define the probability as $I_k(d, \alpha)$. So, $I_0(d, \alpha) = \frac{d}{d+\alpha}$.

Now, let us consider this problem generally. For the general case of $I_k$, we can think of it this way: the last candidate must be #1. So, for each possible position of #1, we can think of all the applicants before #1 as another problem. #1 is always the last candidate, so in those remaining applicants success occurs when there are k−1 candidates. We can think of those remaining applicants as another problem; let the best person in those applicants be #1 for this new problem, and so on. We get another set of values to plug into $I_{k-1}$, and so on recursively.

Let’s now actually calculate the probability. For success, #1 must be in the last α positions.

Assume #1 is the (d+p)th applicant. Thus, for the new problem, success occurs when there are k−1 candidates from position d+1 to d+p−1. Thus the new problem is $I_{k-1}(d, p-1)$.

#1 is the (d+p)th applicant with probability $\frac{1}{d+\alpha}$. Thus, the resulting probability is:

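$$I_k(d, \alpha) = \frac{1}{d+\alpha}\bigl(I_{k-1}(d, 0) + I_{k-1}(d, 1) + \cdots + I_{k-1}(d, \alpha-1)\bigr) \tag{7}$$

This is a beautiful recursive formula, and using the trivial $I_0(d, \alpha) = \frac{d}{d+\alpha}$ we can obtain the familiar equation

$$I_1(d, \alpha) = \frac{1}{d+\alpha}\left(\frac{d}{d} + \frac{d}{d+1} + \cdots + \frac{d}{d+\alpha-1}\right) = \frac{d}{d+\alpha}\left(\frac{1}{d} + \frac{1}{d+1} + \cdots + \frac{1}{d+\alpha-1}\right)$$

and the not so familiar

$$\begin{aligned}
I_2(d, \alpha) &= \frac{1}{d+\alpha}\left(\frac{1}{d}(0) + \frac{1}{d+1}\left(\frac{d}{d}\right) + \cdots + \frac{1}{d+\alpha-1}\left(\frac{d}{d} + \frac{d}{d+1} + \cdots + \frac{d}{d+\alpha-2}\right)\right) \\
&= \frac{d}{d+\alpha}\left(\frac{1}{d(d+1)} + \frac{1}{d(d+2)} + \cdots + \frac{1}{(d+1)(d+2)} + \cdots + \frac{1}{(d+\alpha-2)(d+\alpha-1)}\right).
\end{aligned}$$

Now, we are all equipped to prove the lemma.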

Proof. Consider when the subordinate player’s rejection point is after the dominant player’s rejection point, and let the subordinate player’s rejection point be $S_D$. We will prove that, when $S_D$ is increased by 1, the probability of success decreases.

First, we must find the success probability of this strategy. The arrangements in which it succeeds can be classified into two cases:

1. #1 is after the subordinate player’s rejection point, there are no candidates between the two players’ rejection points, and there is one candidate between the subordinate player’s rejection point and #1.

2. #1 is after the subordinate player’s rejection point, there are one or more candidates between the two players’ rejection points, and there are no candidates between the subordinate player’s rejection point and #1.

Here are some examples of the two cases. Note that the bolder line is the dominant player’s rejection point, and the thinner one is the subordinate player’s rejection point.

[Figure: example of the first case with ten applicants.]

The first case occurs when #1 is the second candidate after the subordinate player’s rejection point and the best applicant before the subordinate player’s rejection point is before the dominant player’s rejection point. The probability is

$$\frac{D_D}{S_D}\, I_2(S_D, n-S_D) = \frac{D_D}{n}\left(\frac{1}{S_D(S_D+1)} + \frac{1}{S_D(S_D+2)} + \cdots + \frac{1}{(n-2)(n-1)}\right).$$

[Figure: example of the second case with ten applicants.]

The second case occurs when #1 is after the subordinate player’s rejection point, the best applicant before #1 is before the subordinate player’s rejection point, and the best applicant before the subordinate player’s rejection point is after the dominant player’s rejection point. These two probabilities can be strung together easily; the probability is

$$\frac{S_D - D_D}{n}\left(\frac{1}{S_D} + \frac{1}{S_D+1} + \cdots + \frac{1}{n-1}\right).$$

Before we continue, note that these formulas degenerate correctly when SD =DD.

Now, we observe the difference in the success probability when $S_D$ is changed to $S_D+1$. The probability of case 1 decreases by

$$\frac{D_D}{n}\left(\frac{1}{S_D(S_D+1)} + \frac{1}{S_D(S_D+2)} + \cdots + \frac{1}{S_D(n-1)}\right) = \frac{D_D}{S_D\cdot n}\left(\frac{1}{S_D+1} + \frac{1}{S_D+2} + \cdots + \frac{1}{n-1}\right).$$

The probability of case 2 changes a bit more subtly. $\frac{S_D-D_D}{n}$ increases by $\frac{1}{n}$, but the term $\frac{1}{S_D}$ disappears, so the net gain is

$$\frac{1}{n}\left(\frac{1}{S_D+1} + \frac{1}{S_D+2} + \cdots + \frac{1}{n-1}\right) - \frac{S_D-D_D}{S_D\cdot n}.$$

Combine the results, and we find the net gain is

$$\frac{S_D-D_D}{S_D\cdot n}\left(\frac{1}{S_D+1} + \frac{1}{S_D+2} + \cdots + \frac{1}{n-1}\right) - \frac{S_D-D_D}{S_D\cdot n} = \frac{S_D-D_D}{S_D\cdot n}\left(\frac{1}{S_D+1} + \frac{1}{S_D+2} + \cdots + \frac{1}{n-1} - 1\right).$$

Now, obviously, $\frac{S_D-D_D}{S_D\cdot n}$ is positive (or zero). The factor

$$\left(\frac{1}{S_D+1} + \frac{1}{S_D+2} + \cdots + \frac{1}{n-1} - 1\right)$$

is negative because, in the classical problem, $D_D$ was the largest integer such that

$$\left(\frac{1}{D_D} + \frac{1}{D_D+1} + \cdots + \frac{1}{n-1} - 1\right) > 0.$$

Since $S_D \ge D_D$, obviously $S_D + 1 > D_D$, and the above expression must be negative. See (4) if you don’t remember. Since the net gain is negative, increasing $S_D$ decreases the probability of success, and thus in an optimal solution $S_D \le D_D$.
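The recursion (7) for $I_k(d, \alpha)$ used in the proof above can be compared against exhaustive enumeration for a small case. The following is a sketch under the text's definitions; the function names and the test case d = 2, α = 4 are arbitrary choices made here, and d ≥ 1 is assumed throughout.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import permutations


@lru_cache(maxsize=None)
def candidates_prob(k, d, alpha):
    """I_k(d, alpha) via recursion (7), with I_0(d, alpha) = d / (d + alpha)."""
    if k == 0:
        return Fraction(d, d + alpha)
    # (1 / (d + alpha)) * (I_{k-1}(d, 0) + ... + I_{k-1}(d, alpha - 1))
    inner = sum((candidates_prob(k - 1, d, p) for p in range(alpha)), Fraction(0))
    return inner / (d + alpha)


def brute_force(k, d, alpha):
    """Fraction of orderings with exactly k candidates after the first d applicants."""
    hits = total = 0
    for perm in permutations(range(d + alpha)):   # larger value = better applicant
        best = max(perm[:d])
        count = 0
        for value in perm[d:]:
            if value > best:                      # better than everyone before him
                best = value
                count += 1
        hits += count == k
        total += 1
    return Fraction(hits, total)


d, alpha = 2, 4
for k in range(alpha + 1):
    print(k, candidates_prob(k, d, alpha), brute_force(k, d, alpha))
```

The two columns agree for every k, and the probabilities over k = 0, …, α sum to 1, as they must.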

Now we can calculate the probability of winning for the subordinate player, when he rejects the first SD people.

Let’s consider what the subordinate player will get before the dominant player’s rejection point. There are three possibilities: he wins by getting #1, he loses by getting a suboptimal applicant, or he does not get anybody.

He wins with probability $\frac{S_D}{n}\left(\frac{1}{S_D} + \frac{1}{S_D+1} + \cdots + \frac{1}{D_D-1}\right)$, and he does not get anybody if the best person before the dominant player’s rejection point is among those he rejected, which has probability $\frac{S_D}{D_D}$. Combining these probabilities with the results we have of $I_2$, the total probability of the subordinate player winning is:

$$\begin{aligned}
P_S &= \frac{S_D}{n}\left(\frac{1}{S_D} + \frac{1}{S_D+1} + \cdots + \frac{1}{D_D-1}\right) + \frac{S_D}{D_D}\, I_2(D_D, n-D_D) \\
&= \frac{S_D}{n}\left(\frac{1}{S_D} + \cdots + \frac{1}{D_D-1} + C_2\right),
\end{aligned}$$

where

$$C_2 = \frac{1}{D_D(D_D+1)} + \frac{1}{D_D(D_D+2)} + \cdots + \frac{1}{(n-2)(n-1)}.$$

Note that this last group of terms does not depend on $S_D$, so it is constant and we call it $C_2$. This monstrous expression, though constant, will likely be very hard to calculate. However, if you have a fairly advanced calculator, you can find that $C_2$ when n = 10 is $\frac{1063}{1440} \approx .7382$. But $C_2$ for n = 100 is about .49415.

Does $C_2$ for large n converge to $\frac{1}{2}$?

The first step is to convert it to a summation and then to an integral; note that one of the limits must be one more/less than the limits of summation because, if you think of the integral as a series of infinitesimal-width rectangles, there needs to be an extra rectangle at one end to represent the last term. Still, the difference of 1 is negligible for big n. The outer integral increases the upper limit, whereas the inner integral decreases the lower limit in this case. Aside from making the integral easier to evaluate, the resulting integral will be closer to the actual value, because one integral will make the answer slightly higher while the other will make it slightly lower, for large n.

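$$\begin{aligned}
C_2 &= \frac{1}{D_D(D_D+1)} + \frac{1}{D_D(D_D+2)} + \cdots + \frac{1}{(n-2)(n-1)} \\
&= \sum_{p_1=D_D}^{n-2}\ \sum_{p_2=p_1+1}^{n-1} \frac{1}{p_1 p_2} \\
&\approx \int_{D_D}^{n-1}\int_{q_1}^{n-1} \frac{1}{q_1 q_2}\,dq_2\,dq_1 \\
&= \int_{D_D}^{n-1} \frac{1}{q_1}\bigl(\ln(n-1) - \ln q_1\bigr)\,dq_1 \\
&= \left[\ln(n-1)\ln q_1 - \frac{(\ln q_1)^2}{2}\right]_{D_D}^{n-1} \\
&= \left(\ln(n-1)\ln(n-1) - \frac{(\ln(n-1))^2}{2}\right) - \left(\ln(n-1)\ln D_D - \frac{(\ln D_D)^2}{2}\right) \\
&= \frac{1}{2}(\ln(n-1))^2 - \ln D_D\left(\ln(n-1) - \frac{1}{2}\ln D_D\right) \\
&= \frac{1}{2}\left((\ln(n-1))^2 - \ln D_D\bigl(2\ln(n-1) - \ln D_D\bigr)\right) \\
&= \frac{1}{2}\left((\ln(n-1))^2 - 2\ln(n-1)\ln D_D + (\ln D_D)^2\right) \\
&= \frac{1}{2}\bigl(\ln(n-1) - \ln D_D\bigr)^2 \\
&\approx \frac{1}{2}\left(\ln n - \ln\frac{n}{e}\right)^2 \\
&= \frac{1}{2}\left(\ln\frac{n}{n/e}\right)^2 \\
&= \frac{1}{2}(\ln e)^2 \\
&= \frac{1}{2},
\end{aligned}$$

and our suspicions are confirmed.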
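The quoted values of $C_2$ and its limit can be checked numerically. The sketch below sums $\frac{1}{ij}$ over all pairs $D_D \le i < j \le n-1$ exactly for small n, and for large n uses the identity $\sum_{i<j} \frac{1}{ij} = \frac{1}{2}\bigl((\sum_i \frac{1}{i})^2 - \sum_i \frac{1}{i^2}\bigr)$, a shortcut introduced here that is not in the original text. The rejection points 3 (n = 10) and 37 (n = 100) are the classic optimal values, and rounding n/e for the large case is an approximation.

```python
from fractions import Fraction
import math


def c2_exact(n, dd):
    """Sum of 1/(i*j) over all pairs dd <= i < j <= n-1, as an exact fraction."""
    total = Fraction(0)
    for i in range(dd, n - 1):
        for j in range(i + 1, n):
            total += Fraction(1, i * j)
    return total


def c2_float(n, dd):
    """Same sum in floating point, via ((sum 1/i)^2 - sum 1/i^2) / 2."""
    s = sum(1.0 / i for i in range(dd, n))
    q = sum(1.0 / (i * i) for i in range(dd, n))
    return (s * s - q) / 2.0


print(c2_exact(10, 3), float(c2_exact(10, 3)))
print(float(c2_exact(100, 37)))
big_n = 100_000
print(c2_float(big_n, round(big_n / math.e)))   # should be close to 1/2
```

The exact double sum is only practical for small n, which is why the floating-point version avoids the nested loop.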

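Now we can use the same trick to find the optimal value: find the difference between successive terms.

$$P_S \approx \frac{S_D}{n}\left(\frac{1}{S_D} + \cdots + \frac{1}{D_D-1} + \frac{1}{2}\right)$$

$$\begin{aligned}
P_S(S_D+1) - P_S(S_D) &= \frac{S_D+1}{n}\left(\frac{1}{S_D+1} + \cdots + \frac{1}{D_D-1} + \frac{1}{2}\right) - \frac{S_D}{n}\left(\frac{1}{S_D} + \cdots + \frac{1}{D_D-1} + \frac{1}{2}\right) \\
&= \frac{1}{n}\left(\frac{1}{S_D+1} + \cdots + \frac{1}{D_D-1} + \frac{1}{2}\right) - \frac{S_D}{n}\cdot\frac{1}{S_D} \\
&= \frac{1}{n}\left(\frac{1}{S_D+1} + \cdots + \frac{1}{D_D-1} - \frac{1}{2}\right).
\end{aligned}$$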

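Now we can approximate the sum with an integral again! Also, let us denote the difference by ∆.

$$\begin{aligned}
\Delta P_S &\approx \frac{1}{n}\left(\int_{S_D+1}^{D_D}\frac{1}{p}\,dp - \frac{1}{2}\right) \\
&= \frac{1}{n}\left(\ln D_D - \ln(S_D+1) - \frac{1}{2}\right) \\
&\approx \frac{1}{n}\left(\ln D_D - \ln S_D - \frac{1}{2}\right) \\
&= \frac{1}{n}\left(\ln\frac{D_D}{S_D} - \frac{1}{2}\right)
\end{aligned}$$

$$n\,\Delta P_S = \ln\frac{D_D}{S_D} - \frac{1}{2}.$$

Solve for zero:

$$0 = \ln\frac{S_D}{D_D} + \frac{1}{2}$$

Raise e to the power of both sides:

$$1 = e^{\ln\frac{S_D}{D_D} + \frac{1}{2}} = e^{\ln\frac{S_D}{D_D}}\, e^{\frac{1}{2}} = \frac{S_D}{D_D}\, e^{\frac{1}{2}}, \qquad S_D = D_D\, e^{-\frac{1}{2}},$$

and plug in how $D_D$ converges to $\frac{n}{e}$:

$$S_D = \frac{n}{e^{3/2}} \tag{8}$$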

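That’s our final result! The ratio $S_D/n$ is about 22.31%. Finally, let us find the subordinate player’s probability of success.

$$\begin{aligned}
P_S &\approx \frac{S_D}{n}\left(\frac{1}{S_D} + \cdots + \frac{1}{D_D-1} + \frac{1}{2}\right) \\
&\approx \frac{S_D}{n}\left(\int_{S_D}^{D_D}\frac{1}{p}\,dp + \frac{1}{2}\right) \\
&\to e^{-3/2}\left(\int_{ne^{-3/2}}^{ne^{-1}}\frac{1}{p}\,dp + \frac{1}{2}\right) \\
&= e^{-3/2}\left(\ln\bigl(ne^{-1}\bigr) - \ln\bigl(ne^{-3/2}\bigr) + \frac{1}{2}\right) \\
&= e^{-3/2}\left(\frac{1}{2} + \frac{1}{2}\right) \\
&= e^{-3/2}.
\end{aligned}$$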

Note, interestingly, that the probability of success is the same as the rejection point, just as it was in the classic problem.
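The limiting values can be checked against the finite-n expression $P_S = \frac{S_D}{n}\left(\frac{1}{S_D} + \cdots + \frac{1}{D_D-1} + C_2\right)$ from this section. The sketch below scans all $S_D \le D_D$ for n = 2000 (an arbitrary choice) in floating point; the function names are illustrative, and $C_2$ is computed with the same pair-sum identity as a convenience that is not in the original text.

```python
import math


def classic_rejection_point(n):
    """Rejection point d >= 1 maximising (d/n) * (1/d + ... + 1/(n-1))."""
    tail, best_d, best_p = 0.0, 1, 0.0
    for d in range(n - 1, 0, -1):
        tail += 1.0 / d
        p = d / n * tail
        if p > best_p:
            best_d, best_p = d, p
    return best_d


def c2(n, dd):
    """Sum of 1/(i*j) over dd <= i < j <= n-1, via ((sum 1/i)^2 - sum 1/i^2) / 2."""
    s = sum(1.0 / i for i in range(dd, n))
    q = sum(1.0 / (i * i) for i in range(dd, n))
    return (s * s - q) / 2.0


def best_subordinate_point(n):
    """Return (D_D, best S_D, P_S at that S_D), scanning 1 <= S_D <= D_D."""
    dd = classic_rejection_point(n)
    const = c2(n, dd)
    head = 0.0                          # running value of 1/sd + ... + 1/(dd-1)
    best_sd, best_p = dd, dd / n * const
    for sd in range(dd - 1, 0, -1):
        head += 1.0 / sd
        p = sd / n * (head + const)
        if p > best_p:
            best_sd, best_p = sd, p
    return dd, best_sd, best_p


n = 2000
dd, sd, p = best_subordinate_point(n)
print(dd / n, sd / n, p, math.exp(-1.5))
```

Both $S_D/n$ and the success probability should land near $e^{-3/2} \approx 0.2231$, while $D_D/n$ stays near $1/e$.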

5 The Experience Problem

Now, we consider the Experience Problem, which is the result of extending the previous extension again, but giving the subordinate player more of an advantage. First, the problem statement.

As in the previous problems, there are n secretaries in total, and the subordinate player interviews them one by one. He interviews the first s applicants by himself. After the first s applicants have been interviewed, the dominant player steps in and, under the assumptions of the previous extension, the two players interview the remaining n−s secretaries together and simultaneously. This means that the dominant player will not be able to meet the first s applicants, although he knows of their existence.

First, we must find the strategy of the dominant player, which is no longer as trivial as in the previous extension. The dominant player wins if and only if both these conditions hold:

• #1 is one of the last n−s secretaries that the dominant player can interview.

• The dominant player gets the best applicant in the last n−s secretaries.

If both hold, the best applicant in the n−s would be #1, and the dominant player would receive him. Furthermore, the events are independent, and the dominant player only has control over the second event. To maximize its probability, he would therefore use the classic $\frac{n}{e}$ strategy for the people he can interview, and his rejection point would be $D_D = s + \frac{n-s}{e} = \frac{n+s(e-1)}{e}$.

When does the subordinate player win? Obviously one case occurs when the first candidate he meets is #1 before the dominant player even starts interviewing. Also, #1 could come after both players are looking for the next candidate; however, because the dominant player does not get to see s of the applicants, he may recognize some applicants as candidates that the subordinate player does not. We call these false candidates, and we call the candidates that actually are candidates from both players’ point of view true candidates. As a result, to remove the dominant player, there may be one true candidate (note that no false candidates can follow any true candidate), or many false candidates. The winning cases are therefore:

• #1 is the first candidate between the two players’ rejection points.

• There are no candidates between the two players’ rejection points; #1 is after the dominant player’s rejection point; and either of:

  – Between the dominant player’s rejection point and #1, there are at least one false candidate and no true candidates.

  – Between the dominant player’s rejection point and #1, there are one true candidate and no false candidates. (Any false candidates would have to precede the true candidate and remove the dominant player from the game; then, the subordinate player would take the true, suboptimal candidate.)

The first case, where #1 is the first candidate between the two players’ rejection points, is simple, because we can ignore the dominant player:

$$\frac{S_D}{n}\left(\frac{1}{S_D} + \frac{1}{S_D+1} + \cdots + \frac{1}{D_D-1}\right) \approx \frac{S_D}{n}\left(\ln D_D - \ln S_D\right)$$

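The next case is the one in which the dominant player receives a false candidate. For success in this case, the best applicant before #1 cannot be seen by the dominant player, and must be part of the first s people. Also, the best applicant before #1 and after the first s people must be after the dominant player’s rejection point, ensuring that he will see a false candidate. Thus, again summing for each position where #1 could be, we obtain

$$\begin{aligned}
&\frac{1}{n}\left(\frac{s}{D_D+1}\cdot\frac{1}{D_D+1-s} + \frac{s}{D_D+2}\cdot\frac{2}{D_D+2-s} + \cdots + \frac{s}{n-1}\cdot\frac{n-1-D_D}{n-1-s}\right) \\
&\quad\approx \frac{s}{n}\int_{D_D+1}^{n}\frac{p-D_D}{p(p-s)}\,dp \\
&\quad= \frac{s}{n}\int_{D_D+1}^{n}\left(\frac{D_D/s}{p} + \frac{1 - D_D/s}{p-s}\right)dp \\
&\quad\approx \frac{s}{n}\left(\frac{D_D}{s}\left(\ln n - \ln D_D\right) + \left(1 - \frac{D_D}{s}\right)\bigl(\ln(n-s) - \ln(D_D-s)\bigr)\right) \\
&\quad= \frac{s}{n}\left(\frac{D_D}{s}\bigl(\ln n - \ln(n+s(e-1)) + 1\bigr) + \left(1 - \frac{D_D}{s}\right)\right) \\
&\quad= \frac{s}{n}\left(\frac{D_D}{s}\bigl(\ln n - \ln(n+s(e-1))\bigr) + 1\right) \\
&\quad= \frac{D_D}{n}\bigl(\ln n - \ln(n+s(e-1))\bigr) + \frac{s}{n}.
\end{aligned}$$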

The last case is the same as in the second extension: #1 must be the second candidate after the dominant player’s rejection point, so the probability is also the same:

$$\frac{S_D}{2n}$$

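Now, of course, we find the difference between successive terms.

$$\begin{aligned}
&\left[\frac{S_D+1}{n}\bigl(\ln D_D - \ln(S_D+1)\bigr) + \frac{S_D+1}{2n} + \frac{D_D}{n}\bigl(\ln n - \ln(n+s(e-1))\bigr) + \frac{s}{n}\right] \\
&\quad - \left[\frac{S_D}{n}\bigl(\ln D_D - \ln S_D\bigr) + \frac{S_D}{2n} + \frac{D_D}{n}\bigl(\ln n - \ln(n+s(e-1))\bigr) + \frac{s}{n}\right] \\
&= \frac{\ln D_D}{n} + \frac{1}{2n} - \frac{1}{n}\bigl((S_D+1)\ln(S_D+1) - S_D\ln S_D\bigr) \\
&= \frac{\ln D_D}{n} + \frac{1}{2n} - \frac{1}{n}\ln\frac{(S_D+1)^{S_D+1}}{S_D^{\,S_D}} \\
&= \frac{\ln D_D}{n} + \frac{1}{2n} - \frac{1}{n}\ln\left(\frac{S_D+1}{S_D}\right)^{S_D} - \frac{1}{n}\ln(S_D+1) \\
&\approx \frac{1}{n}\left(\ln D_D - \ln S_D - \frac{1}{2}\right).
\end{aligned}$$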

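Solving for zero is straightforward.

$$\begin{aligned}
0 &= \frac{1}{n}\left(\ln D_D - \ln S_D - \frac{1}{2}\right) \\
0 &= \ln D_D - \ln S_D - \frac{1}{2} \\
\ln S_D &= \ln D_D - \frac{1}{2} \\
S_D &= D_D\, e^{-\frac{1}{2}}.
\end{aligned}$$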

This is actually the same result that we arrived at in the previous section. Why? The reason is actually quite simple: the only difference in probability is the case where the dominant player meets a false candidate, and the probability of success here is constant. As a result, the optimal rejection point is the same, but not the probability of success. The difference in the results is simply because the dominant player has decided to follow a strategy that is not optimal from the subordinate player’s point of view.

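Finally, we plug back and solve the Experience problem!

$$\begin{aligned}
&\frac{S_D}{n}\left(\ln D_D - \ln S_D\right) + \frac{S_D}{2n} + \frac{D_D}{n}\bigl(\ln n - \ln(n+s(e-1))\bigr) + \frac{s}{n} \\
&\quad= \frac{D_D e^{-\frac{1}{2}}}{n}\left(\ln D_D - \ln\bigl(D_D e^{-\frac{1}{2}}\bigr)\right) + \frac{D_D e^{-\frac{1}{2}}}{2n} + \frac{D_D}{n}\bigl(\ln n - \ln(n+s(e-1))\bigr) + \frac{s}{n} \\
&\quad= \frac{D_D e^{-\frac{1}{2}}}{2n} + \frac{D_D e^{-\frac{1}{2}}}{2n} + \frac{D_D}{n}\bigl(\ln n - \ln(n+s(e-1))\bigr) + \frac{s}{n} \\
&\quad= \frac{D_D e^{-\frac{1}{2}}}{n} + \frac{D_D}{n}\bigl(\ln n - \ln(n+s(e-1))\bigr) + \frac{s}{n}.
\end{aligned}$$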
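The closed form above is easy to evaluate numerically. The sketch below simply codes the final expression as stated, with the dominant player's rejection point $D_D = \frac{n+s(e-1)}{e}$ taken from earlier in this section; the function name `experience_success` is an illustrative choice, and the code checks only internal consistency (s = 0 should recover the $e^{-3/2}$ of the previous section), not the derivation itself.

```python
import math


def experience_success(n, s):
    """Large-n success probability of the subordinate player, as given by the final formula."""
    dd = (n + s * (math.e - 1)) / math.e      # dominant player's rejection point D_D
    return (
        dd * math.exp(-0.5) / n
        + dd / n * (math.log(n) - math.log(n + s * (math.e - 1)))
        + s / n
    )


n = 1000
for s in (0, 100, 200):
    print(s, experience_success(n, s))
print("e^(-3/2) =", math.exp(-1.5))
```

With s = 0 the middle and last terms vanish and the value is exactly $e^{-3/2}$; for s = 100 the formula already gives a noticeably higher value, reflecting the subordinate player's head start.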

References

[1] N. Immorlica, R. Kleinberg, and M. Mahdian. Secretary Problems with Competing Employers. Proceedings of the 2nd Workshop on Internet and Network Economics, pages 389–400, 2006.

[2] T. S. Ferguson. Who solved the Secretary Problem? Statistical Science, Vol. 4, No. 3, 282–296, 1989.
