
Chapter 4 Structural-reduction Approach

4.3 An Illustrative Example of the Pessimistic Situation

To clarify the key idea of the pessimistic situation (worst case) of 3×n AB games discussed above, a 3×20 AB game, i.e., a 3×n AB game with n = 20, is taken as an illustrative example. The scenario is shown in Figure 14. Suppose that the set of symbols is S = {c0, c1, …, c19}. In the first ply, the codebreaker makes the first query, c0c1c2, and the codemaker offers [0, 0] as the first response, which is the worst-case response. Since none of the three queried symbols occurs in the secret code, the 3×20 AB game reduces to a 3×h AB game, where h = 17. Similar operations proceed at the second and third queries. After the third query and third response, the original 3×20 AB game reduces to a 3×11 AB game.

The minimum number of queries cannot easily be obtained analytically when h ≤ 11 because of the irregular behavior of such small games. Hence, the branch-and-bound search algorithm proposed in Chapter 2 is applied to find an optimal strategy for smaller h.

Figure 14. The scenario of the pessimistic situation of a 3×20 AB game

4.4 Chapter Conclusion

From the above discussions, the optimal query for the codebreaker and the adversarial response for the codemaker, which is also the worst case for the codebreaker, are obtained by considering the special state $C^*$. In what follows, all of the above results are combined to derive a theorem.

Theorem 2. For a 3×n AB game, the minimum number of queries for the codebreaker in the worst case is

$$
\begin{cases}
\lfloor n/3 \rfloor + 3, & \text{if } 3 \le n \le 7, \\
\lfloor (n+1)/3 \rfloor + 3, & \text{if } n \ge 8.
\end{cases}
$$

Proof. At the beginning of a 3×n AB game, none of the n symbols has been used, so all secret codes are equivalent. As a result, any secret code can be chosen as the first query of the codebreaker. Nine substates are therefore produced, and [0, 0] is taken as the adversarial response according to Lemma 6. Afterwards, the state $C_{[0,0]}$, which results from the first response, has the properties of the special state $C^*$ described in Lemma 5, so Lemma 5 can be applied to it. The situations described in Lemma 5 and Lemma 6 then appear alternately in the rest of the game, which gives the following recurrence, where $T(n)$ denotes the minimum number of queries for the 3×n AB game in the worst case.

$$
T(n) = T(n-3) + 1, \quad \text{when } n > 11.
$$

Because of the irregular behavior of a 3×n AB game with a small value of n, the minimum number of queries for n ≤ 11 is obtained with the branch-and-bound search algorithm of Chapter 2. Computer programs based on this approach took several hours to find the minimum numbers of queries required by the codebreaker in the worst case: 4, 4, 4, 5, 5, 6, 6, 6, and 7 for n = 3, 4, 5, 6, 7, 8, 9, 10, and 11, respectively. For example, consider an optimal strategy for the 3×7 AB game with S = {0, 1, 2, 3, 4, 5, 6}. If the codemaker takes 165 as the secret code, a gaming process in the worst case is as follows: 012, [0, 1], 023, [0, 0], 041, [0, 1], 156, [1, 2], 165, [3, 0]. In other words, the codebreaker requires 5 queries to identify 165 while playing the worst-case optimal strategy.
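The responses in the 3×7 example can be checked mechanically. The following Python sketch is only an illustration: the function name `ab_response` is an assumption, and it relies on the rule that symbols do not repeat within a code, so that A counts right symbols in the right position and B counts right symbols in the wrong position.

```python
def ab_response(secret: str, query: str) -> tuple[int, int]:
    """Return (A, B) for codes without repeated symbols.

    A = symbols that match in value and position,
    B = symbols that occur in both codes but at different positions.
    """
    a = sum(s == q for s, q in zip(secret, query))
    b = len(set(secret) & set(query)) - a
    return a, b


# Replay the gaming process quoted in the text for the secret code 165.
secret = "165"
queries = ["012", "023", "041", "156", "165"]
transcript = [(q, ab_response(secret, q)) for q in queries]

for query, (a, b) in transcript:
    print(f"{query} -> [{a}, {b}]")
```

The last response, [3, 0], is the one that ends the game, so the transcript has exactly five queries.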

Combining the above recurrence with the results for the smaller values of n yields the following closed form.

$$
\begin{cases}
\lfloor n/3 \rfloor + 3, & \text{if } 3 \le n \le 7, \\
\lfloor (n+1)/3 \rfloor + 3, & \text{if } n \ge 8.
\end{cases}
$$

This completes the proof.

Partial results of 3×n AB games, 3 ≤ n ≤ 16, are summarized in Table 12. As 3×n AB games have been solved successfully, a natural generalization is to explore the techniques for m×n AB games, where m ≥ 4. This problem remains open.

Table 12. The minimum number of queries for 3×n AB games in the worst case

n             3  4  5  6  7  8  9  10  11  12  13  14  15  16
# of queries  4  4  4  5  5  6  6  6   7   7   7   8   8   8
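The closed form of Theorem 2 can be cross-checked against Table 12 and against the recurrence used in the proof. The sketch below reads the theorem as ⌊n/3⌋ + 3 for 3 ≤ n ≤ 7 and ⌊(n+1)/3⌋ + 3 for n ≥ 8; the function name `worst_case_queries` is illustrative, not taken from the source.

```python
def worst_case_queries(n: int) -> int:
    """Closed form of Theorem 2 for the 3×n AB game (n >= 3)."""
    if n < 3:
        raise ValueError("the 3×n AB game needs at least 3 symbols")
    if n <= 7:
        return n // 3 + 3
    return (n + 1) // 3 + 3


# Values of Table 12, keyed by n.
TABLE_12 = {3: 4, 4: 4, 5: 4, 6: 5, 7: 5, 8: 6, 9: 6, 10: 6,
            11: 7, 12: 7, 13: 7, 14: 8, 15: 8, 16: 8}

matches_table = all(worst_case_queries(n) == q for n, q in TABLE_12.items())

# Recurrence from the proof: T(n) = T(n - 3) + 1 when n > 11.
satisfies_recurrence = all(
    worst_case_queries(n) == worst_case_queries(n - 3) + 1
    for n in range(12, 200)
)

print(matches_table, satisfies_recurrence, worst_case_queries(20))
```

For the 3×20 game of Section 4.3, the value agrees with three reducing queries followed by the 7 queries needed for the 3×11 game.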

Chapter 5 Optimization Algorithm and Verification Algorithm

This chapter introduces two algorithms, the two-phase optimization algorithm (TPOA) and the pigeonhole-principle-based verification algorithm (PPV), to investigate the AB game with an unreliable response. TPOA was proposed by us in [17] and was shown to be an effective approximate algorithm for deductive games. PPV is a slight modification of the pigeonhole-principle-based fast backtracking algorithm in [37], which was also presented by us. Section 5.1 gives a comprehensive introduction to the problem and redefines some notations to match its properties. Section 5.2 introduces TPOA and reports its performance. In Section 5.3, PPV is illustrated and the verified results are shown. Section 5.4 summarizes our remarks.

5.1 Introduction

In this chapter, a variant of AB game, called AB game with an unreliable response, is presented. The game is the same as the 4×10 AB game except that the concept of fault tolerance is added. In other words, there is one additional rule: the codemaker is allowed to give at most one wrong response. For example, if the codemaker chooses “2134” as the secret code and the codebreaker makes the query “0123”, then answering [1, 0] instead of [1, 2] is a wrong response. Furthermore, the termination criterion of the game is modified to fit the setting of fault tolerance: the game is over as soon as only one eligible code remains. In short, the codebreaker does not have to guess the secret code explicitly; it suffices to determine it.

AB game with an unreliable response was previously studied by us [37]. The results show that the upper bound on the number of queries required in this game is 9, while the lower bound is 8. Since the two bounds do not coincide, two more effective algorithms are presented in this chapter to determine the exact value.

[Figure 15: game tree. The root is the state 〈{0, 1, 2}, {}〉 with first query g_1,2 = 1; the three responses <, =, and > lead to the states 〈{0}, {1, 2}〉, 〈{1}, {0, 2}〉, and 〈{2}, {0, 1}〉, which are expanded by further queries until each leaf holds a single eligible code.]

Figure 15. A game tree for the 1×3 game with an unreliable response

In order to describe the problem and our proposed methods precisely, we redefine some notations, which may have been defined in Chapter 1, to match the properties of AB game with an unreliable response. A simple number-guessing game, denoted the 1×n game with an unreliable response, is taken as an illustrative example to explain these new notations. In the 1×n game with an unreliable response, the codemaker chooses a secret code c ∈ {0, 1, 2, …, n − 1}. After each query g made by the codebreaker, the codemaker gives a response r ∈ {<, =, >}, which stands for g < c, g = c, or g > c, respectively. The codemaker is allowed to give at most one wrong response in this game. The goal of the game is to obtain the secret code by using as few queries as possible. The gaming process can be represented as a game-tree search. For instance, a game tree for the 1×3 game with an unreliable response, consisting of internal nodes and leaves, is shown in Figure 15.

Definition 15. The state $\langle C_i^{(0)}, C_i^{(1)} \rangle$ consists of two sets, which are composed of eligible codes after the codebreaker makes the i-th query. The first set $C_i^{(0)}$ is the set of secret codes which satisfy all previous responses, and $C_i^{(1)}$ represents the set of secret codes which satisfy all but one of the previous responses. For example, the root in Figure 15 is $\langle \{0, 1, 2\}, \{\} \rangle$, which indicates that the elements in $C_0^{(0)}$ are 0, 1, and 2 while $C_0^{(1)}$ is an empty set.

Definition 16. A weight, $(|C_i^{(0)}|, |C_i^{(1)}|)$, is a pair of natural numbers. The first number is the size of the set $C_i^{(0)}$ and the second number is the size of the set $C_i^{(1)}$. For instance, the weight of the root in Figure 15 is (3, 0).

Definition 17. The query $g_{i,j}$ made by the codebreaker means that the query is the j-th choice among all valid queries with respect to the current state and (i−1) queries have been made previously. In Figure 15, “$g_{3,2} = 1$” means that it is the third query and the query is 1.

Definition 18. There are 14 legal responses in AB game. After the codebreaker makes the (i+1)-th query and the (i+1)-th response offered by the codemaker is j, this query divides each set of the current state $\langle C_i^{(0)}, C_i^{(1)} \rangle$ into 14 subsets, $R_{i+1,j}^{(0)}$ and $R_{i+1,j}^{(1)}$, $j = 1, 2, \ldots, 14$. In other words,

$$
\bigcup_{j=1}^{14} R_{i+1,j}^{(0)} = C_i^{(0)} \quad \text{and} \quad \bigcup_{j=1}^{14} R_{i+1,j}^{(1)} = C_i^{(1)}.
$$
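As a quick check of the count in Definition 18, the following Python sketch partitions all codes of the 4×10 AB game by their response to one fixed query. It assumes that a code is a sequence of 4 distinct digits out of 10, as in the “2134”/“0123” example of Section 5.1; the helper names are illustrative only.

```python
from itertools import permutations


def ab_response(code, query):
    """Return (A, B) for two codes without repeated symbols."""
    a = sum(c == q for c, q in zip(code, query))
    b = len(set(code) & set(query)) - a
    return a, b


# All codes of the 4×10 AB game: 4 distinct digits out of 10.
codes = list(permutations(range(10), 4))
query = (0, 1, 2, 3)

# Partition the codes into response classes with respect to the query.
classes = {}
for code in codes:
    classes.setdefault(ab_response(code, query), []).append(code)

print(len(codes), "codes,", len(classes), "response classes")
for response in sorted(classes):
    print(list(response), len(classes[response]))
```

The response [3, 1] never appears: if three symbols are in the right position, the fourth cannot be a right symbol in a wrong position.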

Definition 19. A final state is a state $\langle C_i^{(0)}, C_i^{(1)} \rangle$ with $|C_i^{(0)}| + |C_i^{(1)}| = 1$. In other words, only one eligible code remains in the final state and the game is over.

From the above definitions, the accurate relation of the states in each ply can be derived. Suppose that the codemaker offers j as the (i+1)-th response after the (i+1)-th query. The codebreaker has to consider whether the response j is correct or not. Hence, there are two possible cases discussed below.

If the response is correct, the sets we have to consider are $R_{i+1,j}^{(0)}$ and $R_{i+1,j}^{(1)}$.

If the response is wrong, we need to consider the set $\bigcup_{1 \le p \le 14,\, p \ne j} R_{i+1,p}^{(0)}$.

Before the game starts, we know that $C_0^{(0)}$ is the set that contains all valid secret codes and $C_0^{(1)} = \emptyset$. From the two cases discussed above, we have the following relations.

$$
C_{i+1}^{(0)} = R_{i+1,j}^{(0)},
$$

$$
C_{i+1}^{(1)} = R_{i+1,j}^{(1)} \cup \left( \bigcup_{1 \le p \le 14,\, p \ne j} R_{i+1,p}^{(0)} \right).
$$

During the gaming process, the secret codes which violate exactly one of the previous responses are moved from $C_i^{(0)}$ to $C_i^{(1)}$. If a secret code in $C_i^{(1)}$ violates a response again in the future, it does not have to be considered in the following plies.
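The state-update relations can be exercised on the 1×3 game of Figure 15. The following Python sketch is a minimal illustration; the function names `response_1xn`, `next_state`, and `weight` are assumptions introduced here, and the response convention follows Section 5.1 ("<" means that the query is smaller than the secret code).

```python
def response_1xn(secret: int, query: int) -> str:
    """Truthful response in the 1×n game: '<' means query < secret."""
    if query < secret:
        return "<"
    if query > secret:
        return ">"
    return "="


def next_state(c0: set, c1: set, query: int, observed: str):
    """Apply one query/response pair to the state <c0, c1>.

    Codes in c0 that agree with the observed response stay in c0.
    Codes in c0 that disagree are explained by one lie and move to c1.
    Codes in c1 that disagree would need a second lie and are dropped.
    """
    r0 = {c for c in c0 if response_1xn(c, query) == observed}
    r1 = {c for c in c1 if response_1xn(c, query) == observed}
    return r0, r1 | (c0 - r0)


def weight(c0: set, c1: set):
    """Weight of a state as in Definition 16."""
    return len(c0), len(c1)


root = ({0, 1, 2}, set())
children = {r: next_state(*root, query=1, observed=r) for r in "<=>"}
for r, state in children.items():
    print(r, state, weight(*state))

# Asking 1 again and hearing '<' again leaves a single eligible code.
leaf = next_state(*children["<"], query=1, observed="<")
print(leaf, weight(*leaf))
```

A state is final in the sense of Definition 19 when the two entries of its weight sum to 1, which is the case for `leaf`.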

5.2 Two-Phase Optimization Algorithm

The two-phase optimization algorithm (TPOA) was originally proposed by us to solve Mastermind [17]. It is an approximate algorithm that can discover high-quality results. TPOA can also be thought of as a general improver for heuristic strategies: given a heuristic, TPOA has a higher chance of obtaining results better than those obtained by the heuristic alone. Moreover, it can sometimes achieve near-optimal results that are difficult to find with the given heuristic.

In this section, we apply TPOA to find an upper bound on the number of queries for AB game with an unreliable response. We first review the properties of TPOA and the hash collision groups used in TPOA. Second, a well-designed hashing function and the heuristic of evaluation are provided. Finally, TPOA is applied to the game.

5.2.1 The Structure of TPOA

The search tree of TPOA, abbreviated to TPOA tree, is divided into two phases, exploration and exploitation. The objective of the exploration phase is to discover promising partial solutions, while the exploitation phase chooses the way that leads each partial solution to a “best” complete solution. Two parameters, the branching factor k and the exploration depth d, decide how large a search space TPOA explores. That is, the parameters determine how many potential (promising) solutions TPOA will exploit.

In the previous study [17], we presented two versions of TPOA, TPOA⁺(k, d) and TPOA^*(k, d). Because a larger search space may be required to get a better upper bound for the game, only TPOA⁺(k, d) is adopted to investigate our problem. TPOA⁺(k, d) denotes TPOA with a branching factor of k and an exploration depth of d. The TPOA⁺(k, d) tree is shown in Figure 16. Given a TPOA tree with an arbitrary height h, after level d the algorithm performs a greedy search from each node on. The number of potential solutions exploited in a TPOA⁺(k, d) tree is k^d.

[Figure 16: the TPOA⁺(k, d) tree. In the exploration phase (depth d) every node has k branches; in the exploitation phase (depth h − d) every node has a single branch.]

Figure 16. The construction for TPOA⁺ (k, d) tree

The structure and properties of TPOA are now described. Given parameters (k, d), the sketch of a recursive procedure for TPOA is shown in Figure 17. TPOA can be implemented by a modified exhaustive depth-first search on a TPOA tree. The main modification to depth-first search is that at each visited node in the exploration phase (within depth d), only b branches are considered and the other branches are ignored. In Figure 17, TPOA⁺ has a fixed b (= k) in the exploration phase, as shown in line 3. In the exploitation phase, TPOA⁺ has a fixed b = 1, as shown in line 4. Therefore, TPOA⁺(k, d) is able to prune a huge search space to a manageable size k^d, as shown in Figure 16. For AB game, since the 14 response nodes at each level should be kept, the search space is reduced to (14×k)^d.
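To give a feeling for the two sizes quoted above, the short Python sketch below evaluates k^d and (14×k)^d for the (k, d) pairs that appear later in Table 13. The function names are illustrative; only the two formulas come from the text.

```python
def exploited_solutions(k: int, d: int) -> int:
    """Number of potential solutions exploited in a TPOA+(k, d) tree: k^d."""
    return k ** d


def response_tree_bound(k: int, d: int, responses: int = 14) -> int:
    """Search-space size when all response nodes are kept: (responses * k)^d."""
    return (responses * k) ** d


# (k, d) pairs taken from Table 13.
for k, d in [(1, 1), (2, 6), (3, 3), (5, 4), (5, 5), (7, 7)]:
    print(f"k={k}, d={d}: k^d = {exploited_solutions(k, d)}, "
          f"(14k)^d = {response_tree_bound(k, d)}")
```

The second quantity grows much faster than the first, which is one reason the early-termination rule used in the experiments matters in practice.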

    TPOA(k, d, b, c) {                       // k, d: the given constants
 1    l = Current_level();                   // get the current level in the TPOA tree
 2    If (c is a complete solution) Then Return c;
 3    If (l < d) Then b = k;                 // in the exploration phase
 4    Else b = 1;                            // in the exploitation phase
 5    For (each move m ∈ M)                  // M: the set of all next potential moves
 6      i = Hash(m);                         // classify possible next moves to HCGs by a hash function
 7      HCG_i ← HCG_i ∪ {m};
 8    B = {HCG_j | HCG_j is among the top b groups that could obtain promising results};
                                             // B: the set of b selected HCGs
 9    For (each HCG_i ∈ B)
10      c_i = Choose(HCG_i);                 // c_i: the selected representative for HCG_i
11      C = C ∪ {c_i};                       // C: the set of b representatives c_i in B
12    S ← ∅;                                 // S: the set of potential solutions from descendant nodes
13    For (each c_i ∈ C)                     // recursively b-way search to find the best solution from descendant nodes
14      s_i = TPOA(k, d, b, c_i);
15      S ← S ∪ {s_i};
16    c = Max_{s_i ∈ S} (eval(s_i));         // select the best solution discovered in S
17    Return c;                              // return c to the parent node
    }

Figure 17. The sketch of TPOA
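The control flow of Figure 17 can be tried out on a toy problem. The Python sketch below is not the authors' implementation: the number-triangle problem, the names `tpoa`, `moves`, and `value`, and the choice of the immediate cell value as hash value are all assumptions made for illustration. It is a single-agent search, so it does not include the 14 response nodes that the game version has to keep at each level.

```python
from collections import defaultdict

# Toy problem (hypothetical): walk from the top of the triangle to the bottom,
# moving to the cell below or below-right, and maximize the sum of the cells.
TRIANGLE = [[1], [2, 3], [9, 1, 1]]


def moves(path):
    """Column indices reachable in the next row from the last chosen cell."""
    last = path[-1]
    return [last, last + 1]


def value(path):
    """Evaluation of a (partial or complete) solution: sum of visited cells."""
    return sum(TRIANGLE[row][col] for row, col in enumerate(path))


def tpoa(k, d, path=(0,)):
    level = len(path) - 1                   # current level in the TPOA tree
    if len(path) == len(TRIANGLE):          # complete solution
        return path
    b = k if level < d else 1               # exploration vs. exploitation phase
    groups = defaultdict(list)              # hash collision groups
    row = len(path)
    for m in moves(path):
        groups[TRIANGLE[row][m]].append(m)  # hash value = immediate cell value
    best_keys = sorted(groups, reverse=True)[:b]      # top b groups
    representatives = [groups[key][0] for key in best_keys]
    solutions = [tpoa(k, d, path + (m,)) for m in representatives]
    return max(solutions, key=value)        # best solution among descendants


greedy = tpoa(k=1, d=0)      # b = 1 everywhere, i.e. plain greedy search
improved = tpoa(k=2, d=1)    # two branches at the first level, then greedy
print(greedy, value(greedy))
print(improved, value(improved))
```

With k = 1 and d = 0 the procedure degenerates to the greedy heuristic, which is lured by the 3 in the second row; one level of exploration with k = 2 is enough to find the path through 2 and 9.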

Given two constants (k, d), the time complexity of TPOA⁺(k, d), in terms of the number of nodes exploited, is k^d (h − d), where h is the height of the game tree, i.e., the number of queries required in the worst case. This means that no matter how large a problem instance is, TPOA can always obtain an approximate result if the parameters (k, d) are selected appropriately. Furthermore, depending on the execution time and space allowed, the values of (k, d) can be increased to approach the optimal result. The fundamental components of TPOA are summarized as follows:

A constructive heuristic for the problem at hand

A hash function according to the heuristic

Two parameters (k, d) to decide how large the search space TPOA intends to explore

5.2.2 Hash Collision Groups

In TPOA, how to select the (most likely) best b next potential components is a critical issue. The problem can be solved effectively and efficiently by a clustering approach. TPOA performs clustering using the concept of hash collision groups [14], abbreviated to HCGs. Similar next potential components of solutions are clustered together in an HCG by a hash function designed for the problem at hand. That is, the potential components with the same hash value are clustered together. Section 5.2.3 will give detailed examples of how the clustering mechanism works. Properties of HCGs are now described. Figure 18 illustrates the relation between HCGs and equivalence classes in a search space of next potential components.

There are several advantages of using HCGs in TPOA. The important properties of HCGs include:

Two components in the same HCG are most likely equivalent. Conversely, two equivalent components are definitely in the same HCG.

Given a hash function, it is efficient to obtain the b best HCGs.

Without loss of generality, an arbitrary component can be chosen to represent its HCG.

Therefore, TPOA is able to efficiently and effectively select the b “best” representatives among all next potential components. From another point of view, if an evaluation function is used in TPOA, each HCG can be regarded as a set of next potential components that tie on the return value of the function. Note that most ties are between equivalent components, and equivalent components always produce ties.

[Figure 18: diagram relating the components, their equivalence classes, and the HCGs.]

Figure 18. The relation between HCGs and equivalence classes

5.2.3 TPOA for AB game with an Unreliable Response

In this section, TPOA is applied to our problem, AB game with an unreliable response. Figure 19 shows the game tree obtained by applying TPOA to this problem. In the figure, $\langle C_{i,j}^{(0)}, C_{i,j}^{(1)} \rangle$ is the j-th state, i.e., the j-th class (response), after the i-th query, and $g_{i,j}$ is the j-th among the k best codes chosen by TPOA at the i-th query.

According to the hashing function, which is presented in Section 5.2.4, all valid queries are categorized into several HCGs, and the representative of each HCG is evaluated in order to select the k best codes as the explored queries. The designed hashing function and the heuristic of evaluation are described in detail in the next subsection.

In the beginning, the initial state is the root of the game tree in Figure 19, in which all 5040 codes are eligible because no response has been given yet. When the codebreaker considers the first query, TPOA chooses the k best codes, $g_{1,1}, g_{1,2}, \ldots, g_{1,k}$, to conduct the search. After that, 14 classes have to be expanded for each query since the codemaker has 14 legal responses. Then the codebreaker again selects the k best queries to expand the game tree after the first response is determined. The two steps alternate until a final state is met. At a final state, the program backtracks to the parent node and continues to expand the other branches.

[Figure 19: the game tree expanded by TPOA. The root is the initial state $\langle C_0^{(0)}, C_0^{(1)} \rangle$. In the exploration phase, k queries $g_{i,1}, g_{i,2}, \ldots, g_{i,k}$ are expanded at each state, and each query leads to 14 classes (responses such as [4,0], [3,0], [2,2], [2,1], …, [0,0]) with states $\langle C_{i,1}^{(0)}, C_{i,1}^{(1)} \rangle, \ldots, \langle C_{i,14}^{(0)}, C_{i,14}^{(1)} \rangle$. In the exploitation phase, after depth d, a single query $g_{d+1,1}$ is expanded at each state. Legend: $C_i^{(0)}$ is the set of eligible codes which satisfy all previous responses; $C_i^{(1)}$ is the set of eligible codes which satisfy all but one of the previous responses.]

Figure 19. The game tree expanded by TPOA

5.2.4 The Hashing Function and the Heuristic of Evaluation

Now, a hashing function is designed carefully and a simple heuristic proposed by Barteld [6] is utilized to cooperate with TPOA. Although the two methods are uncomplicated, they are adequate to solve our problem.

Hashing function for TPOA:

Suppose that a state $\langle C_i^{(0)}, C_i^{(1)} \rangle$ is given. Let the sizes of the 14 response classes (states) resulting from $C_i^{(0)}$ after a query g be

$$
S_g^{(0)} = \left( |R_{i+1,1}^{(0)}|, |R_{i+1,2}^{(0)}|, \ldots, |R_{i+1,14}^{(0)}| \right),
$$

while the sizes of the 14 response classes (states) resulting from $C_i^{(1)}$ after the query g are

$$
S_g^{(1)} = \left( |R_{i+1,1}^{(1)}|, |R_{i+1,2}^{(1)}|, \ldots, |R_{i+1,14}^{(1)}| \right).
$$

Afterwards, the hash function sorts the two original sequences, $S_g^{(0)}$ and $S_g^{(1)}$, into nonincreasing sequences, $\hat{S}_g^{(0)}$ and $\hat{S}_g^{(1)}$, independently. The hash function is therefore defined as follows:

$$
\mathrm{Hash}\left( S_g^{(0)}, S_g^{(1)} \right) = \left\langle \hat{S}_g^{(0)}, \hat{S}_g^{(1)} \right\rangle.
$$

In other words, assume that two queries, g and p, are considered. If $\hat{S}_g^{(0)} = \hat{S}_p^{(0)}$ and $\hat{S}_g^{(1)} = \hat{S}_p^{(1)}$, then the query g and the query p are classified into the same HCG.

Recall that the designed hashing function also guarantees the fundamental properties that (1) two components in the same HCG are most likely equivalent, and (2) two equivalent components are definitely in the same HCG. Therefore, we can arbitrarily choose one code to represent its HCG, rather than exhaustively exploring all codes in the HCG, and obtain an approximate result.
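The hashing function can be sketched in a few lines of Python. The code below is an illustration under stated assumptions rather than the program used in the experiments: the names `class_sizes` and `tpoa_hash` are introduced here, codes are 4-tuples of distinct digits, and the two sequences of class sizes are sorted independently into nonincreasing order as described above.

```python
from itertools import permutations

# The 14 legal responses (A, B) of the 4×10 AB game; (3, 1) cannot occur.
RESPONSES = [(a, b) for a in range(5) for b in range(5 - a) if (a, b) != (3, 1)]


def ab_response(code, query):
    """Return (A, B) for two codes without repeated symbols."""
    a = sum(c == q for c, q in zip(code, query))
    b = len(set(code) & set(query)) - a
    return a, b


def class_sizes(codes, query):
    """Sizes of the 14 response classes that the query induces on codes."""
    counts = dict.fromkeys(RESPONSES, 0)
    for code in codes:
        counts[ab_response(code, query)] += 1
    return [counts[r] for r in RESPONSES]


def tpoa_hash(c0, c1, query):
    """Hash value of a query: both size sequences sorted nonincreasingly."""
    s0 = sorted(class_sizes(c0, query), reverse=True)
    s1 = sorted(class_sizes(c1, query), reverse=True)
    return tuple(s0), tuple(s1)


codes = list(permutations(range(10), 4))

# At the root every code is eligible, so any two queries collide.
g, p = (0, 1, 2, 3), (9, 8, 7, 6)
same_group_at_root = tpoa_hash(codes, [], g) == tpoa_hash(codes, [], p)

# After the query g with response [0, 0], repeating g and asking a fresh
# query fall into different HCGs.
c0 = [c for c in codes if ab_response(c, g) == (0, 0)]
c1 = [c for c in codes if ab_response(c, g) != (0, 0)]
same_group_later = tpoa_hash(c0, c1, g) == tpoa_hash(c0, c1, (4, 5, 6, 7))

print(same_group_at_root, same_group_later)
```

Because the hash value is a pair of tuples, it can be used directly as a dictionary key when grouping queries into HCGs.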

Heuristic of evaluation:

In the previous analyses, the height of the game tree has to be minimized so as to obtain the optimal strategy for the game in the worst case. However, it is not obvious how to weigh the number of codes in $C_i^{(0)}$ against the number of codes in $C_i^{(1)}$. Hence, a simple and efficient heuristic, called the “most-parts heuristic” and demonstrated by Barteld [6], is used in TPOA. The most-parts heuristic focuses on the “breadth” over which the eligible secret codes can be spread. In other words, the more classes the eligible secret codes occupy after a query, the more favorable this query is.

Because a state in our problem has two sets, $\langle C_i^{(0)}, C_i^{(1)} \rangle$, the most-parts heuristic sums the number of nonzero entries in $S_g^{(0)}$ and the number of nonzero entries in $S_g^{(1)}$ for a query g. The higher the score is, the better the query is. For example, the query g is better than the query p if the numbers of parts caused by g and p are 24 and 18, respectively.
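The score of the most-parts heuristic is easy to compute from the two sequences of class sizes. In the Python sketch below, the function name `most_parts` and the four size sequences are hypothetical; the sequences are made up so that the scores reproduce the 24-versus-18 example in the text.

```python
def most_parts(sizes0, sizes1):
    """Number of nonempty response classes over both sets of a state."""
    return sum(1 for s in sizes0 if s > 0) + sum(1 for s in sizes1 if s > 0)


# Hypothetical class sizes (14 entries each) for two candidate queries.
candidates = {
    "g": ([1] * 14, [2] * 10 + [0] * 4),            # 14 + 10 = 24 parts
    "p": ([3] * 12 + [0] * 2, [5] * 6 + [0] * 8),   # 12 + 6 = 18 parts
}

scores = {name: most_parts(s0, s1) for name, (s0, s1) in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "-> choose", best)
```

Only the number of nonempty classes matters for this heuristic; the sizes of the classes are ignored.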

5.2.5 Experiment Results of TPOA

Our TPOA-based program was run on a dedicated PC equipped with a 3.16 GHz Intel Core 2 Duo CPU.

To further reduce the running time of TPOA, another technique is implemented as well: during the search, TPOA terminates as soon as it has found a strategy that requires 8 queries in the worst case, which matches the known lower bound. This removes the need to search all possible paths in the search space shown in Figure 19 and results in a shorter running time.

The results are shown in Table 13. In general, the larger the values of k and d are, i.e., the larger the search space is, the fewer queries the resulting strategy requires and the longer the program runs. However, the results in Table 13 do not always follow this trend, because with the above speed-up technique TPOA stops as soon as a strategy requiring 8 queries in the worst case is found. In other words, TPOA stops sooner if the order in which the k queries in each ply are traversed is chosen carefully. In our program, this order is completely determined by the most-parts heuristic, which chooses the k best queries in each ply. The results in Table 13 indicate that the most-parts heuristic performs well, since the run with k = 7 and d = 7 takes only 13.87 minutes, compared with 319.47 and 641.47 minutes for the two runs with k = 5.

Table 13. The upper bound derived by our program

k   d   Number of queries in the worst case   Running time (minutes)
1   1   10                                    3.96
2   6   10                                    21.60
3   3   10                                    28.43
5   4   9                                     319.47
5   5   9                                     641.47
7   7   8                                     13.87

Note that the number of queries, whose value is 8, is obtained by our program when k = 7 and d = 7. This shows that the TPOA can efficiently obtain optimal (or
