6.1.3 Chinese Restaurant Game

The Chinese restaurant process, a non-parametric learning method in machine learning [90], provides an interesting non-strategic learning method for an unbounded number of objects. In the Chinese restaurant process, there is an infinite number of tables, each with an infinite number of seats, and an infinite number of customers enter the restaurant sequentially. When a customer enters, he either shares a table with other customers or opens a new table, with probabilities predefined by the process. Generally, the more customers a table already holds, the more likely a new customer is to join it, and the probability that a customer opens a new table is controlled by a parameter [91]. This process provides a systematic method to construct the parameters for modeling unknown distributions. Nevertheless, the behavior of customers in the Chinese restaurant process is non-strategic: they follow predefined rules without rational concern for their own utility. We observe that if strategic behavior is introduced into the Chinese restaurant process, the model becomes a general framework for analyzing social learning with negative network externality. To the best of our knowledge, no effort has been made in the literature to bring rationality concerns into such a decision-making structure.
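The non-strategic seating rule described above can be illustrated with a minimal sketch. It assumes the standard form of the Chinese restaurant process, in which a customer arriving after n others joins an occupied table with probability proportional to its occupancy and opens a new table with probability α / (n + α). The name `alpha` for the controlling parameter, the function name, and the `seed` argument are illustrative choices, not notation from this chapter.

```python
import random


def chinese_restaurant_process(num_customers, alpha, seed=0):
    """Seat customers one by one and return the occupancy of each opened table.

    ASSUMPTION: standard CRP rule with concentration parameter alpha > 0.
    """
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers currently at table k
    for _ in range(num_customers):
        # Occupied table k has weight tables[k]; a new table has weight alpha.
        # Normalizing by (n + alpha), with n customers already seated,
        # gives the seating probabilities.
        weights = tables + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(tables):
            tables.append(1)      # open a new table
        else:
            tables[choice] += 1   # join an occupied table
    return tables


if __name__ == "__main__":
    print(chinese_restaurant_process(num_customers=50, alpha=1.0))
```

Crowded tables attract further customers under this rule, and no customer weighs his own utility. This is the behavior that the strategic game proposed below replaces.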

By introducing strategic behavior into the non-strategic Chinese restaurant process, we propose a new game, called the Chinese Restaurant Game, to formulate the social learning problem with negative network externality. In our previous work [92], we studied the simultaneous Chinese restaurant game without social learning, where customers make decisions simultaneously. In this chapter, we study the sequential Chinese restaurant game with social learning, where customers make decisions sequentially. Let us consider a Chinese restaurant with J tables. There are N customers who sequentially request seats at these J tables to have their meals. Each customer requests one table by its number and, after requesting, is seated at that table. We assume that all customers are rational, i.e., they prefer more space for a comfortable dining experience. Thus, a customer is delighted if he has a bigger table. However, since all tables are available to all customers, he may need to share the table with others if multiple customers request the same table. In such a case, the customer's dining space shrinks and his dining experience is impaired. Therefore, the key issue in the proposed Chinese restaurant game is how customers choose tables to enhance their own dining experience. This model involves negative network externality, since a customer's dining experience is impaired when others share the same table with him. Moreover, when the table sizes are unknown to the customers but each of them receives some signal related to the table sizes, the game involves a learning process if customers can observe previous actions or signals.

In the rest of this chapter, we first provide a detailed description of the system model of the Chinese restaurant game in Section 6.2. Then, we study the sequential game model with perfect information to illustrate the advantage of playing first in Section 6.3. In Section 6.4, we present the general Chinese restaurant game framework by analyzing the learning behaviors of customers under negative network externality and an uncertain system state. We provide a recursive method to construct the best response for customers, and discuss the simulation results in Section 6.5. In Section 6.6, we illustrate how the traditional spectrum access problem can be formulated as a Chinese restaurant game. Finally, we summarize this chapter in Section 6.8.

6.2 System Model

Let us consider a Chinese restaurant with J tables numbered 1, 2, ..., J and N customers labeled 1, 2, ..., N. Each customer requests one table at which to have a meal. Each table has infinitely many seats, but the tables may differ in size. We model the table sizes of a restaurant with two components: the restaurant state θ and the table size functions {R_1(θ), R_2(θ), ..., R_J(θ)}. The state θ represents an objective parameter, which may change when the restaurant is remodeled. The table size functions R_j(θ) are fixed, i.e., the functions {R_1(θ), R_2(θ), ..., R_J(θ)} remain the same every time the restaurant is remodeled. An example of θ is the order of the existing tables. Suppose that the restaurant has two tables, one of size L and the other of size S. Then, the owner may choose to number the large one as table 1 and the small one as table 2, or the reverse. The decision on the numbering can be modeled as θ ∈ {1, 2}, while the table size functions R_1(θ) and R_2(θ) are given by R_1(1) = L, R_1(2) = S, R_2(1) = S, and R_2(2) = L. Let Θ be the set of all possible states of the restaurant. In this example, Θ = {1, 2}.
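The two-table example above can be written down directly as a lookup table. In this minimal sketch the numeric values of L and S, and the names `TABLE_SIZE`, `STATES`, and `table_size`, are hypothetical placeholders; only the mapping R_1(1) = L, R_1(2) = S, R_2(1) = S, R_2(2) = L comes from the text.

```python
# Hypothetical sizes of the large and the small table (illustrative values only).
L, S = 10.0, 4.0

# State space Θ of the example: the two possible numberings of the tables.
STATES = (1, 2)

# TABLE_SIZE[j][theta] encodes the table size function R_j(θ).
TABLE_SIZE = {
    1: {1: L, 2: S},  # R_1(1) = L, R_1(2) = S
    2: {1: S, 2: L},  # R_2(1) = S, R_2(2) = L
}


def table_size(j, theta):
    """Return R_j(θ): the size of table j when the restaurant is in state θ."""
    return TABLE_SIZE[j][theta]


if __name__ == "__main__":
    for theta in STATES:
        print(theta, [table_size(j, theta) for j in (1, 2)])
```

The functions themselves never change. Only θ selects which table is the large one, which matches the statement that the R_j(θ) are fixed across remodelings.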

6.2.1 Chinese Restaurant Game

We formulate the table selection problem as a game, called the Chinese Restaurant Game. We first denote X = {1, ..., J} as the action set (tables) from which a customer may choose, where x_i ∈ X means that customer i chooses table x_i for a seat. Then, the utility function of customer i is given by U(R_{x_i}, n_{x_i}), where n_{x_i} is the number of customers choosing table x_i. According to our previous discussion, the utility function should be an increasing function of R_{x_i} and a decreasing function of n_{x_i}. Note that the decreasing characteristic of U(R_{x_i}, n_{x_i}) over n_{x_i} can be regarded as the negative network externality effect, since the degradation of the utility is due to the joining of other customers. Finally, let n = (n_1, n_2, ..., n_J) be the numbers of customers at the J tables, i.e., the grouping of customers in the restaurant.
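To make the notation concrete, the following minimal sketch computes the grouping n from a list of chosen tables and evaluates each customer's utility. The text only requires U to be increasing in the table size and decreasing in the number of customers, so the equal-sharing form U(R, n) = R / n used here is an assumption for illustration. The function names are likewise hypothetical.

```python
from collections import Counter


def grouping(actions, num_tables):
    """Return n = (n_1, ..., n_J): how many customers chose each table."""
    counts = Counter(actions)
    return [counts.get(j, 0) for j in range(1, num_tables + 1)]


def utility(size, n):
    """ASSUMED utility U(R, n) = R / n: the table is shared equally.

    Any function increasing in size and decreasing in n would fit the model.
    """
    return size / n


def customer_utilities(actions, sizes):
    """Utility of each customer i, given actions x_i and sizes[j-1] = R_j(θ)."""
    n = grouping(actions, len(sizes))
    return [utility(sizes[x - 1], n[x - 1]) for x in actions]


if __name__ == "__main__":
    # Three customers, two tables of sizes 10 and 4; two customers share table 1.
    print(grouping([1, 1, 2], 2))                   # [2, 1]
    print(customer_utilities([1, 1, 2], [10, 4]))   # [5.0, 5.0, 4.0]
```

In this example the two customers at the larger table each end up with 5 units of space. A third customer joining them would leave each with 10/3, less than the 4 units available at the small table, which is the negative network externality at work.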

As mentioned above, the restaurant is in a state θ ∈ Θ. However, customers may not know the exact state θ, i.e., they may not know the exact size of each table before requesting. Instead, they may have received some advertisements or gathered some reviews about the restaurant. This information can be treated as signals related to the true state of the restaurant. In such a case, customers can estimate θ from the available information, i.e., the information they know and/or gather during the game. Therefore, we assume that all customers know the prior distribution of the state θ, which is denoted as g_0 = {g_{0,l} | g_{0,l} = Pr(θ = l), ∀l ∈ Θ}. The signal s_i ∈ S received by customer i is generated from a predefined distribution f(s|θ). Notice that the signal quality may vary, depending on how accurately the signal reflects the state. A simple example is as follows. Consider a signal space S = {1, 2} and a system state space Θ = {1, 2}. Then, we define the signal distribution as follows:

Pr(s = θ | θ) = p,   Pr(s ≠ θ | θ) = 1 − p,   0.5 ≤ p ≤ 1.   (6.1)

In such a case, the parameter p is the signal quality of this signal distribution. When p is higher, the signal is more likely to reflect the true system state.
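The binary signal model in (6.1) can be sketched as follows, together with one way a customer might combine his own signal with the prior g_0. The posterior step is ordinary Bayes' rule, Pr(θ = l | s) ∝ g_{0,l} · f(s | θ = l). It is shown here only as an illustration of estimating θ from a single signal and is not the chapter's full learning procedure. The function names and the dictionary representation of the prior are assumptions.

```python
import random


def draw_signal(theta, p, rng):
    """Sample s from (6.1): s equals θ with probability p, the other state otherwise."""
    other = 2 if theta == 1 else 1
    return theta if rng.random() < p else other


def posterior(prior, signal, p):
    """Bayes update of the belief over Θ = {1, 2} after observing one signal.

    prior: dict mapping each state l to g_{0,l}.
    ASSUMPTION: the customer uses only the prior and his own signal.
    """
    unnormalized = {
        l: prior[l] * (p if signal == l else 1.0 - p) for l in prior
    }
    total = sum(unnormalized.values())
    return {l: v / total for l, v in unnormalized.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    s = draw_signal(theta=1, p=0.8, rng=rng)
    print(s, posterior({1: 0.5, 2: 0.5}, s, p=0.8))
```

With a uniform prior and p = 0.8, a customer who observes s = 1 believes θ = 1 with probability about 0.8. With p = 0.5 the signal carries no information and the posterior equals the prior.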
