Simulation Results: Convergence - Game-Theory-Based Resource Management Mechanisms for Wireless Networks

We study a D2D communication scenario to examine the efficiency of the proposed framework and the T-REX mechanism in allocating RBGs. We consider a system with one BS governing a typical topology of 19 hexagonal service areas. Each service area edge is 100 m long. The system has M dedicated RBGs. Each service area has N D2D pairs, each using a transmission power of 23 dBm. Their locations are uniformly distributed within the area. We initialize the RBG assignment by randomly allocating N RBGs to the N D2D pairs in each service area. For the signal loss model, we apply the D2D outdoor-to-outdoor path loss model and antenna gain suggested in 3GPP LTE-Advanced Release 12 contributions [3].

The preference over RBGs is based on the D2D pairs' measurements of the interference in each RBG. The interference is mainly caused by D2D pairs served in different areas. We assume that the D2D pairs in the same area measure the interference at the same time. Therefore, their measurements do not include the interference from D2D pairs in the same area.

Then, we simulate the system in multiple rounds. In each round, we randomly choose a service area and apply the resource allocation mechanism under simulation. As long as some service area would still alter its allocation, the simulation proceeds to the next round and the process repeats. The loop terminates when no service area would alter its allocation under the simulated mechanism.
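The round-based procedure described above can be sketched as follows. This is a minimal illustration, not the simulator used in this chapter: the callback `apply_mechanism(area)`, its boolean return value (True if the allocation of that area changed), and the `max_rounds` safety cap are all assumptions introduced for the example.

```python
import random


def run_until_converged(num_areas, apply_mechanism, max_rounds=10_000, seed=None):
    """Apply a mechanism to randomly chosen areas until no area changes.

    apply_mechanism(area) is a hypothetical callback that runs one step of
    the mechanism in the given area and returns True if the allocation of
    that area was altered.
    """
    rng = random.Random(seed)
    stable = set()  # areas that declined to change since the last alteration
    rounds = 0
    while len(stable) < num_areas and rounds < max_rounds:
        area = rng.randrange(num_areas)
        rounds += 1
        if apply_mechanism(area):
            # A change alters the interference seen elsewhere, so every
            # area has to be checked again.
            stable.clear()
        else:
            stable.add(area)
    return rounds, len(stable) == num_areas
```

The returned round count corresponds to the number of convergence rounds discussed later; the second value reports whether the loop stopped because every area was stable rather than because the cap was reached.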

[Figure: Average Interference per D2D pair (dBm). Panel (a): Number of D2D Pairs per Area, 6 to 15. Legend entry: Random.]

We measure the average system interference experienced by D2D pairs under the T-REX mechanism with two strategy-proof preferences (RAN and DRAN), as well as the greedy cycle-complete preference (CYC). We compare the T-REX mechanism with the Random, Local Exchange, and Couple Only mechanisms. The Random mechanism, in which all RBGs are randomly assigned to D2D pairs, serves as a baseline with no optimization applied. In the Local Exchange mechanism, D2D pairs exchange only with the RBGs held by other D2D pairs. This can be implemented in a distributed way through the D2D-Triggered mode in our framework or by using the TTCA algorithm. It represents the case in which the eNodeB is generally not involved in the resource re-allocation. In the Couple Only mechanism, an exchange occurs only when both pairs have lower interference immediately after the exchange. This mechanism represents the case in which D2D pairs have very limited information about the preferences of other pairs, so an exchange sequence involving more than two pairs is not possible. We measure the efficiency of each mechanism by the interference experienced by D2D pairs in the RBGs they possess.
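The Couple Only rule can be illustrated with a short sketch. The data layout is an assumption made for the example: `allocation[p]` is the RBG index held by pair `p`, and `interference[p][r]` is the interference in dBm that pair `p` measures on RBG `r` (lower is better). The function name is hypothetical.

```python
def couple_only_exchange(allocation, interference, pair_a, pair_b):
    """Swap the RBGs of two D2D pairs only if both end up better off.

    allocation[p] is the RBG index held by pair p; interference[p][r] is
    the interference (dBm) pair p measures on RBG r. Both layouts are
    assumptions for this sketch. Returns True if the swap was performed.
    """
    rbg_a, rbg_b = allocation[pair_a], allocation[pair_b]
    # Each pair compares the RBG it would receive with the one it holds.
    a_gains = interference[pair_a][rbg_b] < interference[pair_a][rbg_a]
    b_gains = interference[pair_b][rbg_a] < interference[pair_b][rbg_b]
    if a_gains and b_gains:
        allocation[pair_a], allocation[pair_b] = rbg_b, rbg_a
        return True
    return False
```

Because the swap requires a strict improvement for both pairs, an exchange chain through a third pair is never formed, which is the limitation the Couple Only baseline is meant to capture.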

We first evaluate the impact of the number of D2D pairs on the efficiency of the T-REX mechanism. We keep the number of RBGs at 15 and vary the number of D2D pairs from 6 to 15 in the simulation. The simulation results are shown in Fig. 5.5(a). We first observe that the T-REX mechanism significantly outperforms the Random, Local Exchange, and Couple Only mechanisms in terms of average interference. The Couple Only mechanism yields a slight improvement in interference over the Random mechanism, while the Local Exchange mechanism performs better than the Couple Only mechanism.

Nevertheless, all T-REX mechanisms perform much better because unallocated RBGs are granted by the BS to the D2D pairs through traders. We also observe that the T-REX RAN, DRAN, and CYC mechanisms perform similarly. This suggests that when the T-REX mechanism converges, the resulting allocation is very close to (or is) the optimal one. Additionally, the strategy-proof T-REX RAN and DRAN mechanisms perform as well as the non-strategy-proof T-REX CYC mechanism in terms of interference. We also observe that when the number of D2D pairs increases, the interference level increases under all T-REX mechanisms. The increase is due to the decrease in the number of RBGs held in reserve by the BS. Since there are fewer RBGs for exchange, the room for improvement through exchange is smaller. Additionally, as the number of D2D pairs increases, the effect of the trader preference functions becomes insignificant. This also results from the decrease in available RBGs at the BS.

Finally, we simulate with 6 D2D pairs and vary the number of available RBGs from 6 to 15. The results are shown in Fig. 5.5(b). We observe that as the number of available RBGs increases, the interference decreases in all schemes, and the interference reduction achieved by the T-REX mechanism relative to the Random scheme grows.

5.6.2 Convergence Rounds

Next, we compare the average number of convergence rounds of the different mechanisms in the simulations. The results are shown in Fig. 5.6. First, the Couple Only mechanism converges in fewer than 15 rounds in all simulations, since only a limited number of D2D pairs can exchange under this mechanism. For the Local Exchange mechanism, on the other hand, the number of convergence rounds increases to around 40 to 60, which is significantly higher than that of the Couple Only mechanism. In return, its performance is also much better than that of the Couple Only mechanism, as we have seen in Fig. 5.5.

[Figure: Number of Rounds. Panel (a): Number of D2D pairs per Area.]

Then, we observe that T-REX mechanisms with different trader preference functions require different numbers of convergence rounds even though they perform similarly in terms of interference mitigation (Fig. 5.5). T-REX RAN has the largest number of convergence rounds in all simulations. This is because inter-trader exchanges, which occur only in RAN, reduce the probability that a D2D pair obtains its desired RBG from the BS. This significantly slows down convergence. The T-REX DRAN and T-REX CYC mechanisms have similar numbers of convergence rounds. Additionally, the greedy cycle-complete method in the T-REX CYC mechanism leads to fewer convergence rounds, since no random process is involved in the T-REX CYC mechanism.

Figure 5.7: Simulation Results: Prioritization (Interference (dBm) versus D2D Pair ID, for T-REX PRI and T-REX DRAN)

5.6.3 Prioritization using T-REX PRI mechanism

Finally, we illustrate the prioritization effect of using the PRI preference in the T-REX mechanism. We simulate the D2D system with 8 D2D pairs per area and 15 available RBGs. We assume that the D2D pairs in each area are prioritized according to their numbering, that is, D2D pair 1 has the highest priority while D2D pair 8 has the lowest priority in its area. We simulate the T-REX DRAN and PRI mechanisms, and the average interference experienced by D2D pairs with different priorities is shown in Fig. 5.7.

We observe that D2D pairs with higher priorities (lower numbers) indeed experience lower average interference when the T-REX PRI mechanism is applied. For the T-REX DRAN mechanism, there is no significant difference between D2D pairs in terms of average interference. In conclusion, the T-REX mechanism with the PRI preference indeed prioritizes the D2D pairs by offering better RBGs to D2D pairs with higher priorities.

5.7 Related Work

Regarding resource allocation in D2D communications within a cellular system, Yu [73] proposed resource sharing modes and the corresponding closed-form solutions to determine the optimized resource allocation for D2D communication underlying cellular networks. Fodor [7] demonstrated the major difficulties in D2D transmission design with respect to peer discovery and resource allocation. When D2D devices coexist with traditional cellular ones, the resources for D2D devices should be carefully allocated to minimize interference. Along this line, Wang [74] presented a resource sharing scheme allowing D2D UEs to reuse resources from multiple cellular users. Zulhasnine [75] proposed an algorithm to assign D2D devices to shared resource blocks with acceptable interference. Zhu [76] presented an algorithm to maintain tolerable interference among D2D UEs sharing different RBGs. Chen [77] investigated the coexistence of D2D and cellular users given partial frequency reuse. An interference-limited area is proposed to limit mutual interference.

In addition, the D2D transmission may be virtual, relayed by the eNodeB, or direct. The mode selection further complicates the problem. Janis [78] proposed a method to allocate resources and assign transmission modes to D2D devices with limited interference to cellular ones. Belleschi [79] proposed a single-cell formulation for D2D communication, in which a D2D device may adopt the D2D or cellular mode to minimize the overall power. A load-control algorithm was introduced to approximate the optimal solution of the NP-hard formulation. Nevertheless, most existing works consider only centralized schemes in which the eNodeB assigns the radio resources to D2D devices. In such an approach, periodic or on-demand reports of channel status from D2D devices to the eNodeB are required, since the interference perceived by D2D devices is unknown to the eNodeB. However, most of the existing literature does not address the issues caused by the rationality of D2D devices and users, such as truth-telling. As we have illustrated in Section 5.1, rational D2D devices and users may untruthfully report their information and behave maliciously in order to achieve better performance for themselves. When rationality is a concern, the mechanisms proposed above may receive forged information from the D2D devices and therefore be unable to make correct decisions.

Few works tackle the truth-telling problem in D2D communications. Xu [80] formulated a sequential second-price auction for D2D resource allocation. Users' payoff is maximized and the system sum rate is improved using the proposed resource allocation algorithm. Nevertheless, such approaches require a monetary transfer process, which significantly increases the complexity of implementation in the cellular network. It may be preferable to have a direct mechanism involving no payment process.

5.8 Summary

In this chapter, we proposed a novel resource-exchange-based D2D resource allocation framework for an LTE-Advanced system. We showed that the convergence of any algorithm in the framework is guaranteed when all performed exchanges are beneficial. Based on the idea of beneficial exchange, we proposed the Trader-assisted Resource Exchange (T-REX) mechanism. The T-REX mechanism identifies the beneficial exchanges by analysing the corresponding exchange graph. The eNodeB participates in the exchange process by designing the trader preference functions. This design is critical to the convergence speed, as has been shown in the simulations. Through game-theoretic analysis, we also proved that when the trader preference functions are properly designed, the T-REX mechanism is strategy-proof. This prevents the eNodeB from receiving forged CQI reports from rational D2D devices and users. Finally, we evaluated the performance of the T-REX mechanism through simulations. The simulations with the parameters suggested in the latest 3GPP technical contribution showed that the T-REX mechanism significantly mitigates the interference experienced by D2D devices.

Chapter 6 Chinese Restaurant Game: Social Learning vs. Network Externality

6.1 Introduction

How agents in a network learn and make decisions is an important issue in numerous research fields, such as social learning in social networks, machine learning with communications among devices, and cognitive adaptation in cognitive radio networks. Agents make decisions in a network in order to achieve certain objectives. However, an agent's knowledge of the system may be limited due to his limited ability to observe or to external uncertainty in the system. This impairs his utility, since he does not have enough knowledge to make correct decisions. The limited knowledge of an agent can be expanded through learning. An agent may learn from various information sources, such as the decisions of other agents, advertisements from brands, or his experience in previous purchases. In most cases, the accuracy of the agent's decision can be greatly enhanced by learning from the collected information.

6.1.1 Traditional Social Learning

The learning behavior in a social network is a popular topic in the literature. Let us consider a social network in an uncertain system state. The state has an impact on the agents' rewards. When the impact is differential, i.e., one action results in a higher reward than other actions in one state but not in all states, the state information becomes critical for an agent to make the correct decision. In most of the social learning literature, the state information is unknown to agents. Nevertheless, some signals related to the system state are revealed to the agents. Then, the agents make their decisions sequentially, while their actions/signals may be fully or partially observed by other agents. Most existing works [81--84] study how the beliefs of agents are formed through learning in the sequential decision process, and how accurate the beliefs become as more information is revealed.

One popular assumption in the traditional social learning literature is that there is no network externality, i.e., the actions of subsequent agents do not influence the reward of the former agents. In such a case, agents make their decisions purely based on their own beliefs without considering the actions of subsequent agents. This assumption greatly limits the potential applications of these existing works.

6.1.2 Network Externality

The network externality, i.e., the influence of other agents' behaviors on one agent's reward, is a classic topic in economics. How the relations among agents influence an agent's behavior is studied in coordination game theory [85]. When the network externality is positive, the problem can be modeled as a coordination game: when one agent makes a decision, the subsequent agents are encouraged to make the same decision for two reasons: the probability that this action has a positive outcome increases due to this agent's decision, and the potential reward of this action may be large according to the belief of this agent.

When the externality is negative, it becomes an anti-coordination game, where agents try to avoid making the same decisions as others [86--88]. The negative network externality plays an important role in many applications in different research fields. One important application is spectrum access in cognitive radio networks. In the spectrum access problem, secondary users accessing the same spectrum need to share it with each other. The more secondary users access the same channel, the less access time is available to each of them, or the higher the interference each of them experiences. In this case, the negative network externality degrades the utility of the agents making the same decision. As illustrated in [89], the interference from other secondary users degrades a secondary user's transmission quality and can be considered as the negative network externality effect. Therefore, the agents should take into account the possibility of degraded utility when making their decisions. Similar characteristics can also be found in other applications, such as service selection in cloud computing and deal selection on the Groupon website.

The combination of negative network externality with social learning is difficult to analyze. When the network externality is negative, the game becomes an anti-coordination game, where an agent seeks the strategy that differs from others' to maximize his own reward. Nevertheless, in such a scenario, the agent's decision also contains some information about his belief on the system state, which can be learned by subsequent agents through social learning algorithms. Subsequent agents may then realize that his choice is better than the others and make the same decision as he did. Since the network externality is negative, the information leaked by the agent's decision may impair the reward the agent can obtain. Therefore, rational agents should take into account the possible reactions of subsequent players to maximize their own rewards.

6.1.3 Chinese Restaurant Game

The Chinese restaurant process, a non-parametric learning method in machine learning [90], provides an interesting non-strategic learning method for an unbounded number of objects. In the Chinese restaurant process, there exists an infinite number of tables, where each table has an infinite number of seats. An infinite number of customers enter the restaurant sequentially. When a customer enters the restaurant, he can choose either to share a table with other customers or to open a new table, with the probabilities being predefined by the process. Generally, if a table is occupied by more customers, then a new customer is more likely to join that table, and the probability that a customer opens a new table can be controlled by a parameter [91]. This process provides a systematic method to construct the parameters for modeling unknown distributions. Nevertheless, the behavior of customers in the Chinese restaurant process is non-strategic, which means they follow predefined rules without rational concerns about their own utility. We observe that if we introduce strategic behaviors into the Chinese restaurant process, the model can be a general framework for analyzing social learning with negative network externality. To the best of our knowledge, no effort has been made in the literature to bring rationality concerns into such a decision-making structure.
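For illustration, the non-strategic Chinese restaurant process can be simulated in a few lines. The sketch below uses the standard one-parameter form of the process, in which a customer arriving after n others joins an occupied table k with probability n_k / (n + alpha) and opens a new table with probability alpha / (n + alpha). The symbol alpha and the function name are introduced here for the example and are not notation from this chapter.

```python
import random


def chinese_restaurant_process(num_customers, alpha, seed=None):
    """Seat customers one by one and return the occupancy of each table.

    alpha > 0 is the concentration parameter (an assumed name): larger
    values make opening a new table more likely.
    """
    rng = random.Random(seed)
    table_counts = []
    for n in range(num_customers):  # n customers are already seated
        # Occupied table k has weight table_counts[k]; a new table has
        # weight alpha, so the total weight is n + alpha.
        r = rng.uniform(0, n + alpha)
        cumulative = 0.0
        chosen = None
        for k, count in enumerate(table_counts):
            cumulative += count
            if r < cumulative:
                chosen = k
                break
        if chosen is None:
            table_counts.append(1)  # open a new table
        else:
            table_counts[chosen] += 1
    return table_counts
```

Customers here follow the predefined probabilities only; no customer reasons about his own utility, which is exactly the element the proposed game adds.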

By introducing strategic behavior into the non-strategic Chinese restaurant process, we propose a new game, called the Chinese Restaurant Game, to formulate the social learning problem with negative network externality. In our previous work [92], we studied the simultaneous Chinese restaurant game without social learning, where customers make decisions simultaneously. In this chapter, we study the sequential Chinese restaurant game with social learning, where customers make decisions sequentially. Let us consider a Chinese restaurant with J tables. There are N customers sequentially requesting seats at these J tables for having their meals. Each customer may request one of the tables by its number. After requesting, he is seated at the table he requested. We assume that all customers are rational, i.e., they prefer a bigger space for a comfortable dining experience. Thus, a customer may be delighted if he has a bigger table. However, since all tables are available to all customers, he may need to share the table with others if multiple customers request the same table. In such a case, the customer's dining space is reduced, which impairs the dining experience. Therefore, the key issue in the proposed Chinese restaurant game is how the customers choose the tables to enhance their own dining experience. This model involves negative network externality, since a customer's dining experience is impaired when others share the same table with him. Moreover, when the table size is unknown to the customers, but each of them receives some signals related to the table size, this game involves a learning process if customers can observe previous actions or signals.
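To make the strategic aspect concrete, the following sketch solves a small instance of the sequential game with known table sizes by backward induction. It rests on an illustrative assumption that is not taken from this chapter: a customer's utility is the size of his table divided by the number of customers finally seated at it. Ties are broken in favor of the lowest table index, and the function name is hypothetical. The enumeration is exponential in the number of customers, so it is only meant for toy examples.

```python
from functools import lru_cache


def sequential_table_choice(table_sizes, num_customers):
    """Backward induction for a small sequential seating game.

    Assumed utility: table size / final number of customers at the table.
    Returns (choices, final_counts, utilities), with customers listed in
    order of arrival.
    """
    sizes = tuple(table_sizes)
    num_tables = len(sizes)

    @lru_cache(maxsize=None)
    def solve(customer, counts):
        # Returns the choices of customers customer..N-1 and the final counts.
        if customer == num_customers:
            return (), counts
        best = None
        for table in range(num_tables):
            next_counts = counts[:table] + (counts[table] + 1,) + counts[table + 1:]
            later_choices, final_counts = solve(customer + 1, next_counts)
            # The customer anticipates how later customers will react.
            utility = sizes[table] / final_counts[table]
            if best is None or utility > best[0]:
                best = (utility, (table,) + later_choices, final_counts)
        return best[1], best[2]

    choices, final_counts = solve(0, (0,) * num_tables)
    utilities = [sizes[t] / final_counts[t] for t in choices]
    return list(choices), list(final_counts), utilities
```

With table sizes (4, 3) and two customers, the first customer takes the larger table and the second prefers the smaller table alone over sharing, so the first mover ends up with the higher utility under this assumed utility function.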

In the rest of this chapter, we first provide a detailed description of the system model of the Chinese restaurant game in Section 6.2. Then, we study the sequential game model with perfect information to illustrate the advantage of playing first in Section 6.3. In Section 6.4, we show the general Chinese restaurant game framework by analyzing the learning behaviors of customers under negative network externality and an uncertain system state. We provide a recursive method to construct the best response for customers, and discuss the simulation results in Section 6.5. In Section 6.6, we illustrate how the traditional spectrum access problem can be formulated as a Chinese restaurant game. Finally, we summarize this chapter in Section 6.8.

6.2 System Model

Let us consider a Chinese restaurant with J tables numbered 1, 2, ..., J and N customers labeled 1, 2, ..., N. Each customer requests one table for having a meal. Each table has infinitely many seats, but the tables may differ in size. We model the table sizes
