The main content of the thesis is divided into three parts. Part I (Chapters 2 and 3) reviews the fundamental mathematical tools and provides a survey of cognitive radio networks. Part II (Chapters 4 and 5) provides two application examples of fully distributed learning in distributed resource management. Part III (Chapters 6 and 7) studies distributed learning with partial cooperation. The following is an overview of each chapter.
Chapter 2. This chapter introduces the concepts used throughout the thesis, together with the underlying mathematics. The basic ideas of game theory are reviewed first. The resource management problem is formulated as a non-cooperative game, and the existence and multiplicity of Nash equilibrium (NE) solutions are investigated for two different network models. The second part of the chapter explains the stochastic learning algorithm (SLA) in detail: we give the structure of SLA, present several update rules, and show that, under certain conditions, SLA converges to an NE.
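As a rough illustration of the kind of update rule discussed in Chapter 2, the following minimal sketch uses the linear reward-inaction rule, a standard update in the learning-automata literature. It is not necessarily one of the rules used in the thesis. The toy resource-sharing game, the reward definition, and all function and parameter names are assumptions made only for this example.

```python
import random

def lri_update(probs, action, reward, step):
    """Linear reward-inaction update.

    Moves the mixed strategy toward the action just played, in proportion
    to the normalized reward (assumed to lie in [0, 1]).
    """
    return [
        p + step * reward * ((1.0 if k == action else 0.0) - p)
        for k, p in enumerate(probs)
    ]

def run_sla(num_players=4, num_actions=2, step=0.05, rounds=5000, seed=0):
    """Toy game (hypothetical): players sharing a resource split it equally."""
    rng = random.Random(seed)
    # Every player starts from the uniform mixed strategy.
    probs = [[1.0 / num_actions] * num_actions for _ in range(num_players)]
    for _ in range(rounds):
        # Each player samples an action from its own mixed strategy.
        actions = [rng.choices(range(num_actions), weights=p)[0] for p in probs]
        load = [actions.count(k) for k in range(num_actions)]
        for i, a in enumerate(actions):
            # Each player observes only its own reward, never others' actions.
            reward = 1.0 / load[a]
            probs[i] = lri_update(probs[i], a, reward, step)
    return probs

if __name__ == "__main__":
    for i, p in enumerate(run_sla()):
        print(i, [round(x, 3) for x in p])
```

Each player updates its strategy from its own action-reward history only, which is the fully distributed setting considered in Part II.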
Chapter 3. The first three examples in this thesis all concern spectrum access in cognitive radio networks (CRNs). Before turning to the examples, we therefore devote a chapter to reviewing previous work on CRNs. Spectrum access in CRNs is classified into different models according to how the spectrum is granted to the secondary users, and the representative works for each model are summarized.
Chapter 4. This chapter presents the first application: the network selection problem in cognitive heterogeneous networks (HetNets), where multiple radio access technologies (RATs) coexist. We formulate the network selection problem as a non-cooperative game in which the secondary users (SUs) are the players. In particular, under a cognitive access scenario, the availability of channels for SUs depends on the traffic demands of the primary users (PUs) and is treated as a time-varying external state. With a suitably designed utility function, we prove that the game is an ordinal potential game (OPG). SLA is adopted, and each SU's strategy progressively evolves toward the Nash equilibrium (NE) based on its own action-reward history, without needing to know the actions of other SUs. The convergence property and the performance in terms of throughput and fairness are shown through simulations.
Chapter 5. As the second application example of SLA, this chapter studies spectrum trading in CRNs. Unlike in the first example, the licensed spectrum opportunities are now sold to multiple unlicensed secondary users by multiple service providers. The spectrum trading is modeled as a multi-leader multi-follower Stackelberg game with two levels of competition. In the lower-level competition, each secondary user selects a service provider with time-varying channel availability. The service selection is determined by the prices and the quality of service, which depends on the number of residual channels and the behavior of other secondary users. In the upper-level competition, service providers adjust their pricing strategies to maximize their individual revenues. We further propose decentralized, stochastic learning-based algorithms for both levels, in which a player's strategy progressively evolves toward the Nash equilibrium (NE) based on its own action-reward history, without information about other players' actions. The convergence of the proposed algorithms toward NE points is verified theoretically and numerically. The proposed method achieves good utility and fairness performance for the secondary users compared with other service selection schemes.
Chapter 6. The third example considers channel assignment in OFDMA-based two-tier distributed networks. The secondary users are modeled as the players, and each player's strategy is its channel assignment. There are two major differences from the previous examples. First, unlike the previous examples, where a resource unit is granted by the owner to a specific user, here all users access the same spectrum, and an interference mitigation game is formed on top of this shared access. Second, each player is allowed to know the actions of its neighbors. In this way, a proper utility function can be defined, and the channel assignment problem is formulated as an ordinal potential game, which has at least one pure-strategy Nash equilibrium (NE). The stochastic learning algorithm discussed in Chapter 2 is then applied. Convergence toward pure-strategy NE points is verified through system-level simulations. In addition, the proposed algorithm is evaluated by comparing its performance with that of other methods.
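To make the potential-game argument concrete, the following minimal sketch uses a deliberately simplified, hypothetical utility: each player's utility is the negative number of its neighbors on the same channel. This is not the utility function of the thesis. With this utility, the negative count of conflicting edges is an exact (and hence ordinal) potential, so every strictly improving move raises the potential and best-response dynamics must stop at a pure-strategy NE. Best-response dynamics is used here only for illustration and is not the SLA of Chapter 2. All names are assumptions for the example.

```python
def conflicts(i, channel, assignment, neighbors):
    """Number of neighbors of player i that use `channel`."""
    return sum(1 for j in neighbors[i] if assignment[j] == channel)

def potential(assignment, neighbors):
    """Negative count of conflicting edges (each edge counted once)."""
    return -sum(
        1
        for i in neighbors
        for j in neighbors[i]
        if i < j and assignment[i] == assignment[j]
    )

def best_response_dynamics(neighbors, num_channels, assignment):
    """Let players switch channels until no one can strictly improve."""
    assignment = dict(assignment)
    improved = True
    while improved:
        improved = False
        for i in neighbors:
            current = conflicts(i, assignment[i], assignment, neighbors)
            best = min(
                range(num_channels),
                key=lambda c: conflicts(i, c, assignment, neighbors),
            )
            if conflicts(i, best, assignment, neighbors) < current:
                assignment[i] = best  # strict improvement raises the potential
                improved = True
    return assignment

def is_pure_ne(assignment, neighbors, num_channels):
    """True if no player can reduce its conflicts by a unilateral switch."""
    return all(
        conflicts(i, assignment[i], assignment, neighbors)
        <= conflicts(i, c, assignment, neighbors)
        for i in neighbors
        for c in range(num_channels)
    )

if __name__ == "__main__":
    # Hypothetical topology: four players on a ring, two channels.
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    start = {i: 0 for i in ring}
    final = best_response_dynamics(ring, 2, start)
    print(final, potential(final, ring), is_pure_ne(final, ring, 2))
```

Note that each player needs only its neighbors' actions to evaluate its utility, which matches the partial-cooperation setting of Part III.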
Chapter 7. The last example addresses joint processing and distributed channel assignment in network MIMO systems. Cooperative frequency reuse among base stations (BSs) can improve system spectral efficiency by reducing intercell interference (ICI) through channel selection and precoding. We present a game-theoretic study of channel selection for realizing network MIMO operation under time-varying wireless channels. We propose a new joint precoding scheme with enhanced interference mitigation and capacity improvement abilities for network MIMO systems. We formulate the channel selection problem as a non-cooperative game with the BSs as the players, and show that the game is an exact potential game (EPG) under the proposed utility function. A decentralized, stochastic learning-based algorithm is proposed in which each BS progressively moves toward the Nash equilibrium (NE) strategy based on its own action-reward history rather than on the actions taken by others. The convergence of the proposed learning algorithm toward a pure-strategy NE point is shown theoretically and verified numerically for different network topologies. Extensive link-level simulations also show that the proposed learning algorithm achieves good capacity and fairness performance compared with other schemes.