Chapter 2
Situation-Aware Data Access Manager Using Fuzzy Q-learning Technique for Multi-cell WCDMA Systems
The WCDMA cellular system supports integrated services with mixed QoS (quality of service) requirements: real-time services require continuous transmission and are intolerant of time delay, while non-real-time services require bursty transmission and tolerate moderate time delay. Adequate radio resource management (RRM) is required to maximize the system capacity and fulfill the complementary QoS requirements. Among the many traffic engineering techniques for RRM, a call admission control method is applied to prevent system overloading, based on the long-term availability of radio resources. On the other hand, a data access control scheme grants bursty transmission permission to non-real-time services, based on the short-term availability of radio resources.
The main purpose of the data access control scheme in WCDMA systems supporting integrated services is to maximize the throughput of non-real-time services while maintaining the transmission quality of real-time services [1]-[5]. To achieve this goal, dynamic access probability schemes [2]-[4] and a base station-controlled scheduling scheme [5] have been proposed.
In these schemes, the residual system capacity for non-real-time services is first estimated and then shared among non-real-time terminals. A single-cell environment was considered in [2]-[4], while a multi-cell environment was studied in [5]. The multi-cell scheme [5] treats the interference generated by other-cell terminals as if it came from several home-cell terminals, and consequently regards the multi-cell environment as a single-cell environment. However, the mutual influence of radio resource allocation among cells in the multi-cell environment is still not considered. Notably, in the multi-cell WCDMA system, an increase in data transmission power in one cell raises the interference level in the adjacent cells. If each cell allocates its entire residual capacity to bursty transmission without considering the interference from adjacent cells, then the system becomes overloaded.
The overloading phenomenon could be alleviated by an appropriate coordination method among cells [6]. Knowing the radio resources of all cells, a centralized data access method for the multi-cell WCDMA system can maximize the system throughput by applying a global optimization method. Unfortunately, the coordination procedure takes a long time to exchange the resource information between cells, making practical implementation infeasible. Because the data access control scheme usually operates on a short time scale, e.g., the frame time, distributed schemes are preferable. Kumar and Nanda [7] proposed a distributed scheme called load and interference-based demand assignment (LIDA). LIDA is a resource reservation-based scheme that reserves some resources in each cell against interference variation.
Additionally, LIDA uses the concept of a burst admission threshold for high-rate transmission in a cell to avoid excess interference power to adjacent cells, allowing bursty transmission only when the strength difference between the received pilot signals from the home cell and adjacent cells is larger than the threshold. The effectiveness of this scheme relies on the selection of the reservation threshold, which should be dynamically chosen according to the system loading and the received interference power level.
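The burst admission test described above can be illustrated with a minimal sketch. It is not taken from [7]: the function name `burst_allowed`, the use of dB units, and the requirement that the margin hold for every adjacent cell are assumptions made for illustration only.

```python
def burst_allowed(home_pilot_db, adjacent_pilots_db, threshold_db):
    """Return True when the home pilot exceeds every adjacent pilot by more than threshold_db.

    Assumption: pilot strengths are in dB, and the margin must hold for
    all adjacent cells (the text does not specify how several adjacent
    cells are combined).
    """
    return all(home_pilot_db - p > threshold_db for p in adjacent_pilots_db)


# A terminal near the cell center: adjacent pilots are much weaker.
center = burst_allowed(-70.0, [-85.0, -90.0], threshold_db=10.0)
# A terminal near the cell edge: one adjacent pilot is only 5 dB weaker.
edge = burst_allowed(-70.0, [-75.0, -90.0], threshold_db=10.0)
```

Under these assumptions, a terminal close to a cell boundary is refused high-rate bursts because its transmission would add noticeable interference to the neighboring cell.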
A rate scheduling scheme is also embedded in the data access control scheme to allocate the residual capacity to non-real-time terminals according to a service principle.
Ramakrishna and Holtzman adopted a throughput maximization criterion for the scheduling scheme [8]. This criterion can maximize the system throughput, but may cause low-class users to suffer from starvation. Alternatively, Jalali, Padovani, and Pankaj proposed a proportional fairness criterion [9] for a downlink scheduling scheme in a CDMA-HDR (high data rate) system. Their scheme defines a utility function as the ratio of the supported data rate to the average data rate. The supported data rate is determined by the channel condition, while the average data rate is calculated as the window average of the transmitted throughput.
The terminal with the highest utility value transmits data in the next frame time. This algorithm may lead to a large transmission delay for some terminals. Additionally, Shakkottai and Stolyar proposed an exponential rule criterion [10] as another definition of the utility function, to strike a good balance between the system throughput and the transmission delay. However, applying the exponential rule to uplink transmission should take the terminal's location into account to minimize interference with adjacent cells.
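The proportional fairness criterion described above can be sketched as follows. This is a minimal illustration, not the algorithm as specified in [9]: the function names, the `window` parameter, and the use of an exponentially weighted moving average as the "window average" are assumptions.

```python
def pf_select(supported_rates, avg_rates, eps=1e-9):
    """Return the index of the terminal with the highest supported/average rate ratio."""
    utilities = [r / max(a, eps) for r, a in zip(supported_rates, avg_rates)]
    return max(range(len(utilities)), key=utilities.__getitem__)


def update_averages(avg_rates, served, served_rate, window=100):
    """Update each terminal's average rate after one frame.

    Assumption: the window average is approximated by an exponential
    moving average with time constant `window` frames; unserved
    terminals are credited a rate of zero for this frame.
    """
    w = 1.0 / window
    return [
        (1.0 - w) * a + w * (served_rate if i == served else 0.0)
        for i, a in enumerate(avg_rates)
    ]


# Terminal 0 has the better channel, but terminal 1 has been served far less.
chosen = pf_select([100.0, 50.0], [100.0, 10.0])
new_avg = update_averages([10.0, 10.0], served=1, served_rate=50.0, window=10)
```

In this example the utilities are 1.0 and 5.0, so terminal 1 is selected even though its supported rate is lower. The criterion contains no delay term, which is consistent with the remark that some terminals may experience large transmission delay.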
This chapter proposes a situation-aware data access manager using the fuzzy Q-learning technique (FQ-SDAM) for multi-cell WCDMA systems. The proposed FQ-SDAM scheme consists of two parts: a fuzzy Q-learning-based residual capacity estimator (FQ-RCE) and a data rate scheduler (DRS). The FQ-RCE uses fuzzy Q-learning to estimate the appropriate situation-dependent residual system capacity, in terms of interference power, for non-real-time services, while the DRS assigns transmission rates to non-real-time terminals by a modified exponential rule.
The fuzzy inference system (FIS) and the reinforcement learning technique have been separately applied to solve network resource management problems [11]-[14]. A fuzzy resource allocation controller was proposed in [12], where the FIS method was adopted to estimate the resource availability. A reinforcement learning technique, Q-learning, was applied to dynamic channel assignment in [13] and to multi-rate transmission control in [14] for wireless communication systems. By learning from the system environment, the Q-learning technique can converge to a pre-defined optimal control target. In [15], Jouffe proposed a reinforcement learning technique for FIS, called fuzzy Q-learning (FQL). The FQL technique combines the advantages of FIS and reinforcement learning. The FIS provides a good function approximation for the FQL, which enables a priori knowledge to be applied to the system design. Additionally, reinforcement learning provides a model-free approach to reaching a control target. By applying the FQL technique, the radio resource can be managed under partial, uncertain information, and the optimal resource management can be reached incrementally.
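To make the combination of FIS and Q-learning concrete, the following is a minimal sketch of one generic fuzzy Q-learning step. It is not the FQ-RCE design of this chapter. It assumes that each fuzzy rule keeps a q-value for every candidate action, that `alpha` holds normalized rule firing strengths (summing to one), that exploration is epsilon-greedy, and that eligibility traces are omitted; all function and parameter names are hypothetical.

```python
import random


def select_actions(q, epsilon=0.1, rng=random):
    """Pick one candidate-action index per rule (epsilon-greedy; an assumption)."""
    picks = []
    for row in q:
        if rng.random() < epsilon:
            picks.append(rng.randrange(len(row)))
        else:
            picks.append(max(range(len(row)), key=row.__getitem__))
    return picks


def global_action(alpha, picks, actions):
    """Blend the per-rule actions, weighted by the rule firing strengths."""
    return sum(a * actions[k] for a, k in zip(alpha, picks))


def fql_update(q, alpha, picks, reward, alpha_next, gamma=0.9, eta=0.1):
    """Apply one temporal-difference update to the per-rule q-values, in place."""
    # Q-value of the action actually taken in the current state.
    q_sa = sum(a * q[i][k] for i, (a, k) in enumerate(zip(alpha, picks)))
    # Value of the next state under greedy per-rule actions.
    v_next = sum(a * max(q[i]) for i, a in enumerate(alpha_next))
    td_error = reward + gamma * v_next - q_sa
    # Each rule is credited in proportion to how strongly it fired.
    for i, (a, k) in enumerate(zip(alpha, picks)):
        q[i][k] += eta * td_error * a
    return td_error


# Two rules, two candidate actions per rule, all q-values initially zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
td = fql_update(q_table, alpha=[0.75, 0.25], picks=[1, 0],
                reward=1.0, alpha_next=[0.5, 0.5])
```

The firing strengths let a single observed reward update several rules at once, which is how the FIS acts as a function approximator, while the update itself requires no model of the environment.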
FQ-RCE uses interference measures from three sources as input linguistic variables to estimate the situation-dependent residual capacity in the multi-cell environment: the received interference power from real-time terminals in the home cell, the received interference power from non-real-time terminals in the home cell, and the received interference power from the adjacent cells. Notably, the received interference power from adjacent cells is treated as a separate variable from the home-cell interference power in order to distinguish the interference variations. Therefore, through the linguistic variable of the adjacent-cell interference power, the RCE at the home cell can perceive the radio resource allocation made by the FQ-SDAMs in adjacent cells, that is, be aware of the loading of adjacent cells, and precisely estimate the residual resource in a distributed fashion. Thus, the multi-cell WCDMA environment does not require explicit action coordination.
On the other hand, the DRS modifies the exponential rule in [10] to assign the transmission rates for non-real-time terminals, based on the residual capacity estimated by FQ-RCE. The modified exponential rule is a utility-function-based scheduling algorithm that considers the transmission delay, average transmission rate, and link capacity. The modified rule differs from the original exponential rule [10] in the definition of link capacity. For the modified exponential rule, the link capacity is defined as the maximum available rate at which the interference caused to adjacent cells by the transmission power stays below a guard threshold, which makes the rule location-aware. The modified exponential rule is most suitable for uplink transmission in multi-cell WCDMA systems, as explained later. Simulation results show that the proposed FQ-SDAM outperforms the LIDA scheme, since it can effectively reduce the packet error probability and improve the aggregate throughput in both homogeneous and non-homogeneous multi-cell WCDMA environments. Additionally, the modified exponential rule achieves better system performance than the original exponential rule. In the homogeneous case, FQ-SDAM achieves an aggregate throughput 75.3% (53.3%) higher than that of LIDA with β=10% under high-bursty (low-bursty) real-time traffic. In the non-homogeneous case, FQ-SDAM achieves an aggregate throughput 31.53%, 35.5%, and 34.2% higher than that of LIDA with β=10% for the central, first-tier, and second-tier cells, respectively.
The rest of this chapter is organized as follows. The system model is described in Section II. Section III briefly describes the concept of fuzzy Q-learning and proposes the design of FQ-SDAM. Simulation results are presented in Section IV, which compares the performance of FQ-SDAM with that of a conventional LIDA scheme.