A novel dynamic cell configuration scheme in next-generation situation-aware CDMA networks

Ching-Yu Liao, Member, IEEE, Fei Yu, Member, IEEE, Victor C. M. Leung, Fellow, IEEE, and Chung-Ju Chang, Senior Member, IEEE

Abstract—To balance the time-varying traffic load between cells, caused by user mobility and diverse applications, it is crucial for next-generation code-division multiple-access (CDMA) cellular networks to configure cell coverage and capacity dynamically. In this paper, we show that pilot power allocation is highly coupled to other facets of radio resource management. We propose a novel dynamic cell configuration scheme for multimedia CDMA cellular networks, based on reinforcement-learning, which takes into account pilot, soft handoff, and maximum link power allocations, as well as call admission control mechanisms. Simulation results demonstrate the effectiveness of the proposed scheme in situation-aware CDMA networks.

Index Terms—Code-division multiaccess, land mobile radio cellular systems, Markov processes.

I. INTRODUCTION

THE GROWING popularity of multimedia Internet applications is a strong driving force for future cellular mobile systems. Due to user mobility and a wide range of applications, the traffic pattern of each cell can vary dynamically. Thus, the current practice of engineering cell coverage and capacity based on predefined traffic patterns before a code-division multiple-access (CDMA) cellular network is deployed may lead to poor utilization of radio resources. Due to asymmetric traffic and the interdependence of traffic capacity and coverage, this problem could be exacerbated in next-generation CDMA cellular networks, especially over the capacity-limited downlink [1]–[4].

To adapt to the variations of traffic load, tradeoffs between coverage and capacity in CDMA cellular systems have been considered [3]–[7]. For example, to guarantee the coverage of a cell, more power is used to reach mobile stations (MSs) near cell boundaries under power control. However, in interference-limited systems, the resulting higher intercell interference will reduce the system capacity significantly. Furthermore, under large traffic variations, power control may not be effective [3]–[5]. A uniform network layout with equal-sized cells, while optimal under uniform traffic, suffers significant capacity degradations if traffic loads are not balanced among all the cells [6]. To accommodate traffic load variations between cells, it is crucial for next-generation CDMA cellular networks to be aware of system situations and configure cell coverage and capacity dynamically [1], [7].

Manuscript received October 2, 2004; revised June 4, 2005. This work was supported in part by the Graduate Student Study Abroad Program, National Science Council, Taiwan, under Contract NSC 92-2917-1-009-006 and Contract NSC 93-2219-E-009-011. This paper was presented in part at the IEEE Vehicular Technology Conference, Stockholm, Sweden, Spring 2005.

C.-Y. Liao is with Telecordia Applied Research Center Taiwan Company, Taipei 115, Taiwan (e-mail: [email protected]; chingyu@ece.ubc.ca).

F. Yu and V. C. M. Leung are with University of British Columbia, Vancouver, BC 1890, Canada (e-mail: [email protected]; [email protected]).

C.-J. Chang is with the Department of Communication Engineering, National Chiao Tung University, Hsinchu 300, Taiwan, R.O.C. (e-mail: [email protected]).

Digital Object Identifier 10.1109/JSAC.2005.858887

Several schemes for dynamic cell configuration (DCC) have recently been proposed [8]–[16]. Optimization of pilot power, and downlink capacity and coverage planning were considered in [8] and [9]. In [10], a DCC scheme for circuit-switched microcellular CDMA systems was proposed to enhance the uplink performance. In [11], the competitive characteristics of network coverage and capacity were analyzed for a simple network. Only one class-of-service was considered in [8]–[11], and it may be difficult to extend these schemes to multiple classes of service. Some techniques based on heuristics have also been proposed for dynamic pilot power allocation (DPPA) to balance downlink traffic load, while assuring service coverage [12], [13]. However, these schemes may cause “coverage failure regions” between cells where pilot signals are too weak to serve a MS [14], [15]. Moreover, a common shortcoming of the previous work [8]–[15] is that only pilot power is adjusted dynamically in the time-varying environment, without adjusting other parameters critical to radio resource management (RRM).

In fact, pilot power allocation and other RRM parameters are tightly coupled. In our previous work [16], we have shown that system performance can be improved significantly by a self-organized DCC scheme with coordinated call admission control (CAC), compared with fixed pilot power allocation (FPPA) and DPPA without taking CAC into account. Other work has shown that signal quality degradation can be prevented by configuring cell areas adaptively and setting power levels appropriately [4], [17], and that soft handoff has significant impacts on the system capacity and cell coverage [18], [19]. Therefore, an effective mechanism, link proportional power allocation (LPPA), was proposed for downlink soft handoff in [20] and [21]. It was shown that LPPA can enhance system capacity in CDMA cellular systems with mixed-size cells, compared with the conventional site-selection diversity transmission (SSDT) scheme [22].

In this paper, we show that DPPA without changing other related RRM parameters accordingly can result in performance degradations. To address this problem, we propose a novel DCC scheme based on reinforcement-learning called DCC-RL. The novelties are as follows.

1) DPPA is linked with soft handoff power and maximum link power allocations, as well as CAC mechanisms.

2) Reinforcement-learning efficiently tackles optimization problems with large state spaces and action sets [23] in realistic CDMA multimedia cellular networks, which were previously deemed intractable [24].

3) Our method does not require a priori knowledge of the state transition probabilities associated with the cellular network, which are very difficult to estimate in practice due to the varied propagation environment, diverse multimedia services, and random user mobility.

4) DCC-RL can be implemented in a distributed manner in each base station (BS), minimizing signaling overhead between BSs and radio network controllers, and the number of system states involved in computations.

Fig. 1. Power allocation in downlink CDMA systems. (a) Fixed pilot scheme. (b) Dynamic pilot scheme.

We compare DCC-RL with fixed cell configuration (FIX) employing FPPA, and with DPPA without changing other RRM parameters. Simulation results show that DCC-RL outperforms the others by increasing the total throughput and decreasing the frame error probability, blocking probability, and handoff forced termination probability, at the price of slightly increasing the size of the active set.

The rest of this paper is organized as follows. DCC issues are discussed in Section II. Section III describes the system model. Section IV formulates the DCC problem taking RRM into account, and presents the proposed DCC-RL scheme. Simulation results are presented and discussed in Section V. Section VI concludes the paper.

II. ISSUES OF DYNAMIC CELL CONFIGURATION

A. Effects of Pilot Power Allocation Schemes

Since each BS has a finite transmit power, the pilot and traffic channels have to share the total power. Pilot power allocation can be either fixed or dynamic. In FPPA schemes, which are used in current CDMA systems, about 10%–15% of the total power is allocated to the common pilot channel and is not changed after the deployment of a cellular network, as shown in Fig. 1(a). When the traffic load is too high to allow allocation of sufficient power for all MSs, the system performance can degrade severely. Some strategies have to be employed to balance power between cells, e.g., by DPPA, as illustrated in Fig. 1(b). The pilot power can be adjusted between the maximum and minimum constraints based on various traffic situations. When traffic is light, the pilot power can be increased to extend cell coverage to more MSs. On the other hand, when traffic is heavy and there is insufficient power for allocation to all traffic channels, the pilot power can be decreased to shrink cell coverage. This explains the interdependence of coverage and capacity in CDMA cellular systems.
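The power sharing described above can be illustrated with a short worked example. This is only a sketch: the function name is hypothetical, and the 20 W maximum transmit power is an assumption chosen to be consistent with the 2.5 W (12.5%) default pilot power quoted in Section V.

```python
def split_power_budget(p_max_w: float, pilot_fraction: float) -> tuple[float, float]:
    """Return (pilot_power_w, traffic_power_w) for one BS.

    The pilot takes a fraction of the maximum transmit power and the
    remainder is what is left for all traffic channels.
    """
    if not 0.0 <= pilot_fraction <= 1.0:
        raise ValueError("pilot_fraction must lie in [0, 1]")
    pilot = pilot_fraction * p_max_w
    return pilot, p_max_w - pilot


# Assumed 20 W BS with the 10% and 15% pilot fractions typical of FPPA.
for frac in (0.10, 0.15):
    pilot, traffic = split_power_budget(20.0, frac)
    print(f"pilot fraction {frac:.2f}: pilot {pilot:.1f} W, traffic {traffic:.1f} W")
```

Lowering the pilot fraction in a congested cell frees power for traffic channels, at the cost of a smaller coverage area.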

Moreover, in future CDMA networks, diverse multimedia traffic and random user mobility will make preplanning of coverage and capacity difficult to manage. To achieve load balance whenever traffic congestion occurs, DCC through DPPA will be necessary.

B. Effects of Soft Handoff Power Allocation Schemes

The soft handoff mechanism can provide seamless connections and better signal qualities for MSs near the cell boundaries. Since the limited power available for traffic channels in each BS is shared between nonhandoff and soft handoff MSs, there are tradeoffs between coverage and capacity. For example, a BS may shrink the cell coverage to serve fewer handoff MSs near the cell boundary, leaving more power available for allocation to nonhandoff MSs with higher transmission rates. As soft handoff mechanisms have direct impacts on cell coverage and capacity, RRM for soft handoff MSs is a challenging issue in CDMA cellular systems with mixed-size cells formed by different levels of pilot power [20], [21].

C. Effects of New/Handoff Call Admission Control

In downlink CDMA systems, since each BS has a finite power resource to be shared among MSs, the allocated pilot power and traffic channel power are directly related to the coverage and capacity of the cell. To achieve load balance whenever traffic congestion occurs, DCC through DPPA is necessary. However, it is necessary to consider the pilot power allocation and strategies of new/handoff CAC jointly in order to design an effective DCC scheme that improves the system performance, while minimizing the undesirable effects.

For new call arrivals near cell boundaries, the pilot power determines the MS’s initial access cell. Therefore, reducing the pilot power of a congested cell causes the MS to request a traffic channel from an adjacent cell. If the MS fails initially to detect a BS with enough signal strength, it cannot make a call request to the system. This is referred to as a coverage failure. As a consequence, although the new call blocking probability of the congested cell could be decreased, the coverage failure probability might be increased.

For ongoing calls near cell boundaries, decreasing or increasing the pilot power of a BS can force some of the MSs to hand off to other cell(s) or vice versa. Therefore, the average size of the active set and handoff rates would be increased. In addition, if a MS suffers bad signal quality and fails to execute the handoff in time, a handoff forced termination occurs.

D. Simulations Illustrating Impacts of Pilot Power Control

In this section, we present simulation results to show that pilot power allocation and other RRM parameters are highly coupled. The simulation environment and parameters are given in Section V. Fig. 2 shows the system throughput of fixed cell configuration with different pilot power levels under uniform and nonuniform cell traffic load for both SSDT and LPPA soft handoff schemes (FIX-SSDT, FIX-LPPA), where the load ratio is the ratio of traffic load between the hotspot cell in the center and the surrounding cells. In the simulations, the pilot channel power is set to 1 W in the FPPA scheme, and all other RRM algorithms are optimized according to this pilot power setting and then fixed. Subsequently, we simulate DPPA by adjusting the pilot power level, without changing the other RRM parameters. Fig. 2 shows that the system throughput degrades whenever a pilot power other than 1 W (used in the fixed scheme) is used in DPPA. This is because increasing pilot power also increases interference to adjacent cell MSs; the larger the pilot power, the larger the interference, and the lower the throughput. Another reason is that the CAC criterion and the maximum link power constraint remain the same when the pilot power changes cell coverage. For example, when new or handoff calls issue requests to cells with light or heavy traffic, the tight or loose criteria of CAC may result in new call blockings or handoff forced terminations, respectively. Thus, uncoordinated design of pilot power and other RRM strategies can degrade the system performance severely. Fig. 2 also shows that soft handoff power allocation and pilot power allocation are highly coupled, both affecting system throughput. LPPA has larger throughput than SSDT with different pilot power and traffic load distributions. The throughput difference between SSDT and LPPA is larger when the traffic load is nonuniformly distributed.

Fig. 2. Total throughput of fixed cell configuration for SSDT and LPPA schemes under uniform (load ratio = 1) and nonuniform (load ratio = 4) cell load cases.

III. SYSTEM MODEL

The system block diagram of our proposed DCC-RL scheme is shown in Fig. 3. DCC-RL can be implemented in a distributed manner in each BS, which adjusts its pilot power periodically to adapt to the variations of system situation through the dynamic pilot power controller. Based on the determined pilot power level, the maximum link power constraint and CAC criterion are adjusted accordingly. Then, the traffic channel power allocator adjusts its maximum link power constraint that is obtained from the maximum link power estimator. After applying all updates for RRM to the entire cellular network, the reinforcement signal is input to the dynamic pilot power controller to aid its decision for the next pilot power level. In this section, we describe the signal model and the link budget model in CDMA systems. An initial cell coverage design for the CDMA cellular system is provided to illustrate the interrelation between capacity and cell coverage.

A. Signal Model

Assume the total allocated power of BS is , including pilot channel power and traffic channel power , where is smaller than or equal to the BS’s maximum transmit power . The pilot power of BS is given by , where is the fraction of the pilot power relative to BS ’s maximum transmit power, constrained between minimum fraction and maximum fraction . For the traffic channel of MS served by BS , the allocated transmit power from BS is , where is the fraction of traffic channel power allocated for transmission to MS ; is the maximum link power of BS . Thus, , where represents the set of all MSs served by BS .
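The constraints in the signal model can be sketched in code as follows. All names and numeric values are hypothetical. The sketch encodes only the relations stated in words above (pilot power is a bounded fraction of the maximum transmit power, each link is capped by the maximum link power, and pilot plus traffic power cannot exceed the maximum transmit power) and makes no claim about the exact form of the per-link allocation formula.

```python
def pilot_power_w(p_max_w, fraction, frac_min, frac_max):
    """Pilot power as a fraction of the BS maximum transmit power.

    The fraction is clamped to [frac_min, frac_max] (hypothetical bounds).
    """
    clamped = min(max(fraction, frac_min), frac_max)
    return clamped * p_max_w


def total_allocated_power_w(pilot_w, link_powers_w, max_link_w, p_max_w):
    """Sum pilot and per-MS link powers, enforcing both power limits."""
    if any(p > max_link_w for p in link_powers_w):
        raise ValueError("a link exceeds the maximum link power")
    total = pilot_w + sum(link_powers_w)
    if total > p_max_w:
        raise ValueError("total exceeds the BS maximum transmit power")
    return total


# Hypothetical example: 20 W BS, pilot fraction requested above its bound.
pilot = pilot_power_w(20.0, fraction=0.5, frac_min=0.05, frac_max=0.2)
print(total_allocated_power_w(pilot, [1.0, 0.5, 2.0], max_link_w=2.0, p_max_w=20.0))
```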

B. Initial Cell Coverage Design

The initial design of cell coverage can be obtained by link budget analysis. The equivalent isotropic radiated power (EIRP) at a BS’s transmitter, , of each traffic channel can be calculated by , where and are the antenna gain and cable loss of the BS, respectively. Note that the units of the parameters are given in brackets. On the other hand, the EIRP, , measured at a MS’s receiver, taking into account the soft gain , the antenna gain , and the body loss of the MS, is . Moreover, assume that the interference margin (maximum planned noise rise) is , and the received noise power (product of thermal noise density, chip rate, and noise figure) is . The receiver sensitivity of the MS given service rate is , where is the required signal-to-interference-plus-noise ratio (SINR) value for service rate , which is equal to the required bit-energy-to-noise ratio (Eb/No), , minus the processing gain . From the link budget, the maximum allowable path loss for service rate is

(1)

where is the margin for log-normal fading. When a MS is near the cell boundary, the received chip-energy-to-interference ratio should not fall below the minimum requirement for service rate , given by . In general, pilot power is around 1–4 W, which is about 5%–20% of the maximum total transmit power of the BS, .

Based on the allowable maximum path loss and the applied channel model, the resultant cell radius is different with different service rate . For , since , therefore, and . This phenomenon raises the issue of fairness for different service rates in terms of service coverage and transmit power. If the same transmit power is allocated to MSs with different service rates, the higher service rate results in a smaller service coverage. Alternatively, in order to support the same service coverage for different service rates, more transmit power is needed to support MSs with higher service rates near cell boundaries. Note that since total downlink transmit power of each BS is limited, system capacity is directly related to transmit power management. Based on the above concerns, in order to optimize system capacity, cell radius can be determined in terms of a suitable reference service rate , where . The corresponding cell radius is determined by the maximum allowable path loss . Therefore, the required of the system is equal to

(2)

where is within the range from 16 [dB] to 20 [dB].

Fig. 3. System block diagram of proposed DCC-RL scheme.
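The dependence of service coverage on service rate follows from the rule stated above that the required SINR equals the required Eb/No minus the processing gain. The sketch below works this out for the real-time data rates and Eb/No values listed in Section V. The chip rate of 3.84 Mc/s is an assumption (the text does not state one), and the function names are hypothetical.

```python
import math

CHIP_RATE_CPS = 3.84e6  # assumption: WCDMA chip rate; not given in the text


def processing_gain_db(bit_rate_bps, chip_rate_cps=CHIP_RATE_CPS):
    """Processing gain in dB: ratio of chip rate to bit rate."""
    return 10.0 * math.log10(chip_rate_cps / bit_rate_bps)


def required_sinr_db(ebno_db, bit_rate_bps):
    """Required SINR [dB] = required Eb/No [dB] minus processing gain [dB]."""
    return ebno_db - processing_gain_db(bit_rate_bps)


# (service rate in b/s, required Eb/No in dB), taken from Section V.
SERVICES = [(16e3, 5.0), (32e3, 4.0), (64e3, 3.0), (144e3, 2.0)]

for rate, ebno in SERVICES:
    sinr = required_sinr_db(ebno, rate)
    print(f"{rate / 1e3:6.1f} kb/s: required SINR {sinr:6.2f} dB")
```

Under this assumption the required SINR rises with the service rate even though the required Eb/No falls, so for equal transmit power a higher-rate service tolerates less path loss and has a smaller service coverage.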

Based on the link budget and the channel model, the cell radii can be calculated in terms of different reference service rates and the results represented as in Fig. 4. Assume that 1 W power is allocated to the pilot channel for the FIX configuration with FPPA. Fig. 4 shows results of the total throughput (system capacity) in the cellular network by applying SSDT and LPPA schemes in terms of different referenced service coverage under uniform and nonuniform traffic load cases, where is the traffic load ratio between a central cell and its surrounding cells. It is observed that, in all cases, a smaller cell coverage can increase the total throughput. This is because the lower propagation loss in a small cell provides a better signal quality for the MSs. Also, we can see that under nonuniform load , the total throughput increases at a decreasing rate as the reference service rate increases. This means that the interrelation of cell coverage and system capacity becomes more sensitive to the reference service rate when there is unbalanced load between cells. Moreover, the curves flatten when the reference service rate exceeds . This implies that the system has reached its capacity limit. Hence, the initial cell coverage in this simulation platform is set with a cell radius corresponding to the reference service rate .

Fig. 4. Capacity versus referenced service coverage under fixed pilot power for SSDT and LPPA schemes under uniform (load ratio = 1) and nonuniform (load ratio = 4) cell loads.

IV. PROPOSED DCC-RL SCHEME

We formulate the DCC problem as a Markov decision process (MDP) [26]. However, traditional model-based solutions of MDP, such as policy iteration and linear programming, require a priori knowledge of the state transition probabilities. Due to the diverse multimedia traffic and random user mobility, these conventional solutions suffer from the curses of dimensionality and modeling. As described next, we propose a novel reinforcement-learning-based DCC scheme, DCC-RL, to find an optimal policy for pilot power allocation that takes RRM into account (see Fig. 3).

A. Problem Formulation as a Markov Decision Process

In DCC-RL, the BS pilot power is periodically adjusted to adapt to changing conditions. These time instants are called decision epochs and the adjustments of pilot power are called actions in the MDP formulation. The chosen action is based on the current state of the system. Depending on the action taken by the system, the system can earn rewards. The objective is to optimize the sequence of actions to maximize the accumulated rewards. The detailed formulation is as follows:

• [Decision epochs]: In CDMA systems, the pilot signal is broadcast from each BS periodically [30]. The state of the system changes accordingly. Therefore, we adjust the pilot power every frames, where is a design parameter.

• [States]: Define the state vector of the system as , where denotes the mean power of the BS and denotes the variance of the power load. Assume there are samples from the measurements, where is also a design parameter. Also, and can be obtained from the sample mean and variance, respectively, as follows:

(3)

(4)

The decision process can be implemented in each BS in a distributed manner because the variation of the BS’s power load can implicitly reveal the load information about all cells.

• [Actions]: At each decision epoch, the BS makes a decision to choose a suitable fraction of the pilot power based on state . The action of BS is defined as the fraction of the pilot power relative to the maximum transmit power.

• [Reward function]: Based on the action in a state , the system earns a reward . We choose the total throughput as the reward

(5)

where is the transmission rate of MS .
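The state and reward definitions above can be sketched as follows. The class and function names are hypothetical, and the use of the unbiased (n − 1) sample variance is an assumption, since the normalization used in (4) is not recoverable from the text.

```python
from collections import deque
from statistics import fmean, variance


class PowerLoadState:
    """Sliding window of BS power-load measurements (one per frame)."""

    def __init__(self, window: int):
        # Keeps only the most recent `window` samples (a design parameter).
        self.samples = deque(maxlen=window)

    def observe(self, power_w: float) -> None:
        self.samples.append(power_w)

    def state(self) -> tuple[float, float]:
        """Return (sample mean, sample variance) of the windowed power load."""
        if len(self.samples) < 2:
            raise ValueError("need at least two samples")
        # Assumption: unbiased estimator with the n - 1 denominator.
        return fmean(self.samples), variance(self.samples)


def throughput_reward(rates_bps):
    """Reward: total throughput, the sum of the MS transmission rates."""
    return sum(rates_bps)


tracker = PowerLoadState(window=3)
for p in (10.0, 12.0, 14.0, 16.0):
    tracker.observe(p)
print(tracker.state(), throughput_reward([12_200, 64_000, 144_000]))
```

Because each BS needs only its own power measurements, this state can be computed locally, which is what allows the distributed implementation described above.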

B. MDP Solution by Reinforcement-Learning

The objective of the decision process is to find an optimal policy for each state , which maximizes the cumulative measure of the reward that is received over time, where the subscript represents the time instant . The total expected discounted reward over an infinite time horizon can be represented by the value function with policy , , with discount factor . Let be the transition probability from state to . The value function can be rewritten as

(6)

where . Define a Q-function of state-action pair with policy as . The optimal value function with the optimal policy satisfies Bellman’s optimality criterion [27]

(7)

Thus, the optimal Q-function can be obtained from finding an optimal policy of Q-function .
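For reference, the standard textbook forms of the quantities named in this subsection are given below for a reward-maximizing discounted MDP. The notation is an assumption and is not necessarily the notation of (6) and (7): $s$ is a state, $a$ an action, $\pi$ a policy, $r_t$ the reward at time $t$, $R(s,a)$ the expected one-step reward, $P_{ss'}(a)$ the transition probability, and $\gamma$ the discount factor.

```latex
\begin{align*}
V^{\pi}(s) &= \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\middle|\, s_{0}=s,\ \pi\right],
  \qquad 0 \le \gamma < 1, \\
V^{\pi}(s) &= R\bigl(s,\pi(s)\bigr) + \gamma \sum_{s'} P_{ss'}\bigl(\pi(s)\bigr)\, V^{\pi}(s'), \\
Q^{\pi}(s,a) &= R(s,a) + \gamma \sum_{s'} P_{ss'}(a)\, V^{\pi}(s'), \\
V^{*}(s) &= \max_{a} Q^{*}(s,a)
  = \max_{a}\Bigl[ R(s,a) + \gamma \sum_{s'} P_{ss'}(a)\, V^{*}(s') \Bigr].
\end{align*}
```

The last line is Bellman's optimality equation: an optimal policy picks, in each state, an action that attains the maximum of the optimal Q-function.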

Without knowing and , the Q-learning process can still find an optimal policy through updating to find in a recursive manner using the information of current state , action , reward , and next state . Watkins [28] has shown that if each feasible state-action pair is visited infinitely often, and if the learning rate is decreased to zero in a suitable way, then the learned Q-values converge to the optimal Q-values as the number of updates goes to infinity. The Q-values of the state-action pairs are usually stored in a lookup table. However, this approach is not suitable for problems with continuous state spaces as in multimedia CDMA systems, where the curse of dimensionality is hard to tackle. It has been shown [29] that fuzzy Q-learning is an efficient technique for the approximation of continuous system states. It adapts Watkins’s Q-learning [28] technique by incorporating a fuzzy inference system (FIS) into reinforcement-learning, which generalizes Q-learning by inferring both the actions and Q-functions from fuzzy rules. Taking advantage of the Q-learning technique, the universal approximation property of the FIS makes the representation of Q-values with large state-action space possible, and a priori knowledge can be integrated in the learning procedure [16].
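The recursive update described above can be illustrated with plain tabular Q-learning. This is a minimal sketch of Watkins's lookup-table form only, not the fuzzy Q-learning variant used by DCC-RL; the state labels, action values, learning rate, and discount factor are hypothetical.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    Only the observed (s, a, r, s') tuple is needed; no transition model.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical discretization: load states and candidate pilot fractions.
actions = (0.05, 0.10, 0.15)
q_table = defaultdict(float)  # unseen pairs start at 0.0
q_learning_update(q_table, "light", 0.10, reward=1.0,
                  next_state="heavy", actions=actions)
print(q_table[("light", 0.10)])
```

A lookup table like `q_table` needs one entry per state-action pair, which is why the text turns to a fuzzy inference system once the state (mean and variance of power load) is continuous.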

Furthermore, in DCC-RL, a simple strategy for feature abstraction, exploitation, and exploration is applied to speed up the learning procedures (and shorten the convergence time) for obtaining the optimal solution. A policy feasible action set can be obtained based on the current state . State can be adopted as an indicator to classify the feasible action sets. For example

if
otherwise (8)

where is the cutting value of the action set, , and is the threshold of the mean power as the quality-of-service (QoS) constraint. Since a greedy policy can easily cause the system to converge to locally optimal solutions, it is necessary to visit all the sets of possible actions for all states to find the globally optimum solution. This is the so-called exploration/exploitation dilemma. An action of state is selected from the feasible action set using an exploitation and exploration policy. Here, a pseudoexhaustive policy is applied, in which the action with the best Q-value is chosen with a selection probability based on the Boltzmann distribution. Otherwise, an action that is the least visited will be chosen. The resulting action is converted to the pilot power of the BS. The reward can be measured from the system, and fed back to update the Q-function.
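One possible reading of the pseudoexhaustive policy is sketched below: the best action is taken with its Boltzmann (softmax) probability, and otherwise the least-visited action is taken. The function name, the temperature parameter, and the injectable random source are assumptions made for illustration; the text does not specify these details.

```python
import math
import random


def select_action(q_values, visits, temperature=1.0, rng=random):
    """Pick an action from the feasible set.

    q_values, visits: dicts keyed by the same feasible actions.
    With probability equal to the Boltzmann weight of the best action,
    exploit it; otherwise explore the least-visited action.
    """
    best = max(q_values, key=q_values.get)
    q_max = q_values[best]
    # Subtracting q_max keeps the exponentials numerically stable.
    weights = {a: math.exp((q - q_max) / temperature)
               for a, q in q_values.items()}
    p_best = weights[best] / sum(weights.values())
    if rng.random() < p_best:
        chosen = best
    else:
        chosen = min(visits, key=visits.get)
    visits[chosen] += 1
    return chosen


# Hypothetical feasible pilot fractions with their Q-values and visit counts.
q = {0.05: 1.0, 0.10: 2.0, 0.15: 0.0}
n = {0.05: 3, 0.10: 5, 0.15: 1}
print(select_action(q, n, rng=random.Random(0)))
```

Choosing the least-visited action on the exploration branch, rather than a uniformly random one, steers the learner toward state-action pairs it has seen least, which is the sense in which the policy is exhaustive.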

C. Dynamic Maximum Link Power Constraint Design

The main purpose of the adjustment of maximum link power constraint is to couple the pilot power into the design. Note that pilot power adjustment affects the cell coverage, while the maximum link power of a cell affects the service coverage for MSs with different service rates near the cell boundary. In order to match cell coverage and service coverage, based on the maximum path loss (1) and the receiver sensitivity in terms of referenced service rate , the total EIRP of pilot power should be

(9)

where is the receiver sensitivity of the pilot signal such that , where is the required SINR value of the pilot signal, which is equal to the required Eb/No of the pilot signal , minus the processing gain of the pilot signal . Then, substituting (1) into (9), we obtain .

Hence, as soon as the pilot power of BS , , has been adjusted dynamically, the maximum link power of cell should be

(10)

The maximum link power constraint is, thus, coupled with pilot power accordingly. Note that the same constraint of the maximum link power for different service rates is adopted in this paper because the processing gain can be regarded as a priority index for different service rates.
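The coupling idea can be illustrated with a simplified sketch: the maximum link power is chosen so that the reference service reaches the same maximum path loss as the pilot. This is not a reconstruction of (9) and (10). It assumes that the noise power, interference margin, antenna gains, and fading margin are identical for the pilot and the traffic channel and therefore cancel, which leaves only the difference of the two required SINR values. All names and numbers are hypothetical.

```python
import math


def watts_to_dbm(p_w):
    return 10.0 * math.log10(p_w * 1e3)


def dbm_to_watts(p_dbm):
    return 10.0 ** (p_dbm / 10.0) / 1e3


def matched_max_link_power_w(pilot_w, sinr_ref_db, sinr_pilot_db):
    """Link power giving the reference service the pilot's coverage.

    Assumption: all gain, loss, noise, and margin terms are common to
    both channels, so equal maximum path loss reduces to
        P_link [dBm] = P_pilot [dBm] + SINR_ref [dB] - SINR_pilot [dB].
    """
    link_dbm = watts_to_dbm(pilot_w) + sinr_ref_db - sinr_pilot_db
    return dbm_to_watts(link_dbm)


# Hypothetical values: the reference service needs 3 dB more SINR than the pilot.
for pilot in (1.0, 2.5, 4.0):
    link = matched_max_link_power_w(pilot, sinr_ref_db=-15.0, sinr_pilot_db=-18.0)
    print(f"pilot {pilot:.1f} W -> max link power {link:.2f} W")
```

Under these assumptions the maximum link power scales linearly with the pilot power, so shrinking a cell by lowering its pilot also lowers the power any single link may consume.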

D. Dynamic CAC Criterion Design

In DCC-RL, as soon as the optimal pilot power has been determined by the dynamic pilot power controller, as shown in Fig. 3, the corresponding maximum link power can be updated by (10). The SINR threshold for call admission in cell becomes

(11)

For CAC of new calls, MS originating a new call measures and reports its received SINR . The BS accepts the new call if , otherwise, the new call is blocked. For CAC of handoff calls, the soft handoff algorithm [30] is implemented, in which maximal ratio combining is used to obtain the overall SINR of MS , from all serving BSs in the active set . A handoff request is issued to BS whenever an add event occurs. The BS accepts the handoff request if , and the admitted handoff MS adds BS into its active set . Otherwise, the handoff call request is blocked. On the other hand, if the blocked handoff call has not yet exceeded the handoff delay time, the MS can make a handoff request again as long as the link quality does not fall below the requirement (2).
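The admission checks can be sketched as threshold comparisons. The combining rule uses the standard property of maximal ratio combining that the output SINR is the sum of the branch SINRs in linear scale. The function names are hypothetical, and the direction of the comparison (accept when the measured SINR is at least the threshold) is an assumption, because the exact conditions were lost from the text.

```python
import math


def db_to_linear(x_db):
    return 10.0 ** (x_db / 10.0)


def linear_to_db(x):
    return 10.0 * math.log10(x)


def mrc_sinr_db(branch_sinrs_db):
    """Overall SINR after maximal ratio combining over the active set."""
    return linear_to_db(sum(db_to_linear(s) for s in branch_sinrs_db))


def admit(measured_sinr_db, threshold_db):
    """Assumed rule: accept the request if the SINR meets the threshold."""
    return measured_sinr_db >= threshold_db


# New call: a single reported SINR against the cell's admission threshold.
print(admit(-14.0, threshold_db=-15.0))

# Handoff call: combine the branches of the active set first.
combined = mrc_sinr_db([-17.0, -17.0])
print(round(combined, 2), admit(combined, threshold_db=-15.0))
```

Two equal branches combine to a 3 dB gain, so an MS that fails the threshold on any single link can still pass once a second BS joins its active set.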

V. SIMULATION RESULTS AND DISCUSSIONS

A simulation model is set up to examine the performance of the DCC-RL scheme in a CDMA cellular system. We first describe the simulation platform, and then the simulation results are presented and discussed.

A. Simulation Model

1) Cell Model: We consider a hexagonal cellular system with 19 wraparound cells, in which the central cell is a hotspot cell with a high traffic load. As before, the load ratio is defined as the ratio between the call arrival rates in the hotspot cell and in each surrounding cell. Geographically, the cellular deployment is homogeneous, and the default cell radii can be determined by the link budget design in Section III-C. The link budget parameters are as follows: , , , , , , , and .

2) Mobility Model: Assume MSs are uniformly distributed in each cell, and their initial speeds are uniformly distributed between 0 and the maximum speed. The maximum speeds for MSs in the hotspot cell, first-tier cells, and second-tier cells are assumed to be 30, 60, and 60 km/h, respectively. Whenever a MS moves into a different cell tier, a new speed is chosen according to the above distribution. Each MS is subject to correlated shadowing effect based on the Gudmundson model [30], in which the decorrelation length is 20 m in a vehicular environment. The shadowing effect is updated according to the correlated shadowing model, with coverage probability 95%. During each shadowing effect update, with probability 0.2 the moving direction of the MS is changed and a new direction is selected at random within 45° [30].
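Correlated shadowing of this kind is commonly simulated as a first-order autoregressive process with an exponential autocorrelation. The sketch below assumes the convention in which the correlation falls to 0.5 at the decorrelation length; other normalizations of the Gudmundson model exist. The shadowing standard deviation is not stated in the text, so it is left as a required argument, and all names are hypothetical.

```python
import math
import random


def shadowing_correlation(distance_m, decorr_m=20.0):
    """Exponential autocorrelation; assumed to equal 0.5 at decorr_m."""
    return math.exp(-abs(distance_m) / decorr_m * math.log(2.0))


def next_shadowing_db(prev_db, distance_m, sigma_db, rng=random, decorr_m=20.0):
    """AR(1) update of the shadowing value after moving distance_m.

    Keeps the marginal standard deviation at sigma_db while correlating
    the new value with the previous one.
    """
    rho = shadowing_correlation(distance_m, decorr_m)
    innovation = rng.gauss(0.0, sigma_db)
    return rho * prev_db + math.sqrt(1.0 - rho * rho) * innovation


# Hypothetical walk: 5 m steps with an assumed 10 dB standard deviation.
rng = random.Random(1)
value = 0.0
for _ in range(3):
    value = next_shadowing_db(value, distance_m=5.0, sigma_db=10.0, rng=rng)
    print(round(value, 2))
```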

3) Channel Model: For the channel model [30], the path loss is obtained by , where is the distance between the BS and the MS; and are the antenna height of the BS and the downlink frequency, respectively. In our simulations, the downlink frequency is 2.4 GHz and the antenna height is 20 m.
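A path loss model that depends on exactly these three quantities is the vehicular test environment model of the UMTS 30.03 selection procedures. Whether [30] refers to that model is an assumption, so the sketch below should be read as an illustration of the dependence rather than as the formula used in the simulations. In that model the distance is in kilometers, the frequency in megahertz, and the antenna height is measured in meters above the average rooftop level.

```python
import math


def path_loss_db(distance_km, antenna_height_m, freq_mhz):
    """UMTS 30.03 vehicular path loss (assumed to be the model of [30]).

    L = 40 (1 - 4e-3 h) log10(R) - 18 log10(h) + 21 log10(f) + 80  [dB]
    """
    slope = 40.0 * (1.0 - 4e-3 * antenna_height_m)
    return (slope * math.log10(distance_km)
            - 18.0 * math.log10(antenna_height_m)
            + 21.0 * math.log10(freq_mhz)
            + 80.0)


# Parameters quoted in the text: 20 m antenna height, 2.4 GHz downlink.
for d_km in (0.5, 1.0, 2.0):
    print(f"{d_km:.1f} km: {path_loss_db(d_km, 20.0, 2400.0):.1f} dB")
```

For a 15 m antenna height and a 2000 MHz carrier this expression reduces to the widely quoted form 128.1 + 37.6 log10(R), which is a convenient sanity check.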

4) Traffic Model: Poisson call arrivals are assumed. Three service classes, including real-time voice, real-time data, and nonreal-time data, are considered in the system. In the simulations, the fractions of voice, real-time data, and nonreal-time data traffic are 60%, 35%, and 5%, respectively. A two-level Markov modulated Poisson process (MMPP) is used to model voice traffic, while a five-level MMPP is used to model real-time data traffic. The mean duration of each state in the five-level MMPP is 1 s. The call holding times of real-time voice and data services are exponentially distributed with means 60 and 30 s, respectively. The transmission rate and required Eb/No of the voice traffic are 12.2 kb/s and 5 dB, respectively. The service rates of the data traffic are 16, 32, 64, and 144 kb/s, and the corresponding Eb/No requirements are 5, 4, 3, and 2 dB. Note that adaptive rate transmission is applied whenever the power resources are not enough to support the existing MSs. For the nonreal-time data service, variable length data bursts are assumed to be geometrically distributed with a mean burst size of 200 frames. Moreover, there are six different service rates: 16, 32, 64, 144, 384, and 512 kb/s, which require Eb/No of 5, 4, 3, 2, 1.5, and 1 dB, respectively. The transmissions are on a burst-by-burst basis.

B. Performance Measurements and Discussions

We compare the performance of four schemes: FPPA with SSDT (FIX-SSDT), FPPA with LPPA (FIX-LPPA), DCC-RL with SSDT (DCC-SSDT), and DCC-RL with LPPA (DCC-LPPA). For FPPA, the default pilot power, , is set at 2.5 W (12.5% of the maximum transmit power) for each cell. The maximum link power and the CAC threshold are fixed and calculated from (10) and (11), respectively. For DCC-RL, , , and are adjusted dynamically, as described in Section IV. Assume the arrival rate is 1.6 calls/s, and the traffic load ratio is varied from 1 to 5. For the design parameters of DCC-RL, maximum and minimum fractions of pilot power are and , respectively; decision period is ten frames; total number of measurement samples is 100 frames; and total simulation time is 10 frames (10 learning times).

The comparison between FIX-LPPA and FIX-SSDT in terms of capacity and coverage is shown in Fig. 4. We see that the FIX-LPPA scheme achieves a higher total throughput than the FIX-SSDT scheme for both uniform and nonuniform cell load cases. The throughput of FIX-LPPA is about 20% higher than that of the FIX-SSDT scheme in the nonuniform cell load case. This is because FIX-LPPA successfully relieves the congested cell’s load through a power-balance strategy, whereas the FIX-SSDT scheme lacks the flexibility to adapt to nonuniform cell load situations.

Fig. 5(a) and (b) shows the average pilot power distribution of the hotspot, first-tier, and second-tier cells using DCC-LPPA and DCC-SSDT schemes, respectively. We can see that the DCC-RL schemes adjust the pilot power in each cell according to various system situations. When the traffic load ratio is increased, the pilot power of the hotspot cell is reduced aggressively so as to balance traffic load with adjacent cells, but the coverage shrinks accordingly. In this way, the BS of the hotspot cell can save its transmit power to serve new call arrivals. In addition, adjustments of the pilot power can make the existing MSs near the cell boundary enter soft handoff mode so as to balance traffic load. Furthermore, for the hotspot cell, the slope of the pilot power level versus traffic load ratio for DCC-SSDT is steeper than that for DCC-LPPA. This is because both DCC and LPPA strategies are helpful for power balancing, so that the pilot power of the DCC-LPPA scheme does not have to be adjusted aggressively.

Fig. 6(a) and (b) show the new call blocking probability of real-time and nonreal-time services, respectively. We can see that the DCC-RL schemes reduce the blocking probability of both real-time and nonreal-time services relative to the FIX schemes. In order to achieve power balance between cells, DCC-RL adjusts pilot power and coordinates other RRM mechanisms dynamically. This is why the DCC-RL schemes can save more power resources to accommodate new call requests. Performance results of the DCC-LPPA and DCC-SSDT schemes without adapting other RRM parameters are also presented for comparison. We can see that the DCC-RL schemes with fixed RRM parameters have worse new call blocking performance than the FIX schemes, as explained in Section II-D. Similarly, handoff forced termination performance is impaired when an MS fails to add new BSs into its active set and suffers degraded channel quality, as shown in Fig. 7. This is because existing MSs near the cell boundaries often suffer poor transmission quality, and they may be dropped when there is not enough power to admit handoff requests. On the other hand, compared with the FIX schemes, the proposed DCC-RL schemes greatly improve handoff forced termination probabilities.

Fig. 5. Average pilot power of hotspot, first-tier, and second-tier cells for (a) LPPA scheme and (b) SSDT scheme under FIX and DCC-RL.

Fig. 6. Comparison of blocking probability of (a) real-time and (b) nonreal-time services.


Fig. 7. Comparison of handoff forced termination probability.

Fig. 8. Comparison of average total throughput.

Fig. 8 shows the total throughput of the system. In the FIX cases, FIX-LPPA outperforms FIX-SSDT. As the traffic load ratio increases, the throughput of FIX-SSDT degrades sharply because of its inefficient handoff power allocation strategy. With FIX-LPPA, the average throughput remains fairly constant when the traffic load ratio is less than 4. Compared with the FIX schemes, the DCC-RL schemes improve the average throughput as the traffic load ratio increases. This is because DCC-RL can dynamically balance traffic load between cells by adjusting the pilot power, the CAC criterion, and the maximum link power constraint according to the system situation.

Furthermore, Fig. 9 compares the average frame error rates. We observe that DCC-RL keeps the frame error rate roughly under the requirement of 0.01 using the simple feature abstraction design. A more sophisticated design of the feature abstraction can guarantee the QoS requirement on the frame error rate strictly. It is noteworthy that the frame error rates of the DCC-RL schemes are worse than those of FIX-LPPA in some cases. This is because the DCC-RL schemes make more efficient use of the total power resource, providing MSs with a QoS that is just within the system requirement of a 0.01 frame error rate. Though FIX-LPPA can provide a better frame error rate than the DCC-RL schemes when the traffic load ratio is high, the corresponding system throughput is lower, resulting in poorer new call blocking and handoff forced termination probabilities. These complementary results give important insights into the design of downlink CDMA cellular systems.

Fig. 9. Comparison of frame error probability.

Fig. 10. Comparison of size of the active set.

TABLE I

AVERAGE COVERAGE FAILURE PROBABILITY

In order to balance traffic loads between cells, DCC-RL can reduce or increase pilot power aggressively. Power balancing can be achieved by forcing MSs near the cell boundary into handoff mode. Therefore, the average size of the active set and the handoff rates can increase, as shown in Fig. 10. It is found that the DCC-RL schemes cause slight increases in soft handoff events. Furthermore, Table I shows the coverage failure probability. A coverage failure occurs when an MS starting a new call fails to detect a good enough signal from a BS. The DCC-SSDT and DCC-LPPA schemes cause slightly higher coverage failure probabilities than the FIX schemes. This is because even though DCC works to balance traffic load through pilot power adjustments so as to reduce the interference of the hotspot cell, MSs near the cell boundary may suffer poor signal strengths from all BSs in the active set. Because of the tradeoff between capacity and coverage, we stress that coverage failure is an inevitable downside of any kind of DCC-RL scheme. The goal is to reduce the impact of this drawback through performance gains in system throughput, new call blocking probability, and handoff forced termination rate. Due to the maximum power constraint in each BS, the system shows a performance tradeoff between coverage failure and call admission blocking. In a cellular system under heavy traffic load, a new call request could fail either due to coverage failure or due to blocking by CAC. Since our results show that the performance gain in reduced call blocking more than offsets the performance loss in increased coverage failure, our proposed DCC-RL gives an overall gain in system performance, and the goal stated above is achieved.
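To make this tradeoff concrete, suppose (as a simplifying assumption not stated in the paper) that a new call must first detect a sufficiently strong pilot and is only then subject to CAC, and that the two events are independent. The overall failure probability is then p_cov + (1 - p_cov) * p_blk. The sketch below encodes this relation; the function name and the operating points in the usage example are hypothetical illustrations, not simulation results.

```python
def new_call_failure_probability(p_coverage_failure: float, p_blocking: float) -> float:
    """Probability that a new call fails by coverage failure or CAC blocking.

    Assumes the call is checked for coverage first and, if covered, is then
    blocked by CAC independently with probability p_blocking.
    """
    for p in (p_coverage_failure, p_blocking):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_coverage_failure + (1.0 - p_coverage_failure) * p_blocking


if __name__ == "__main__":
    # Hypothetical operating points: one with lower coverage failure but higher
    # blocking, one with slightly higher coverage failure but lower blocking.
    baseline = new_call_failure_probability(0.01, 0.10)
    adaptive = new_call_failure_probability(0.02, 0.05)
    print(f"baseline={baseline:.3f} adaptive={adaptive:.3f}")
```

Under this simple model, a small increase in coverage failure is outweighed whenever the reduction in blocking is larger, which matches the qualitative argument above.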

VI. CONCLUSION

In this paper, we have studied the DCC problem in next-generation CDMA networks and proposed a model-free reinforcement-learning solution, DCC-RL, to solve it. DCC-RL can dynamically configure cell coverage and capacity based on the varying situations of the system. Simulation results show that pilot and soft handoff power allocations, maximum power constraint design, and the admission control criterion are highly coupled and should be considered jointly. Results also show that DCC-RL significantly increases the system throughput compared with conventional fixed pilot schemes. Furthermore, combining DCC-RL with LPPA gives the advantage of power balancing for soft handoff, so that the system capacity of the DCC-LPPA scheme significantly exceeds that of the conventional FIX-SSDT scheme. The proposed DCC-RL solution gives a design framework suitable not only for next-generation CDMA networks, but also for future cellular systems employing any signaling and multiple access techniques that take advantage of power control.

REFERENCES

[1] S. Sharma, A. R. Nix, and S. Olafsson, “Situation-aware wireless networks,” IEEE Commun. Mag., vol. 41, no. 7, pp. 44–50, Jul. 2003.

[2] J. S. Lee and L. E. Miller, CDMA Systems Engineering Handbook. Norwood, MA: Artech House, 1998, pp. 1111–1186.

[3] J. Laiho, A. Wacker, and T. Novosad, Eds., Radio Network Planning and Optimization for UMTS. New York: Wiley, 2002, pp. 280–290.

[4] W. W. Lu, Broadband Wireless Mobile: 3G and Beyond. New York: Wiley, 2002, pp. 307–315.

[5] V. V. Veeravalli and A. Sendonaris, “The coverage-capacity tradeoff in cellular CDMA systems,” IEEE Trans. Veh. Technol., vol. 48, no. 5, pp. 1443–1450, Sep. 1999.

[6] R. G. Akl, M. V. Hegde, M. Naraghi-Pour, and P. S. Min, “Multicell CDMA network design,” IEEE Trans. Veh. Technol., vol. 50, no. 3, pp. 711–722, May 2001.

[7] S. Sharma and A. R. Nix, “Situation awareness based automatic base station detection and coverage reconfiguration in 3G systems,” in Proc. IEEE PIMRC, Lisbon, Portugal, Sep. 2002, pp. 16–20.

[8] Y. Ishikawa, T. Hayashi, and S. Onoe, “W-CDMA downlink transmit power and cell coverage planning,” IEICE Trans. Commun., vol. E85-B, no. 11, pp. 2416–2426, Nov. 2002.

[9] S. J. Park, D. Kim, and C. Y. Kim, “Optimal power allocation in CDMA forward link using dependency between pilot and traffic channels,” in Proc. Veh. Technol. Conf., Fall, Amsterdam, The Netherlands, Sep. 1999, pp. 223–227.

[10] K. Mori and H. Kobayashi, “Dynamic cell configuration scheme for common channel communications in CDMA cellular packet systems,” in Proc. IEEE ICC, Paris, France, Jun. 2004, pp. 159–163.

[11] G. Hampel, K. L. Clarkson, J. D. Hobby, and P. A. Polakos, “The tradeoff between coverage and capacity in dynamic optimization of 3G cellular networks,” in Proc. IEEE Veh. Technol. Conf., Fall, Orlando, FL, Sep. 2003, pp. 927–932.

[12] R. T. Love, K. A. Beshir, D. Schaeffer, and R. S. Nikides, “A pilot optimization technique for CDMA cellular systems,” in Proc. IEEE Veh. Technol. Conf., Fall, Amsterdam, The Netherlands, Sep. 1999, pp. 2238–2242.

[13] K. Valkealahti, A. Hoglund, J. Parkkinen, and A. Flanagan, “CDMA common pilot power control with cost function minimization,” in Proc. IEEE Veh. Technol. Conf., Fall, Vancouver, BC, Canada, Sep. 2002, pp. 2244–2247.

[14] D. Kim, Y. Chang, and J. W. Lee, “Pilot power control and service coverage support in CDMA mobile systems,” in Proc. IEEE Veh. Technol. Conf., Spring, Amsterdam, The Netherlands, May 1999, pp. 1464–1468.

[15] A. D. Smith, “Designing for coverage availability with different data rates—An improved methodology,” in Proc. IEEE Veh. Technol. Conf., Fall, Vancouver, BC, Canada, Sep. 2002, pp. 1821–1824.

[16] C. Y. Liao, F. Yu, V. C. M. Leung, and C. J. Chang, “Reinforcement-learning-based self-organization for cell configuration in multimedia mobile networks,” Eur. Trans. Telecomm., vol. 16, no. 5, pp. 385–397, Sep. 2005.

[17] A. Hoglund and K. Valkealahti, “Quality-based tuning of cell downlink load target and link power maxima in CDMA,” in Proc. IEEE Veh. Technol. Conf., Fall, Vancouver, BC, Canada, Sep. 2002, pp. 2248–2252.

[18] B. Hashem and E. L. Strat, “On the balancing of the base stations transmitted powers during soft handoff in cellular CDMA systems,” in Proc. IEEE ICC, New Orleans, LA, Jun. 2000, pp. 1497–1501.

[19] J. A. Flanagan and T. Novosad, “CDMA network cost function minimization for soft handover optimization with variable user load,” in Proc. IEEE Veh. Technol. Conf., Fall, Vancouver, BC, Canada, Sep. 2002, pp. 2224–2228.

[20] C. Y. Liao, “Downlink soft handoff mechanisms and cell reconfiguration planning in mixed-size WCDMA cellular networks,” Ph.D. dissertation, National Chiao Tung Univ., Hsinchu, Taiwan, 2004.

[21] C. Y. Liao, L. C. Wang, and C. J. Chang, “Power allocation mechanisms for downlink handoff in the CDMA system with heterogeneous cell structures,” ACM/Kluwer WINET, vol. 11, no. 5, pp. 593–605, Sep. 2005.

[22] H. Furukawa, K. Hamabe, and A. Ushirokawa, “SSDT—Site selection diversity transmission power control for CDMA forward link,” IEEE J. Sel. Areas Commun., vol. 18, no. 8, pp. 1546–1554, Aug. 2000.

[23] R. S. Sutton and A. G. Barto, Introduction to Reinforcement Learning. Cambridge, MA: MIT Press, 1998.

[24] D. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming. Belmont, MA: Athena Scientific, 1996.

[25] 3GPP Tech. Specification 25.942, RF System Scenarios, p. 26, Dec. 1999.

[26] M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming. New York: Wiley, 1994.

[27] R. E. Bellman, Dynamic Programming. Princeton, NJ: Princeton Univ. Press, 1957.

[28] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Mach. Learn., vol. 8, pp. 279–292, 1992.

[29] L. Jouffe, “Fuzzy inference system learning by reinforcement methods,” IEEE Trans. Syst., Man, Cybern., pt. C, vol. 8, no. 3, pp. 338–355, Aug. 1998.

[30] Universal Mobile Telecommunication System (UMTS), “Selection procedures for the choice of radio transmission technologies of the UMTS,” UMTS 30.03, version 3.2.0, TR 101 112, 1998.


Ching-Yu Liao (S’02–M’05) received the Ph.D. degree in communication engineering from the National Chiao Tung University (NCTU), Hsinchu, Taiwan, R.O.C., in 2004.

From 2004 to 2005, she was a Visiting Scholar in the Department of Electrical Engineering, University of British Columbia (UBC), Vancouver, BC, Canada. She is currently a Senior Research Scientist at Telcordia Applied Research Center, Taiwan Company (TARC-TW), Taipei. Her research interests include handoff techniques, radio resource management, heterogeneous wireless networks, etc.

Fei Yu (S’00–M’04) received the Ph.D. degree in electrical engineering from the University of British Columbia (UBC), Vancouver, BC, Canada, in 2003.

From 1998 to 1999, he was a System Engineer at China Telecom, P.R. China, working on the planning, design, and performance analysis of national SS7 and GSM networks. From 2002 to 2004, he was a Research and Development Engineer at Ericsson Mobile Platforms, Sweden, where he worked on dual-mode UMTS/GPRS handsets. He is currently a Research Associate at UBC. His research interests are QoS, cross-layer design, and mobility management in wireless networks.

Victor C. M. Leung (S’75–M’89–SM’97–F’03) received the B.A.Sc. (Honors) and Ph.D. degrees from the University of British Columbia (UBC), Vancouver, BC, Canada, in 1977 and 1981, respectively, both in electrical engineering.

From 1981 to 1987, he was a Senior Member of Technical Staff and Satellite Systems Specialist at MPR Teltech, Ltd. In 1988, he was a Lecturer in Electronics at the Chinese University of Hong Kong. He returned to UBC as a faculty member in 1989, where he is a Professor and holder of the TELUS Mobility Research Chair in Advanced Telecommunications Engineering in the Department of Electrical and Computer Engineering. His research interests are in mobile systems and wireless networks.

Dr. Leung is a member of the Association for Computing Machinery (ACM). He is an Editor of the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, and an Associate Editor of the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY.

Chung-Ju Chang (S’84–M’85–SM’94) was born in Taiwan, R.O.C., in August 1950. He received the B.E. and M.E. degrees in electronics engineering from the National Chiao Tung University (NCTU), Hsinchu, Taiwan, R.O.C., in 1972 and 1976, respectively, and the Ph.D. degree in electrical engineering from the National Taiwan University, Taipei, Taiwan, in 1985. From 1976 to 1988, he was with Telecommunication Laboratories, Ministry of Communications, Taiwan, as a Design Engineer, Supervisor, Project Manager, and then Division Director. In 1988, he joined the Faculty of the Department of Communication Engineering, NCTU, as an Associate Professor. He has been a Professor since 1993. He was Director of the Institute of Communication Engineering from August 1993 to July 1995, Chairman of the Department of Communication Engineering from August 1999 to July 2001, and Dean of the Research and Development Office from August 2003 to July 2004, at NCTU. His research interests include performance evaluation, wireless communication networks, and broadband networks.

Dr. Chang is a member of the Chinese Institute of Engineers (CIE). He serves as an Editor for the IEEE Communications Magazine and an Associate Editor for the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY.