## National Chiao Tung University

## Institute of Communications Engineering

## Doctoral Dissertation

### On Resource Management in OFDMA-based Wireless Networks

### Student: Yuan-Bin Lin

### Advisor: Dr. Yu T. Su



### A Dissertation

### Submitted to the Institute of Communications Engineering

### College of Electrical and Computer Engineering

### National Chiao Tung University

### in Partial Fulfillment of the Requirements

### for the Degree of Doctor of Philosophy

### in

### Communications Engineering

### Hsinchu, Taiwan


### Resource Management in OFDMA-based Wireless Networks

### Student: Yuan-Bin Lin

### Advisor: Dr. Yu T. Su

### Institute of Communications Engineering, National Chiao Tung University

### Abstract (in Chinese)

This dissertation investigates resource management in OFDMA-based wireless networks and proposes optimal and suboptimal algorithms for various spectrum and power allocation problems. Our algorithms adopt a user-removal concept to decouple the dependencies of multiuser, multicarrier allocation. Radio resource allocation involves many user requirements, system parameters, design choices, and practical considerations; its complexity is very high, and in most cases no optimal solution is available. The scenario considered in this dissertation is a cellular system with a single base station and multiple mobile users. We address the following problems. The first is to allocate the available radio resources (transmit power, energy, and subcarriers) using the least total power or energy while satisfying the different rate requirements of different users; taking the distinct channel-gain-to-noise ratios of the channels into account, we propose optimal and suboptimal algorithms. The second problem seeks to maximize the total weighted rate under per-user peak transmit power constraints; we propose two suboptimal spectrum allocation algorithms, one of which exploits the dual decomposition method. The third problem considers user fairness: we propose a suboptimal algorithm for maximizing the product of the users' rates, which lets the system increase the sum rate while maintaining, as far as possible, a certain rate for each user. For the last problem, we propose a low-complexity resource management algorithm for MIMO systems that minimizes the total transmit power while satisfying the users' different rate requirements. In this MIMO system we adopt block diagonalization so that every subcarrier channel can serve multiple users without mutual interference. The developed algorithm also applies dual decomposition to handle user removal and selection; unlike the second problem, each subcarrier may be reserved for multiple users. Exploiting the properties of block diagonalization, the proposed algorithm retains highly efficient users while preserving the orthogonality of their spatial channels. For all of the above resource management problems, we analyze the complexity of the proposed optimal or suboptimal allocation solutions and show by computer simulations that the proposed algorithms all achieve very good performance.

### On Resource Management in OFDMA Based Wireless Networks

Student: Yuan-Bin Lin Advisor: Dr. Yu T. Su

Institute of Communications Engineering, National Chiao Tung University

### Abstract

Algorithms for finding optimal and suboptimal solutions to total-power-minimization or capacity-maximization resource allocation problems in OFDMA-based networks have been studied by many authors. However, the complexities of finding the optimal solution, and of some suboptimal ones, are prohibitively high, and only a few numerical examples for low-dimension cases can be found. On the other hand, low-complexity suboptimal solutions often give unsatisfactory performance.

In this thesis, we propose optimal and suboptimal resource allocation solutions for OFDMA and MIMO-OFDMA wireless networks. Various design criteria and system constraints are considered. The complexities of all the suboptimal solutions are relatively low, while that of the optimal algorithm is only moderately high. We first investigate the problem of transmit power minimization in an OFDMA downlink network subject to the users' rate and BER requirements. We provide near-optimal and optimal solutions based on the dynamic programming and branch-and-bound methodologies, respectively. The second scenario we consider is a weighted sum rate maximization problem, which is solved via dynamic programming and dual decomposition. We then proceed to the product rate maximization scenario and present a suboptimal solution. Finally, we consider a total power minimization problem for MIMO-OFDMA wireless networks and present a low-complexity solution.

The main concepts behind the proposed algorithms can easily be applied to obtain near-optimal, low-complexity solutions to many similar multi-constraint optimization problems.

### Acknowledgement

I would like to express my sincere appreciation to my dissertation advisor, Dr. Yu T. Su, for his enthusiastic guidance, for the latitude he gave me, and for his encouragement. Without his suggestions, support, and kind encouragement, this work could not have been completed successfully. Thanks to the members of Lab 811, especially Tai-Hsiang Chiu, and to all of my friends for their kind help. Finally, I would like to thank my family, my parents and my wife, for their encouragement and endless love.

### Contents

Chinese Abstract i

English Abstract ii

Acknowledgements iv

Contents v

List of Figures viii

List of Tables x

1 Introduction 1

2 Review of Some Optimization Methodologies 5

2.1 Introduction to Dynamic Programming . . . 5

2.1.1 The Basic Problem . . . 5

2.1.2 The Dynamic Programming Algorithm . . . 7

2.2 The Branch and Bound Principle . . . 8

2.2.1 Terminology and General description . . . 10

2.2.2 Bounding Function . . . 10

2.2.3 Branching Rule . . . 11

2.2.4 Strategies for Selecting Next Subproblem . . . 12

2.3 Dual Decomposition method for Non-convex Optimization . . . 14

2.4 Efficient Suboptimal Non-convex Optimization via Dual Decomposition . 16

3 Power Minimization Resource Allocation Algorithms 19

3.1 Assumptions and Problem Formulation . . . 19

3.1.1 Basic Assumptions . . . 19

3.1.2 Problem Formulation . . . 20

3.2 Dynamic Programming based Near-optimal Resource Allocation . . . 21

3.3 An Optimal Resource Allocation Algorithm . . . 27

3.3.1 A Branch-and-bound Approach . . . 27

3.3.2 Sorting The Serving Channels . . . 31

3.3.3 Complexity Reduction Techniques . . . 32

3.3.4 Application to Integer Constellation Systems . . . 34

3.4 Numerical Results and Algorithmic Complexity . . . 35

3.4.1 Relative Efficiency Performance . . . 35

Performance of BBRA and DPRA Algorithms . . . 35

Performance of Representative Sub-optimal Algorithms . . . 38

3.4.2 Complexity Evaluation . . . 39

Average Complexities of the Proposed Algorithms . . . 39

Complexities of Other Sub-optimal Algorithms . . . 40

3.5 Chapter Summary . . . 41

4 Capacity Maximization Resource Algorithms 46

4.1 System Description . . . 46

4.2 Resource Allocation Algorithm for Weighted Sum Rate Maximization . . 47

4.2.1 A Dynamic Programming Based Algorithm . . . 48

4.2.2 A Low Complexity Algorithm via Dual Decomposition . . . 49

4.2.3 Numerical Results and Discussions . . . 51

4.3 Resource Allocation Algorithms for Product Rate Maximization . . . 55

4.3.1 Nash Bargaining Solution and Problem Formulation . . . 56

4.3.2 DP Based Algorithm for Product Rate Maximization . . . 58

4.3.3 Numerical Results and Discussions . . . 58

5 Resource Allocation for MIMO-OFDMA Downlinks 61

5.1 System Description . . . 63

5.2 The Block Diagonalization Approach . . . 65

5.3 Problem Formulation and the Optimal Solution . . . 66

5.4 A Low-complexity Power-Minimization Solution . . . 68

5.4.1 Correlation Based User Selection Method via Dual Decomposition . . . 71

5.5 Numerical Results and Discussion . . . 72

6 Conclusion 77

A An optimal mono-rate power allocation algorithm 79

Bibliography 84

### List of Figures

2.1 Illustration of the search space for a B&B procedure. . . 9

2.2 The relation between the bounding function g and the objective function f on the sets S and G of feasible and potential solutions of a problem. . . 10

2.3 Search strategies in B&B: the Best-First Search. . . 12

2.4 Search strategies in B&B: the Breadth-First Search. . . 13

2.5 Search strategies in B&B: the Depth-First Search. . . 14

3.1 A complete search tree for multiuser channel allocation. For the DPRA algorithm, only one child node survives at each level. . . 41

3.2 The DPRA algorithm's probability of correct convergence (i.e., the probability of obtaining the optimum subcarrier/power/rate allocation) in an OFDMA downlink. . . 42

3.3 Average relative efficiency (η) performance of the DPRA algorithm. . . 43

3.4 Average relative efficiency of the DPRA, MDPRA, and MRCG algorithms; Rj = 5 bps/Hz for all j, IF = independent fading, CF = correlated fading. . . 44

3.5 Average complexities (numbers of calls to the OMPA algorithm) for the BBRA and DPRA algorithms in a 64-subcarrier OFDMA system. . . 44

3.6 Average complexities (numbers of calls to the OMPA algorithm) for the BBRA and DPRA algorithms in a 128-subcarrier OFDMA system. . . 45

3.7 Average number of power (or rate) level evaluation iterations for various dynamic RA algorithms; Rj = 5 bps/Hz for all j, IF = independent fading, CF = correlated fading. . . 45

4.1 Average sum rate performance as a function of the number of users; P = 60, N = 64, GNR = 0 dB. . . 53

4.2 Average fairness index performance as a function of the number of users; P = 60, N = 64, GNR = 0 dB. . . 54

4.3 System throughput improvement versus number of users in a correlated fading channel; N = 128, GNR = 0 dB. . . 55

4.4 Fairness index performance as a function of the number of users; N = 64, P = 60, GNR = 0 dB. . . 59

4.5 Fairness index versus number of users (N = 128, GNR = 0 dB) for correlated and independent fading channels. . . 60

4.6 Sum rate performance versus number of users in a correlated channel; N = 128, GNR = 0 dB. . . 60

5.1 Block diagram of a multiuser MIMO system. . . 64

5.2 Average power per user for the case of 8 subcarriers, 8 users; GNR = 0 dB, NT = 4, nr = 1, and R = 10, 20, respectively. . . 74

5.3 Average power per user for K = 32, N = 64, R = 20, nr = 2, GNR = 0 dB, and NT = 8, 16, respectively. . . 75

5.4 Average power per user for K = 32, N = 64, R = 20, nr = 2, GNR = -10 dB, and NT = 8, 16, respectively. . . 76

### List of Tables

2.1 An efficient suboptimal optimization method for non-convex optimization via dual decomposition. . . 18

3.1 A dynamic programming based resource allocation (DPRA) algorithm . . 26

3.2 A branch and bound based resource allocation (BBRA) algorithm . . . . 33

3.3 The effects of (1) channel-sorting in Table 3.4, (2) Guideline 1, (3) Property 3/Guideline 4, and (4) Guideline 5 on the computing-complexity reduction of the BBRA algorithm; 10^6 runs are performed to obtain the statistics. The complexity is measured in terms of the number of calls n_op to the OMPA algorithm. The complexity of the DPRA algorithm is also included for comparison purposes. . . 33

3.4 The channel-sorting algorithm . . . 37

4.1 A resource allocation algorithm for weight sum rate maximization. . . 52

5.1 A joint channel assignment and user selection algorithm . . . 73

### Chapter 1

### Introduction

As the demand for high-data-rate multimedia wireless communications increases, it becomes more and more important to take the energy/spectral efficiency factor into account when designing an anti-fading transmission scheme for mobile terminals. A fast and proper adaptive algorithm for allocating both physical and MAC layer resources is essential to provide high-quality, high-rate multiuser transmissions. Because of its robustness against frequency-selective fading and its flexibility in apportioning the transmission resources, the OFDM-based frequency division multiple access (OFDMA) scheme, in which each user is allocated a collection of time slots and subcarriers for transmission, has been adopted in several industrial wireless communication standards. If the allocation is predetermined and static, there may be unused subcarriers and time slots when the designated users do not need so many signal dimensions.

When limited power and multiple orthogonal channels are available for transmitting multiuser/multimedia signals, a proper channel and power allocation scheme is needed to minimize the average power consumption and co-channel interference while meeting various users' and media's rate requirements and maintaining the link quality. For an OFDMA system, this problem is complicated by the fact that a subcarrier (channel)¹ that is bad (in a deep fade, with a low channel signal-to-noise ratio (SNR)) for one user may be good (with a high channel SNR) for another user. In [1], the authors proposed a suboptimal multiuser subcarrier/bit allocation scheme which minimizes the total transmit power subject to rate constraints. They relaxed the discrete-(integer-)rate constraint by allowing time-sharing use of a subcarrier by multiple users, an idealized assumption that was subsequently adopted by many investigators. [4] considered a continuous-rate version of the same problem but forbade the multiple-users-per-subcarrier scenario and suggested a method for computing the optimal solution. Obtaining the exact optimal solution to either problem requires high computing complexity and becomes impractical for large numbers of channels and/or users. Many works have studied variations and extensions of [1] or [4]; each deals with a different objective function (maximizing weighted sum rate [9], utility [12]), constraint (fairness [7], proportional rates [8]), or scenario (e.g., multi-cell [10], relay-aided [7]). A survey of various dynamic resource allocation (RA) solutions was recently reported in [6].

¹We shall use the terms subcarrier and channel interchangeably throughout this dissertation.

In Chapter 3 we present two efficient algorithms for solving the problems of [1] and [4], i.e., efficient subcarrier, power, and rate assignment schemes that satisfy multiuser, multimedia requirements with the minimum total transmitted power. The first algorithm uses a dynamic programming (DP) approach; it is simple and offers near-optimal performance. The second algorithm invokes the branch-and-bound (B&B) principle, using a good initial bound and tight lower bounds along with some complexity-reduction techniques; it gives the optimal solution with only a moderate increase in complexity. Our discourse concentrates on the continuous-rate case, but both algorithms can be used for the discrete-rate case with a minor modification (see Section 3.3.4). It is not difficult to see that, through suitable modifications, our algorithms can also be applied to a similar RA problem: maximizing the aggregate throughput or weighted sum rate under individual power constraints. Chapter 3 first describes the operation scenarios of concern and formulates the optimization problem. It then presents the proposed DP-based resource allocation algorithm and the B&B-based approach, where we also derive some useful properties and suggest design guidelines. Numerical performance of the proposed algorithms and of some existing suboptimal algorithms is presented next, followed by a chapter summary. An optimal mono-rate (single-user) power allocation (OMPA) algorithm is derived in Appendix A.

Most previous approaches show how to efficiently maximize the total transmission rate or minimize the total transmitted power under the relevant system and user constraints. Although minimizing the total transmitted power under all users' constraints can protect the benefits of users far away from the base station, most formulated problems and the corresponding solutions focus on efficiency, so the users closer to the base station, or those with higher power capability, obtain most of the resources/benefits. The fairness issue in resource allocation is addressed in Chapter 4. First, we discuss the problem of maximizing the weighted sum rate under uplink users' power constraints. A similar weighted downlink sum rate maximization problem with a total power constraint has been investigated and the corresponding optimal solution is known [4]. The problem of maximizing the ergodic rates was discussed in [14].

We propose two efficient suboptimal resource allocation algorithms for the weighted sum rate maximization problem. The first is a modification of the DPRA algorithm, replacing the original cost function and the OMPA algorithm. The second exploits the dual decomposition method of convex optimization theory; the resulting algorithm requires lower complexity but suffers minor performance degradation. In the second part of Chapter 4, proportional fairness is considered. The distributed users can negotiate via the BS to make their decisions on subcarrier usage cooperatively, so that joint agreements among all users are reached. This kind of cooperative problem motivates us to apply game theory; cooperative game theory in particular can achieve fairness while maximizing the overall system rate [15], [16], [17].

Fair-rate allocation for classical OFDMA systems based on the Nash bargaining solution (NBS) has recently been considered in [16]. As the proposed solution was too computationally intensive, the number of users was limited to 8 or fewer. Our DP-based algorithm is much simpler while its performance is superior. The corresponding optimal solution can still be obtained by the BBRA approach with minor modifications.

Finally, we extend our investigation to MIMO-OFDMA systems, focusing on total transmitted power minimization under the users' BER and rate requirements. The multiple antennas at the base station enable spatial multiplexing of transmissions to multiple users on the same subcarrier at the same time. Our precoding scheme is based on the block diagonalization method: each user's precoding matrix is designed so that the transmitted signal of that user lies in the null space of all remaining users' channels, and multiuser interference is thus pre-eliminated. The resource allocation problem in a MIMO-OFDMA system using block diagonalization then becomes that of selecting, for every subcarrier, the users who can share that subcarrier. [19] discussed a similar power minimization problem without user selection over a single carrier. [21] proposed a user selection scheme based on the users' channel conditions and correlations for MISO systems. Both considered only the single-carrier case, whence frequency assignment is not needed. The purpose of Chapter 5 is to present a dual-decomposition-based low-complexity suboptimal solution which employs a correlation-based user selection scheme to simultaneously complete the tasks of user selection and subcarrier assignment in a MIMO-OFDMA system.

### Chapter 2

### Review of Some Optimization

### Methodologies

### 2.1

### Introduction to Dynamic Programming

The term dynamic programming (DP) was coined by Bellman [22] to describe the techniques he brought together to study a class of optimization problems involving sequences of decisions. There have been many applications and further developments since its inception. In this thesis, we focus on situations where decisions are made in stages.

### 2.1.1

### The Basic Problem

We now formulate a general multi-stage decision problem under stochastic uncertainty. This problem, which is called the basic problem, is very general. In particular, it is not necessary to require that the state, control, or random parameter take a finite number of values or belong to a space of n-dimensional vectors. An attractive aspect of dynamic programming is that its applicability depends very little on the nature of the state, control, and random parameter spaces. For this reason, it is convenient to proceed without any assumptions on the structure of these spaces.

We are given a discrete-time dynamic system

$$x_{k+1} = f_k(x_k, u_k, w_k), \qquad k = 0, 1, \ldots, N-1 \tag{2.1}$$

where

- $k$: the index of discrete time;
- $x_k$: the state of the system, summarizing the past information relevant for future optimization, $x_k \in S_k$;
- $u_k$: the control or decision variable to be selected at time $k$, $u_k \in C_k$;
- $w_k$: a random parameter;
- $N$: the horizon, i.e., the number of times control is applied;
- $f_k$: a function that describes the system and, in particular, the mechanism by which the state is updated.

The control variable $u_k$ is constrained to take values in a given nonempty subset $U_k(x_k) \subset C_k$, which depends on the current state $x_k$; that is, $u_k \in U_k(x_k)$ for all $x_k \in S_k$ and $k$. We consider the class of policies (also called control laws) that consist of a sequence of functions

$$\pi = \{\mu_0, \ldots, \mu_{N-1}\} \tag{2.2}$$

where $\mu_k$ maps the state $x_k$ into the control $u_k = \mu_k(x_k)$ and is such that $\mu_k(x_k) \in U_k(x_k)$ for all $x_k \in S_k$. Such policies are called admissible.

Given an initial state $x_0$ and an admissible policy $\pi = \{\mu_0, \ldots, \mu_{N-1}\}$, the states $x_k$ and disturbances $w_k$ are random variables with distributions defined through the system equation

$$x_{k+1} = f_k(x_k, \mu_k(x_k), w_k), \qquad k = 0, 1, \ldots, N-1 \tag{2.3}$$

Thus, for given cost functions $g_k$, $k = 0, 1, \ldots, N$, the expected cost of $\pi$ starting at $x_0$ is

$$J_\pi(x_0) = E\left\{ g_N(x_N) + \sum_{k=0}^{N-1} g_k(x_k, \mu_k(x_k), w_k) \right\} \tag{2.4}$$

where the expectation is taken over the random variables $w_k$ and $x_k$. An optimal policy $\pi^*$ is one that minimizes this cost; that is,

$$J_{\pi^*}(x_0) = \min_{\pi \in \Pi} J_\pi(x_0) \tag{2.5}$$

where $\Pi$ is the set of all admissible policies.

### 2.1.2

### The Dynamic Programming Algorithm

The dynamic programming technique is built upon a very simple idea, the principle of optimality, which can be formally stated as follows.

**Principle of Optimality.** Let $\pi^* = \{\mu_0^*, \mu_1^*, \ldots, \mu_{N-1}^*\}$ be an optimal policy for the basic problem, and assume that when using $\pi^*$ a given state $x_i$ occurs at time $i$ with positive probability. Consider the subproblem whereby we are at $x_i$ at time $i$ and wish to minimize the "cost-to-go" from time $i$ to time $N$,

$$E\left\{ g_N(x_N) + \sum_{k=i}^{N-1} g_k(x_k, \mu_k(x_k), w_k) \right\} \tag{2.6}$$

Then the truncated policy $\{\mu_i^*, \mu_{i+1}^*, \ldots, \mu_{N-1}^*\}$ is optimal for this subproblem.

The principle of optimality suggests that an optimal policy can be constructed in piecemeal fashion: first construct an optimal policy for the "tail problem" involving the last stage, then extend it to the "tail problem" involving the last two stages, and continue in this manner until an optimal policy for the entire problem is constructed. The dynamic programming algorithm is based on this idea: it proceeds sequentially, solving all the tail subproblems of a given (time) length.

We now state the dynamic programming algorithm for the basic problem.

**The Dynamic Programming Algorithm.** For every initial state $x_0$, the optimal cost $J_{\pi^*}(x_0)$ of the basic problem is equal to $J_0(x_0)$, given by the last step of the following recursion, which proceeds backward in time from stage $N-1$ to stage $0$:

$$J_N(x_N) = g_N(x_N),$$
$$J_k(x_k) = \min_{u_k \in U_k(x_k)} E_{w_k}\left\{ g_k(x_k, u_k, w_k) + J_{k+1}\big(f_k(x_k, u_k, w_k)\big) \right\}, \qquad k = N-1, \ldots, 0 \tag{2.7}$$

where the expectation is taken with respect to $w_k$, whose distribution depends on $x_k$ and $u_k$. Furthermore, if $u_k^* = \mu_k^*(x_k)$ minimizes the right-hand side of (2.7) for each $x_k$ and $k$, the policy $\pi^* = \{\mu_0^*, \mu_1^*, \ldots, \mu_{N-1}^*\}$ is optimal.

The term "dynamic programming," coined by Richard Bellman, originally referred to the process of solving problems where one needs to find the best decisions one after another. It was later refined to refer to the general approach of nesting smaller decision problems inside larger decisions. Equivalently, DP simplifies a complicated problem by breaking it down into simpler subproblems in a recursive manner. Note that a dynamic programming algorithm is capable of obtaining the optimal solution only if the "cost-to-go" can be decomposed into the recursive form of (2.6).
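As a concrete illustration of the recursion (2.7), the following minimal sketch solves a deterministic finite-state toy instance (the random parameter $w_k$ is dropped, so the expectation disappears). The particular states, dynamics, and costs below are illustrative assumptions, not quantities from this thesis.

```python
# Backward DP sketch of (2.7) for a deterministic finite-state problem:
# J[N][x] = g_N(x), then J[k][x] = min_u { g_k(x,u) + J[k+1][f_k(x,u)] }.
def dp_solve(states, controls, f, g, g_terminal, N):
    """Return cost-to-go tables J[k][x] and an optimal policy mu[k][x]."""
    J = [dict() for _ in range(N + 1)]
    mu = [dict() for _ in range(N)]
    for x in states:                        # terminal cost: J_N = g_N
        J[N][x] = g_terminal(x)
    for k in range(N - 1, -1, -1):          # proceed backward from N-1 to 0
        for x in states:
            best_u, best_cost = None, float("inf")
            for u in controls(x):           # admissible controls U_k(x)
                cost = g(k, x, u) + J[k + 1][f(k, x, u)]
                if cost < best_cost:
                    best_u, best_cost = u, cost
            J[k][x] = best_cost
            mu[k][x] = best_u
    return J, mu

# Toy instance (assumed data): walk on integers 0..4 for N = 3 stages;
# control u in {-1, 0, +1}, stage cost |u|, terminal cost = distance to 3.
controls = lambda x: [u for u in (-1, 0, 1) if 0 <= x + u <= 4]
f = lambda k, x, u: x + u
g = lambda k, x, u: abs(u)
g_term = lambda x: abs(x - 3)
J, mu = dp_solve(list(range(5)), controls, f, g, g_term, N=3)
print(J[0][0])   # → 3
```

Here `J[k][x]` plays the role of the cost-to-go $J_k(x_k)$, and `mu[k][x]` is the resulting admissible policy $\mu_k$.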

### 2.2

### The Branch and Bound Principle

Solving an NP-hard discrete optimization problem is often an immense job requiring very efficient algorithms, and the branch and bound (B&B) paradigm is one of the main tools for constructing them. A B&B method searches for the best solution in the complete solution space of a given problem. However, explicit enumeration is normally impossible due to the exponentially increasing number of potential solutions. The use of bounds on the function to be optimized, combined with the value of the current best solution, enables the algorithm to search parts of the solution space only implicitly.

At any point during the solution process, the status of the search is described by a pool of yet-unexplored subsets of the solution space and the best solution found so far. Initially only one subset exists, namely the complete solution space, and the value of the best solution found so far is ∞. The unexplored subspaces are represented as nodes in a dynamically generated search tree, which initially contains only the root; each iteration of a classical B&B algorithm processes one such node. The iteration has three main components: selection of the node to process, bound calculation, and branching. In Fig. 2.1, the initial situation and the first step of the process are illustrated.

The sequence of these components may vary according to the strategy chosen for selecting the next node to process. If the selection of the next subproblem is based on the bound values of the subproblems, then the first operation of an iteration after choosing the node is branching. For each of the generated subspaces, it is checked whether the subspace consists of a single solution, in which case it is compared to the current best solution and the better of the two is kept. Otherwise the bounding function for the subspace is calculated and compared to the current best solution; if the subspace cannot contain the optimal solution, the whole subspace is discarded. The search terminates when no unexplored parts of the solution space remain, and the optimal solution is then the one recorded as the "current best."

Figure 2.1: Illustration of the search space for a B&B procedure.

### 2.2.1

### Terminology and General description

In this subsection we consider minimization problems; maximization problems can be dealt with similarly. The problem is to minimize a function $f(x)$ of the variables $(x_1, \ldots, x_n)$ over a region $S$ of feasible solutions:

$$\min_{x \in S} f(x)$$

The function $f$ is called the objective function and may be of any type. The set of feasible solutions is usually determined by general conditions on the variables, e.g., that they must be non-negative integers or binary, together with special constraints determining the structure of the feasible set. In many cases there is also a set of potential solutions, $G$, containing $S$, on which $f$ is still well defined, and a function $g(x)$, often defined on $G$ (or $S$), with the property that $g(x) \le f(x)$ for all $x$ in $S$ arises naturally. Both $S$ and $G$ are very useful in the B&B context. Fig. 2.2 illustrates the situation where $S$ and $G$ are intervals of real numbers.


Figure 2.2: The relation between the bounding function g and the objective function f on the sets S and G of feasible and potential solutions of a problem.

### 2.2.2

### Bounding Function

The bounding function is the key component of any B&B algorithm, in the sense that no choice of branching and selection strategies can compensate for a low-quality bounding function. Ideally the value of the bounding function for a given subproblem should equal the value of the best feasible solution of that subproblem, but since obtaining this value is usually itself NP-hard, the goal is to come as close as possible using only a limited amount of computational effort. A bounding function is called strong if it generally gives values close to the optimal value of the subproblem bounded, and weak if the values produced are far from the optimum. One often experiences a trade-off between quality and time when dealing with bounding functions: the more time spent on calculating the bound, the better the bound value usually is. It is normally considered beneficial to use as strong a bounding function as possible in order to keep the size of the search tree as small as possible.

Bounding functions arise naturally in connection with the set of potential solutions $G$ and the function $g$ mentioned above. Since $S \subseteq G$ and $g(x) \le f(x)$ on $G$, the following is easily seen to hold:

$$\min_{x \in G} g(x) \;\le\; \left\{ \min_{x \in G} f(x),\ \min_{x \in S} g(x) \right\} \;\le\; \min_{x \in S} f(x) \tag{2.8}$$

If both $G$ and $g$ exist, one may therefore choose among three optimization problems, each of whose optimal solutions provides a lower bound on the given objective function. The "skill" here is of course to choose $G$ and/or $g$ so that one of these problems is easy to solve while providing tight bounds.

### 2.2.3

### Branching Rule

All branching rules in the context of B&B can be seen as subdivisions of a part of the search space through the addition of constraints, often in the form of assigning values to variables. Convergence of B&B is ensured if the size of each generated subproblem is smaller than that of the original problem and the number of feasible solutions of the original problem is finite. Normally, the subproblems generated are disjoint; in this way the problem of the same feasible solution appearing in different subspaces of the search tree is avoided.

### 2.2.4

### Strategies for Selecting Next Subproblem

The strategy for selecting the next live subproblem to investigate usually reflects a trade-off between keeping the number of explored nodes in the search tree low and staying within the memory capacity of the computer used.

If one always selects, among the live subproblems, one with the lowest bound, the strategy is called best-first search (BeFS). Fig. 2.3 shows a small search tree; the numbers in the nodes correspond to the processing sequence. A subproblem $P$ is called critical if the bounding function applied to $P$ results in a value strictly less than the optimal solution of the problem in question. Nodes of the search tree corresponding to critical subproblems have to be partitioned by the B&B algorithm no matter when the optimal solution is identified; they can never be discarded by means of the bounding function. Since the lower bound of any subspace containing an optimal solution must be less than or equal to the optimum value, only nodes of the search tree with lower bounds less than or equal to this value will be explored.

Figure 2.3: Search strategies in B&B: the Best-First Search.

Even though the choice of the subproblem with the currently lowest lower bound makes good sense with regard to producing a good feasible solution, memory problems arise if the number of critical subproblems becomes too large. The situation then more or less corresponds to a breadth-first search (BFS) strategy, in which all nodes at one level of the search tree are processed before any node at the next level. Fig. 2.4 shows the search tree with the numbers in the nodes corresponding to the BFS processing sequence. The number of nodes at each level grows exponentially with the level, making breadth-first search infeasible for larger problems.
Figure 2.4: Search strategies in B&B: the Breadth-First Search.

The alternative is a depth-first search (DFS) strategy, in which a live node at the largest level of the search tree is chosen for exploration. Fig. 2.5 shows the DFS processing sequence numbers of the nodes. The memory requirement, in terms of the number of subproblems stored at the same time, is now bounded above by the number of levels of the search tree multiplied by the maximum number of children of any node, which is usually quite manageable. An advantage from the programming point of view is that the tree can be searched by recursion; this allows the information about the current subproblem to be stored incrementally, so only the constraints added in connection with the creation of each subproblem need be stored. The drawback is that if the incumbent is far from the optimal solution, large amounts of unnecessary bounding computations may take place. To avoid this, DFS is often combined with a selection strategy that explores the node with the smallest lower bound first, hopefully leading quickly to a good feasible solution.

Figure 2.5: Search strategies in B&B: the Depth-First Search.
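The ingredients above (a bounding function, a branching rule, and a best-first selection strategy) can be sketched on a toy minimization problem: choose a subset of items whose total weight reaches a threshold at minimum total cost. The bounding function below relaxes the remaining 0/1 decisions to fractional ones, so it never exceeds the cost of the best feasible completion; the item data are illustrative assumptions, not taken from this thesis.

```python
import heapq

# Toy covering problem (assumed data): pick items with total weight >= W_min
# at minimum total cost. Branching fixes one item per level; the bound is a
# fractional (LP-style) completion, hence a valid lower bound g <= f.
costs   = [5, 4, 3, 6]
weights = [4, 3, 2, 5]
W_min   = 7
n = len(costs)
order = sorted(range(n), key=lambda i: costs[i] / weights[i])  # cheapest per unit weight first

def frac_bound(fixed_cost, fixed_w, level):
    """Lower bound: greedily cover the remaining weight with fractional items."""
    need, b = W_min - fixed_w, fixed_cost
    for i in order:
        if i < level or need <= 0:        # items before `level` are already fixed
            continue
        take = min(1.0, need / weights[i])
        b += take * costs[i]
        need -= take * weights[i]
    return b if need <= 0 else float("inf")

best_cost, best_set = float("inf"), None
heap = [(frac_bound(0, 0, 0), 0, 0, 0, ())]   # (bound, level, cost, weight, chosen)
while heap:
    bound, level, cost, w, chosen = heapq.heappop(heap)
    if bound >= best_cost:                # cannot beat the incumbent: prune
        continue
    if w >= W_min:                        # feasible node: update the incumbent
        best_cost, best_set = cost, chosen
        continue
    if level == n:
        continue
    for pick in (1, 0):                   # branch: take or skip item `level`
        c2 = cost + pick * costs[level]
        w2 = w + pick * weights[level]
        b2 = frac_bound(c2, w2, level + 1)
        if b2 < best_cost:
            heapq.heappush(heap, (b2, level + 1, c2, w2,
                                  chosen + ((level,) if pick else ())))
print(best_cost)
```

Nodes are pruned exactly as described in Section 2.2: a subproblem is discarded whenever its lower bound is no better than the incumbent, and the best-first order is enforced by the priority queue.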

### 2.3

### Dual Decomposition Method for Non-convex Optimization

We consider a non-convex optimization problem defined over the domain D, the set of all non-negative real x_j(i), j = 1, ..., K, i = 1, ..., N, such that for each i at most one x_j(i) is positive:

$$\max_{\{x_j\}} \sum_{i=1}^{N}\sum_{j=1}^{K} f_{ij}(x_j(i)) = \sum_{i=1}^{N}\sum_{j=1}^{K} \log_2\left(1 + x_j(i)a_{j,i}\right)$$

$$\text{s.t. } \sum_{i=1}^{N} x_j(i) \le P_j, \quad \text{where } x_j = \{x_j(1)\, x_j(2) \cdots x_j(N)\} \qquad (2.9)$$

Then the Lagrangian of the above problem can be represented as

$$L(\{x_j(i)\}, \Lambda) = \sum_{j=1}^{K}\sum_{i=1}^{N} f_{ij}(x_j(i)) - \sum_{j=1}^{K} \lambda_j\left(\sum_{i=1}^{N} x_j(i) - P_j\right) \qquad (2.10)$$

where Λ = (λ_1, λ_2, ..., λ_K). The Lagrange dual function is given by

$$g(\Lambda) = \max_{\{x_j\}\in D} L(\{x_j(i)\}, \Lambda) \qquad (2.11)$$

The maximization of L can be decomposed into N independent optimization problems given by

$$g_i'(\Lambda) = \max_{\{x_j\}\in D} \left\{\sum_{j=1}^{K} f_{ij}(x_j(i)) - \sum_{j=1}^{K} \lambda_j x_j(i)\right\}, \quad i = 1, \cdots, N \qquad (2.12)$$

The Lagrange dual function can then be reformulated as

$$g(\Lambda) = \sum_{i=1}^{N} g_i'(\Lambda) + \sum_{j=1}^{K} \lambda_j P_j. \qquad (2.13)$$

With a fixed Λ, the argument on the right-hand side of (2.12) becomes a concave function of x_j(i). As a result, we can set its derivative with respect to x_j(i) to zero and obtain the g_i'(Λ)-maximizing solution

$$x_j(i) = \left(\lambda_j' - \frac{1}{a_{j,i}}\right)^{+}, \qquad (2.14)$$

where λ_j' = 1/(λ_j ln 2) and (t)^+ := max(0, t).
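The closed form (2.14) is a per-tone water-filling rule and is easy to evaluate; the sketch below (function and variable names are ours) can be checked numerically against the per-tone objective of (2.12):

```python
import math

def power_on_tone(lam_j, a_ji):
    """Closed-form maximizer of log2(1 + x*a) - lam*x over x >= 0,
    i.e. eq. (2.14): x = (1/(lam * ln 2) - 1/a)^+."""
    lam_prime = 1.0 / (lam_j * math.log(2))   # lambda'_j in the text
    return max(0.0, lam_prime - 1.0 / a_ji)
```

When the GNR a_{j,i} is poor relative to the multiplier (1/a_{j,i} exceeds λ_j'), the rule allocates zero power to that tone, as (2.14) requires.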

Since for each i = 1, 2, ..., N only one x_j(i) can be positive, we search over all K possible user assignments for each i and decide that x_{j°}(i) > 0, where j° and g_i'(Λ) are

$$j^{\circ} = \arg\max_{1\le j\le K}\left[\log_2(1 + x_j(i)a_{j,i}) - \lambda_j\left(\lambda_j' - \frac{1}{a_{j,i}}\right)^{+}\right]$$

$$g_i'(\Lambda) = \max_{1\le j\le K}\left[\log_2(1 + x_j(i)a_{j,i}) - \lambda_j\left(\lambda_j' - \frac{1}{a_{j,i}}\right)^{+}\right] \qquad (2.15)$$

We then need to adjust {λ_j} to meet the constraints Σ_{i=1}^{N} x_j(i) = P_j. Even if the constraints are satisfied, there is no guarantee that the solution is optimal unless a convergence criterion is in place.
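The text does not fix a particular rule for adjusting {λ_j}; a projected subgradient step is one common choice, and the following sketch (with illustrative step size and iteration count of our own choosing) combines it with the per-tone selection of (2.14)-(2.15):

```python
import math

def dual_iteration(a, P, steps=500, lr=0.05):
    """Subgradient sketch for the dual of (2.9): per-tone exclusive user
    selection via (2.14)-(2.15), then a multiplier step toward the power
    constraints.  'a' is a K x N list of GNRs; all numbers are illustrative."""
    K, N = len(a), len(a[0])
    lam = [1.0] * K
    x = None
    for _ in range(steps):
        x = [[0.0] * N for _ in range(K)]
        for i in range(N):
            best_j, best_val = None, 0.0
            for j in range(K):
                # eq. (2.14): candidate power for user j on tone i
                p = max(0.0, 1.0 / (lam[j] * math.log(2)) - 1.0 / a[j][i])
                val = math.log2(1 + p * a[j][i]) - lam[j] * p  # eq. (2.15)
                if val > best_val:
                    best_j, best_val, best_p = j, val, p
            if best_j is not None:
                x[best_j][i] = best_p       # only one user per tone
        for j in range(K):                   # step toward sum_i x_j(i) = P_j
            used = sum(x[j])
            lam[j] = max(1e-6, lam[j] + lr * (used - P[j]))
    return x
```

With a constant step size the multipliers converge only to a neighborhood of the dual optimum, which mirrors the remark above that meeting the constraints alone does not certify optimality.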

### 2.4

### Efficient Suboptimal Non-convex Optimization via Dual Decomposition

For the dual decomposition method just discussed, the process of obtaining the optimal solution consists of the following main steps. Step 1: fix a multiplier vector {λ_j}. Step 2: for each 1 ≤ i ≤ N, decide the index j* whose x_{j*}(i) attains the maximum in (2.15); all other values {x_j(i), j ≠ j*} are forced to zero. Step 3: based on the result obtained in Step 2, check whether the current multiplier vector is optimal, i.e., whether the constraints in (2.9) are met. If the answer is negative, a new multiplier vector must be searched for.

The main ingredients of the above method are (i) initially relaxing the K power constraints in (2.9) and applying the dual decomposition approach to obtain a locally optimal solution that respects the other N constraints, namely that only one x_j(i) is positive among 1 ≤ j ≤ K for each i = 1, 2, ..., N, and (ii) finding the multiplier vector used in the dual decomposition approach so that the constraints in (2.9) are met. The relation between the decision of which x_j(i) can be positive and a given multiplier vector is not obvious, so the search for the optimal multiplier is complicated. Based on these observations, we propose an efficient suboptimal algorithm via dual decomposition. The main idea is to relax the N exclusivity constraints, each of which states that only one x_j(i) > 0 for 1 ≤ j ≤ K, instead of relaxing the K power constraints {Σ_{i=1}^{N} x_j(i) ≤ P_j, 1 ≤ j ≤ K}. In other words, we extend the domain D so that x_j(i) can be positive for all 1 ≤ j ≤ K, 1 ≤ i ≤ N.

We then run a finite number of iterations, at most N, each of which re-imposes one of the previously relaxed constraints. Within each iteration, we compute a multiplier vector under the constraints {Σ_{i=1}^{N} x_j(i) ≤ P_j, 1 ≤ j ≤ K} and try to satisfy one of the constraints relaxed before the current iteration. In detail, we apply this multiplier vector in the dual decomposition and compute the efficiency value for 1 ≤ i ≤ N and 1 ≤ j ≤ K as

$$\nu_{j,i} = f_{ij}(x_j(i)) - \lambda_j x_j(i) \qquad (2.16)$$

In addition, within each iteration we must decide which of the previously relaxed constraints to re-impose. We take the sum of the efficiency values over all 1 ≤ j ≤ K for each relaxed constraint and select the index i* whose summed efficiency value ν_i* is largest, i.e.,

$$i^{*} = \arg\max_{i\in S} \nu_i^{*} = \arg\max_{i\in S} \sum_{j=1}^{K} \nu_{j,i} \qquad (2.17)$$

where S denotes the set of indices of the relaxed constraints. We decide which x_j(i*) can be positive in the same way as in the optimal dual decomposition method, via (2.15). The difference from the optimal method is that our algorithm applies (2.15) only to i = i*, not to every i = 1, 2, ..., N. In addition, the constraint for i = i* is not revisited. As a result, after N iterations the final solution meets all the constraints. The detailed procedure is given in the following table.

Step 1: (Extend the domain D by relaxing the constraints that only one x_j(i) is positive among 1 ≤ j ≤ K, for all 1 ≤ i ≤ N.)
    Set C(j) = {i | 1 ≤ i ≤ N} for 1 ≤ j ≤ K, where m ∈ C(j) means that x_j(m) can be positive.
    Set S = {1, 2, ..., N} and t = 0.
Step 2: (Select one relaxed constraint to re-impose.)
    Based on C(j) and Σ_{i=1}^{N} x_j(i) ≤ P_j for all 1 ≤ j ≤ K, obtain the multiplier λ_j^t maximizing Σ_{i=1}^{N} f_{ij}(x_j(i)).
    if t < N
        i* = arg max_{i∈S} ν_i*
        Set S = S\{i*}, t = t + 1, then go to Step 3.
    else go to Step 5.
    end
Step 3: (Decide which x_j(i*) can be positive among 1 ≤ j ≤ K.)
    Given i = i*, set j* = arg max_j ν_{j,i*}. Go to Step 4.
Step 4: (Modify the sets C(j), 1 ≤ j ≤ K.)
    for j = 1 : K
        if j ≠ j* then C(j) = C(j)\{i*} end
    end
    Go to Step 2.
Step 5: (Output) A suboptimal solution is obtained.

Table 2.1: An efficient suboptimal optimization method for non-convex optimization via dual decomposition.
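The procedure of Table 2.1 can be sketched compactly in Python. Step 2's multiplier computation is not specified above, so a per-user water-filling by bisection stands in for it; all function names are ours:

```python
import math

def waterfill(a, allowed, P, iters=60):
    """Bisection on the multiplier: spend total power P on the tones in
    'allowed' for one user with GNRs a[i]; returns (power dict, lambda)."""
    lo, hi = 1e-9, 1e6
    for _ in range(iters):
        lam = (lo + hi) / 2
        p = {i: max(0.0, 1/(lam*math.log(2)) - 1/a[i]) for i in allowed}
        if sum(p.values()) > P:
            lo = lam          # spending too much power -> raise lambda
        else:
            hi = lam
    return p, lam

def suboptimal_dual(a, P):
    """Sketch of Table 2.1: re-impose the per-tone exclusivity constraints
    one at a time, taking the tone with the largest summed efficiency first."""
    K, N = len(a), len(a[0])
    C = [set(range(N)) for _ in range(K)]   # tones each user may still use
    S = set(range(N))                        # exclusivity not yet enforced
    for _ in range(N):
        p, lam = zip(*(waterfill(a[j], C[j], P[j]) for j in range(K)))
        # efficiency nu_{j,i} = f_ij - lam_j * x_j(i), eq. (2.16)
        nu = [[math.log2(1 + p[j].get(i, 0.0)*a[j][i]) - lam[j]*p[j].get(i, 0.0)
               for i in range(N)] for j in range(K)]
        i_star = max(S, key=lambda i: sum(nu[j][i] for j in range(K)))  # (2.17)
        j_star = max(range(K), key=lambda j: nu[j][i_star])
        S.discard(i_star)
        for j in range(K):
            if j != j_star:
                C[j].discard(i_star)
    return C
```

After the N iterations, every tone belongs to exactly one user's set C(j), i.e., all the relaxed exclusivity constraints have been re-imposed.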

### Chapter 3

### Power Minimization Resource Allocation Algorithms

This chapter is concerned with a simple but general scenario in which the objective is to allocate the available multiple carriers to multiple users such that the total transmit power is minimized while each user's rate and BER requirements are met. As will become clear in later chapters, the methodologies presented in this chapter serve as prototype approaches that can be easily modified to solve problems arising from other, similar system design criteria.

### 3.1

### Assumptions and Problem Formulation

### 3.1.1

### Basic Assumptions

We assume that there are N orthogonal subcarriers, C = {1, 2, ..., N}, and d user data streams with rate requirements R = {R_j, j = 1, 2, ..., d} to be transmitted over an OFDMA downlink, where R_j denotes the required transmission rate of user j. We further assume that the base station and the d user terminals are each equipped with a single antenna. The base station assigns a set of subcarriers to each user and determines the power and the number of bits per OFDM symbol to be transmitted on each subcarrier. The cyclic prefix (guard interval) is long enough to remove all intersymbol interference caused by multipath propagation. In addition, sharing the same subcarrier by different users is not allowed. The base station's resource allocation decision is sent to all users through a separate control channel, so each terminal can demodulate the signals on the subcarriers assigned to it. Denoting by c_ij the bit rate of the ith subcarrier when it serves the jth user, we can express the maximum achievable rate (capacity) c_ij in terms of the transmitted power p_ij as

$$c_{ij} = W_i \log_2\left(1 + \frac{p_{ij}|h_{ij}|^2}{\sigma_{ij}^2}\right), \quad 1 \le i \le N,\ 1 \le j \le d \qquad (3.1)$$

where W_i is the bandwidth of channel i, and |h_ij|^2 and σ_ij^2 denote the channel gain and noise power of the ith channel when it serves the jth user. The normalized capacity (rate) r_ij of the ith channel when used to serve the jth user is given by

$$r_{ij} = \frac{c_{ij}}{W_i} = \log_2\left(1 + \frac{|h_{ij}|^2 p_{ij}}{\sigma_{ij}^2}\right) = \log_2(1 + a_{ij}p_{ij}), \qquad (3.2)$$

where a_ij = |h_ij|^2/σ_ij^2 is the corresponding channel gain-to-noise ratio (GNR).

### 3.1.2

### Problem Formulation

Given the multi-user transmission requirements and channel state information (i.e., the a_ij's), one would like to find the subcarrier assignment and power allocation that minimize the total transmitted power. We define the N × d subcarrier assignment matrix A = [A_ij] by A_ij = 1 if the ith subcarrier is used to transmit to the jth user and A_ij = 0 otherwise. As a subcarrier can serve only one user in a given time interval, A_ij is either 1 or 0, and a legitimate channel assignment matrix A must satisfy

$$\sum_{j=1}^{d} A_{ij} \le 1, \quad \sum_{i=1}^{N} A_{ij} \ge 1, \quad 1 \le i \le N,\ 1 \le j \le d \qquad (3.3)$$

For the downlink case, all signals are transmitted from the same base station, hence only the total transmit power is considered. Let P be the power allocation matrix with (i, j)th entry p_ij; the problem of concern then becomes

$$\min_{P,A} \sum_{i=1}^{N}\sum_{j=1}^{d} A_{ij}p_{ij} \quad \text{s.t.} \quad \sum_{i\in C(j)} r_{ij} \ge R_j, \quad \sum_{j=1}^{d} A_{ij} \le 1,$$

$$\text{where } C(j) = \{i \mid A_{ij} = 1,\ 1 \le i \le N\} \qquad (3.4)$$

Although in reality there is a total power constraint Σ_{i=1}^{N} Σ_{j=1}^{d} p_ij ≤ P_c, we shall not consider this constraint to begin with. Solving the problem with the total power constraint follows a two-step procedure. In the first step we solve the unconstrained problem to obtain the required optimal total power and then check whether the solution meets the total power constraint. The problem is solved if the constraint is satisfied; otherwise the problem has no admissible solution and one is forced to go to the second step. In the second step, one can prioritize users' transmission requests, modify (decrease) some rate requirements according to the corresponding latency requirements, or settle for a suboptimal channel/power allocation that accommodates the total power constraint. Which of these options is chosen depends on other system design considerations, and the final solution is likely to be obtained by an outer iterative process. As far as this chapter is concerned, however, the total transmit power constraint will not be discussed henceforth. In the next section, we adopt a DP approach to derive a simple and practical solution which requires much lower complexity than that of [4] and, more importantly, offers near-optimal performance.

### 3.2

### Dynamic Programming based Near-optimal Resource Allocation

When d = 1 the optimal solution to (3.4) can be obtained by a water-filling process (for parallel Gaussian channels). The water-filling level, however, is difficult to determine. We present a very efficient algorithm, called OMPA, in Appendix A. Hence, once the channel assignment is known, each user's optimal power allocation can be determined by the proposed OMPA algorithm.
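The OMPA routine itself is given in Appendix A and not reproduced here; the following bisection-based water-filling sketch is a generic stand-in that solves the same single-user problem, namely the minimum total power meeting a rate target R over tones with GNRs a_i:

```python
import math

def min_power_waterfill(a, R, iters=60):
    """Generic stand-in for the OMPA routine of Appendix A: bisect the
    water level mu so that sum_i log2(1 + a_i * p_i) reaches R with
    p_i = (mu - 1/a_i)^+ ; returns (total power, per-tone powers)."""
    # With p_i = (mu - 1/a_i)^+, the achieved rate is sum_i log2(max(1, mu*a_i)).
    rate = lambda mu: sum(math.log2(max(1.0, mu * ai)) for ai in a)
    lo, hi = 0.0, 1.0
    while rate(hi) < R:          # grow the bracket until R is reachable
        hi *= 2
    for _ in range(iters):
        mu = (lo + hi) / 2
        lo, hi = (mu, hi) if rate(mu) < R else (lo, mu)
    p = [max(0.0, mu - 1.0/ai) for ai in a]
    return sum(p), p
```

The difficulty noted above is precisely locating the water level μ; the bisection sidesteps a closed-form search at the cost of a few dozen rate evaluations.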

For the general case (d ≠ 1), an obvious optimal solution to (3.4) is an exhaustive search over all possible channel assignments, with the associated power allocation matrices computed by the OMPA algorithm (or water-filling method) to satisfy all users' rate requirements. Although this approach is guaranteed to yield the optimal solution, the search is prohibitively complicated, especially if the number of users and/or subcarriers is large. An improvement is suggested in [4], which first determines the "water-filling" levels and the channels for each user. Overbooking of channels is inevitable as every user wants the best channels, so a complicated process is needed to resolve such conflicts and recompute the "water-filling" levels iteratively. Although the optimal solution can be found, the complexity is still very high and the method is practical only for small N and d (e.g., the case N = 8, d = 2 was given in [4]). Although [1] considered a discrete-rate scenario, the authors relaxed the discrete constraint to find a lower-bound solution of (3.4) iteratively. The quantized version of this solution gives a suboptimal subcarrier allocation {C(j) : 1 ≤ j ≤ d}, where C(j) is the jth user's serving-channel set (SCS) consisting of the indices of the assigned channels. A single-user rate (bit) allocation algorithm is then applied to each C(j). The numerical behavior of this approach was shown, but no comparison with the optimal performance was given.

Other earlier suboptimal proposals [5], [6], [8] for solving (3.4) start with some initial subcarrier (channel) allocation and assign the remaining available subcarriers sequentially according to some ad hoc criterion. Since a given channel has different GNRs when serving different users, [8] gives a channel to the user with the strongest gain, i.e., the ith subcarrier is assigned to the kth user if k = arg max_{1≤j≤d} a_ij. However, the ordering of the subcarriers or the users is arbitrary, and it is quite possible that the best channel for two users is the same, say channel k, while the second-best channel for the first user is much better than that for the second user. When the first user obtains channel k, the second user can only use its second-best channel, which is much worse than channel k. If instead the first user is given its second-best channel, which is not much worse than channel k, while the second user is assigned channel k, then the overall performance (required total power) is much improved. On the other hand, [5] makes an initial SCS size |C(j)| decision based on users' average channel GNRs and rate requirements R_j. The average GNR ignores frequency selectivity, so the resulting algorithm is unlikely to find the optimal solution.

In contrast, our approach begins with the fair initial condition that every user is given the opportunity to take every subcarrier. The proposed channel allocation process consists of a series of N levels of deletion decisions. At each level, a subcarrier is given to one user and simultaneously removed from the SCSs of all other users, where the SCS of the jth user at the tth level, C_t^s(j), is the set of all subcarriers allocated to serve user j at that point. Our fair initial condition thus implies C_0^s(j) = {1, 2, ..., N} for all j. We initially remove the constraint C_t^s(i) ∩ C_t^s(j) = ∅, ∀ i ≠ j, t = 0, 1, ..., N and, at stage t, impose the constraint that t ∈ C_t^s(j) for only one j (i.e., the tth channel can belong to only one SCS), so that the original single-user-per-subcarrier (SUPS) constraint is eventually re-installed and satisfied. Hence, in a sense, what we adopt is a constraint relaxation approach.

In such a sequential assignment process the order of the subcarriers may be important, as once a subcarrier is assigned, no re-assignment is possible. A reasonable ordering is to sort (re-arrange) the N subcarriers in descending order of their maximum GNR, a_i* = max_{1≤j≤d} a_ij, so that under the new ordering channel 1 has the best GNR, followed by channel 2, channel 3, and so on. Formally, this channel sorting is the permutation μ on the ordered integer set {1, 2, ..., N} which satisfies a*_{μ⁻¹(1)} > a*_{μ⁻¹(2)} > ... > a*_{μ⁻¹(N)}, where μ⁻¹ is the inverse mapping of μ.
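The sorting permutation μ is a one-liner in practice; this sketch returns the new channel order (indices of the original channels, best maximum GNR first):

```python
def sort_by_max_gnr(a):
    """Return the channel permutation that re-indexes subcarriers in
    descending order of a*_i = max_j a_ij, as used to initialize DPRA.
    'a' is a d x N list of GNRs."""
    N = len(a[0])
    a_star = [max(a[j][i] for j in range(len(a))) for i in range(N)]
    return sorted(range(N), key=lambda i: -a_star[i])
```

The returned list is μ⁻¹ in the notation above: its kth entry is the original index of the channel that becomes channel k after sorting.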

Our DP-based algorithm can be described by a d-ary tree in which the root (initial level) has d outgoing branches representing the possible assignments of channel 1. Similarly, every node at a given level (height), say the tth level, has d outgoing branches (to d child nodes), each representing a possible channel-assignment (removal) decision and a tentative channel allocation. The channel allocation is tentative because only t channels have been assigned; the remaining N − t channels still belong to all SCSs and are unassigned. If we associate each level's decision with a cost, then at the kth level we assign channel k to user i, and remove it from the SCSs of all other users, if the associated cost is minimized. Such a decision is equivalent to selecting the ith branch emanating from the surviving node at the (k − 1)th level as the survivor branch while the other d − 1 branches are terminated.

Given the initial fair channel allocation and the ultimate objective of minimizing the required power, the cost of a decision at any level should be the minimum required power for the corresponding tentative channel allocation. Hence, defining the SCS collection at the tth level as C_t^s = (C_t^s(1), ..., C_t^s(d)), the corresponding cost function J_t is

$$J_t(C_t^s) = \sum_j g(R_j; C_t^s(j)) \qquad (3.5)$$

in which each g(R_j; C_t^s(j)) is determined by applying the OMPA algorithm to solve the problem

$$\text{Given } C_t^s(j), \text{ find } g(R_j; C_t^s(j)) = \min \sum_{i\in C_t^s(j)} p_{ij} \quad \text{s.t.} \quad \sum_{i\in C_t^s(j)} r_{ij} \ge R_j. \qquad (3.6)$$

C_t^s(j) is modified for each j at each level, so the subcarrier and power assignment process is guaranteed to end at the Nth level. As the minimum required power g(R_j; C_t^s(j)) for each j is a decreasing function of the cardinality |C_t^s(j)| of its SCS, the cost J_t is an increasing function of t. At each level, however, we find the removal of the subcarrier from all but one C_t^s(j) that results in the minimum cost (total power) increase. As the collection {C_t^s(j)} = C_t^s allows multiple channel assignments, i.e., C_t^s(j) ∩ C_t^s(k) ≠ ∅ for j ≠ k and t < N, it does not satisfy the constraints (3.3) of a legitimate channel assignment matrix. But as the subcarriers are assigned to users one by one, at the end of the Nth level {C_N^s(j)} = C_N^s corresponds to a legitimate one. Therefore, the metric defined by (3.5)-(3.6) is simply the minimum total transmit power for a given rate-subcarrier assignment with various degrees of relaxation of the SUPS constraint.

Since a path in the tree that visits the jth child node at the kth level implies a channel assignment giving the kth subcarrier to the jth user, an N-level path represents a complete channel allocation. But not all paths are legitimate, for a path may assign no serving channel to some user. In particular, if at the end of the tth level there are still more than N − t users without any assigned serving channel, i.e., users whose SCS cardinality equals N − t, then there will be at least one user with an empty SCS at the end of the Nth level. To avoid such a possibility and rule out all illegitimate channel assignments, we modify the cost function as

$$J_t(C_t^s) = \min_{1\le k\le d}\left\{\sum_{j=1}^{d} g(R_j, C_t^s(j;k)) + \omega_t\left(\sum_{j=1}^{d}\delta\left(N - t - |C_t^s(j;k)|\right)\right)\right\} \overset{\text{def}}{=} \min_{1\le k\le d} J_t^k(C_t^s) \qquad (3.7)$$

where

$$C_t^s(j;k) = \begin{cases} C_{t-1}^s(j), & j = k \\ C_{t-1}^s(j)\setminus\{t\}, & j \ne k \end{cases} \qquad (3.8)$$

$$\delta(x) = \begin{cases} 1, & x = 0 \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad \omega_t(x) = \begin{cases} 0, & x \le N - t \\ \infty, & x > N - t \end{cases} \qquad (3.9)$$

By adding the weight function ω_t(·) to the cost function, we avoid continuously assigning channels to some users while other users might not be able to obtain any channel, although the probability of such an event is almost zero as long as N > d and the GNR distributions {a_ij, i = 1, 2, ..., N} of the users are independent.

The resulting DP-based resource allocation (DPRA) algorithm, unlike other approaches [1], [6], [8], accomplishes channel and power (rate) allocations simultaneously; it is listed in Table 3.1. Early termination and computational complexity reduction are possible if certain conditions are satisfied; see Guidelines 4 and 5 in the next section.

Step 1: (Channel-sorting) Given N, d, a_ij and R, find a_i* = max_{1≤j≤d} a_ij and re-arrange the channel indices in decreasing order of the maximum GNR, so that a_1* > a_2* > ... > a_N* under the new indexing.
Step 2: (Initial channel allocation) Set C_0^s(j) = {i | 1 ≤ i ≤ N} for 1 ≤ j ≤ d.
Step 3: (Sequential channel-power-rate assignment)
    for t = 1 : N
        k* = arg min_{1≤k≤d} J_t^k(C_t^s)
        J_t(C_t^s) = J_t^{k*}(C_t^s)
        for j = 1 : d
            if j = k* then C_t^s(j) = C_{t-1}^s(j)
            else C_t^s(j) = C_{t-1}^s(j)\{t}
            end
        end
    end
Step 4: (Output) The final channel allocation is the Nth-level SCS collection C_N^s. The power-rate allocation is obtained while computing J_N(C_N^s) through (3.5)-(3.7).

Table 3.1: A dynamic programming based resource allocation (DPRA) algorithm
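The per-level decision of Table 3.1 can be sketched as follows. The helper `g_min_power` is a bisection water-filling stand-in for OMPA (given in Appendix A, not shown here), and the ω_t penalty of (3.7) is omitted for brevity since, as noted above, it is almost never active when N > d:

```python
import math

def g_min_power(a_row, scs, R, iters=60):
    """Minimum power for one user to reach rate R over the tones in 'scs'
    (water-filling by bisection; stands in for the OMPA routine)."""
    if not scs:
        return float("inf")
    gains = [a_row[i] for i in scs]
    rate = lambda mu: sum(math.log2(max(1.0, mu * ai)) for ai in gains)
    lo, hi = 0.0, 1.0
    while rate(hi) < R:
        hi *= 2
    for _ in range(iters):
        mu = (lo + hi) / 2
        lo, hi = (mu, hi) if rate(mu) < R else (lo, mu)
    return sum(max(0.0, mu - 1.0/ai) for ai in gains)

def dpra(a, R):
    """DPRA sketch (Table 3.1): every user starts with every tone; at each
    level the current tone is kept by the user whose retention minimizes
    the summed minimum power and removed from everyone else's SCS."""
    d, N = len(a), len(a[0])
    order = sorted(range(N), key=lambda i: -max(a[j][i] for j in range(d)))
    scs = [set(range(N)) for _ in range(d)]
    for t in order:                       # sorted channel sequence
        def cost(k):                      # J^k_t of (3.7), omega omitted
            return sum(g_min_power(a[j], scs[j] - ({t} if j != k else set()),
                                   R[j]) for j in range(d))
        k_star = min(range(d), key=cost)
        for j in range(d):
            if j != k_star:
                scs[j].discard(t)
    return scs
```

Each level costs d evaluations of `cost`, i.e., O(d^2) OMPA calls before the reuse of Guideline 5 in the next section is applied.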

### 3.3

### An Optimal Resource Allocation Algorithm

The N-level tree shown in Fig. 1 is a graphic representation of the solution space of (3.4). The tree contains all possible (legitimate or illegitimate) channel assignments. At each level we allocate a channel, so each "complete" path L from the root node to a leaf node represents a candidate assignment and can be denoted by L = (b_1, b_2, ..., b_N), where b_i is the ith node visited by the path and, for brevity, the initial (root) node is not included in the notation. A partial path l_n = (b_1, b_2, ..., b_n), n < N, is thus defined as the part of a complete path that starts at the root node and ends at some internal node.

Searching over the complete tree can certainly lead to the optimal solution, but the complexity is of exponential order. The DPRA algorithm calls for the elimination of d − 1 child nodes at each level and finishes the tree search in N stages. As will be shown later, this approach is very efficient in that it yields a near-optimal solution with low complexity. However, there is no guarantee that the optimal solution will be obtained, since the optimal channel-assignment path may be discarded somewhere along the way, especially when the SNR is low. Many other techniques can be used to reduce the prohibitively high complexity of searching the whole solution space. We employ a classic technique called branch-and-bound (B&B), which has the potential of significant complexity reduction if the bounds are properly chosen. Besides presenting novel tight bounds, we also suggest a subcarrier-sorting procedure, which is crucial in reducing the search complexity, use a good initial upper bound, and derive some useful properties and guidelines for further complexity reduction.

### 3.3.1

### A Branch-and-bound Approach

In the search tree shown in Fig. 1, each parent node has d child nodes so that all possible solutions are enclosed. As in our description of the DPRA algorithm, a path in Fig. 1 that passes through the jth child node at the kth level of the tree (i.e., b_k = j) represents a channel assignment that gives the kth subcarrier to the jth user, and a (complete) path is legitimate only if every user is visited (assigned at least one channel) along it. The B&B paradigm needs an upper bound B_u on the cost v(l_N) of any legitimate channel assignment (complete path) and a lower bound B_l(l_t) associated with each partial path l_t of length t. The upper bound, which represents the current best solution value, combined with the lower bound associated with a partial path, enables the algorithm to prune non-promising subtrees rooted at certain nodes and thus search only parts of the complete tree. These bounds should be updated as soon as possible to accelerate the search, but the initial upper bound often plays an important role in reducing the search complexity: a weak bound will not eliminate many visits to nodes that lie outside the correct (optimal) path. To find a tight lower bound, we need the following fundamental definition.

Definition 1. The node value (cost) v(l_t) of an internal node of the search tree is defined by (3.7), with each C_t^s(j) obtained by removing from C_0^s(j) the channels that have been assigned to other users along the partial path l_t that ends at the current node.

Obviously, the node value so defined is a function of both the node and the associated partial path; we write v(l_t) to emphasize this dependence. To see that the node value is indeed a lower bound, we first notice that, like the cost function of the DPRA algorithm, it is a function of a channel allocation that is illegitimate and optimistic. The channel allocation is illegitimate because a subcarrier may be assigned to more than one user, and it is optimistic because a user tends to own more than its share of subcarriers, resulting in a reduced required power. As a user's SCS is a decreasing function of the partial path length, in the sense that a child node's SCS is a subset of its parent node's, the node value of a child node must be greater than or equal to that of its parent node. In other words, the fact that

$$C_0^s(j) = C(j) \supset C_1^s(j) \supset C_2^s(j) \supset \cdots \supset C_N^s(j), \quad \forall\, j \qquad (3.10)$$

implies

Property 1. Both g(R_j, C_t^s(j)) and the cost function J_t(C_t^s) defined by (3.7) are increasing functions of t.

As every complete path is associated with a sequence of shrinking SCSs {C_0^s(j), C_1^s(j), ..., C_N^s(j)} and J_N(C_N^s) is the cost of the path, we have

Property 2. The node value v(l_t) defined by (3.7) is a lower bound on the cost of any complete path that coincides with the t-level partial path l_t.

Thus, if J_t at a parent node is not smaller than the upper bound, we are sure that no optimal solution lies among its child nodes, and we should check other nodes at the same level. On the other hand, the order of visiting the d child nodes of a parent node should be based on their node values, since the node values represent our current best estimates of all subsequent assignments. For convenience of subsequent reference, we summarize these two observations, which often bring about significant search complexity reduction (see Table 3.3) in a B&B-based resource allocation (BBRA) algorithm, as

Guideline 1. The order of visiting the d child nodes of a given parent node should follow the ascending order of the magnitudes of the corresponding node values. In other words, one should visit the node with the smallest node value first, followed by the one with the second-smallest node value, and so on.

Guideline 2. When visiting a node (say at the tth level) of a partial path l_t, compute the node value v(l_t) and compare it with the current upper bound B_u. If v(l_t) < B_u, then visit its first child node at the next level. Otherwise, the search on the subtree rooted at this node is terminated, and the search continues at the next unvisited child node of the same level or backtracks to the next unexplored node at the previous level, where the order of the d siblings descending from the same parent node is determined by Guideline 1.

Because only a complete path corresponds to a candidate solution, the depth-first-search (DFS) strategy is suitable for our B&B approach. The initial upper bound B_u^0 can be obtained by the DPRA algorithm. The ensuing DFS procedure continuously divides the parent space into subproblem (child) spaces. Therefore, we have

Guideline 3. Upon arriving at the final level, check the resulting cost (node value) to see whether B_u^t has to be updated. Then backtrack to the nearest parent node determined by Guideline 1 and resume the search.

Note that the above three Guidelines are valid for general B&B approaches and are listed for the convenience of subsequent discussions.

Definition 1 and the above guidelines all assume that the node values along a path are computed by the same principle used in the DPRA algorithm. In other words, every user is given all channels initially and, at each level along a path, a channel is assigned to the user associated with the selected child node and removed from all other users' SCSs. Such a procedure does not exclude any legitimate solution from the tree search. With this assumption, we note that the node values along a path may reach a steady state before the leaf node is visited. A sufficient condition is

Property 3. Further traversal of a path does not change the node value if the set of remaining unassigned channels ∩_j C_t^s(j) = C_U satisfies either (i) ∀ i ∈ C_U, r_ij = 0 ∀ j, or (ii) ∀ i ∈ C_U, i ∈ C_t^s(j) for only one j.

This property can be used to accelerate the search without missing the optimal solution.

Guideline 4. Besides the terminations specified by Guideline 2, early termination (of a path) is possible if one of the conditions in Property 3 is satisfied.

Since computing the node value via (3.7) requires repeated calls to the OMPA subroutine, the search complexity can be reduced by minimizing the number of calls. A careful examination of (3.7) and the search procedure reveals

Guideline 5. In computing the node value for the kth child node of a (t − 1)th-level parent node, the fact that C_t^s(k) = C_{t-1}^s(k) implies g(R_k, C_t^s(k;k)) = g(R_k, C_{t-1}^s(k)). Furthermore, if in computing the parent node's value we find r_tj = 0 for some j, then g(R_j, C_t^s(j)) = g(R_j, C_{t-1}^s(j)). In both cases there is no need to call the OMPA subroutine to compute the minimum required power. Finally, although d OMPA calls are needed to compute the costs g(R_j, C_t^s(j;k)) for a fixed k, d − 1 of them can be reused for other values of k.

The last two guidelines can also be used to reduce the computing complexity of the DPRA algorithm. In particular, Guideline 5 implies that only d OMPA calls are needed to compute the d child node values of a given parent node.

### 3.3.2

### Sorting The Serving Channels

We have suggested a channel ordering for the DPRA algorithm according to the maximum GNRs. This channel indexing is simple but, according to our simulations, does not yield fast convergence for the BBRA algorithm, where, as in the DPRA algorithm, the order of the channels is very important. If our ordering (indexing) of the channels is such that the ith (i < N) channel is so "bad" that it is not used in the final optimal solution (no user really wants it), then we have to check all of its d child nodes at the next level. Simulations indicate that the channel ordering affects the search speed significantly. In view of Guideline 1 and Property 2, and given that the first k channels have been decided, the (k + 1)th channel should be the most demanded one, in the sense that its assignment to one user (and thus its removal from the SCSs of all other users) increases the costs (node values) of all other users most significantly. The channel-sorting algorithm based on this idea is presented in Table 3.4.

We have several remarks on the above channel-sorting process.

R1. The sole purpose of this algorithm is channel sorting; the corresponding channel assignments are auxiliary operations, not to be realized.

R2. Step 2 in Table 3.4 defines the most demanded channel as the one that offers the highest sum rate and is requested by two or more users given the current SCS collection. When a channel offers the highest rate but serves only one user, it must render a relatively low GNR for all other users, hence the decision on its order in the tree should be postponed.

R3. Step 4 deals with the ordering of those channels which, after several rounds of filtering out the most demanded channels, are still requested by only one user.

With this channel-sorting procedure and in view of the properties and guidelines mentioned before, along with our definition of the node value, we propose the BBRA algorithm of Table 3.2.

### 3.3.3

### Complexity Reduction Techniques

To explore the effectiveness of the various complexity-reduction techniques implied by the above properties and guidelines, we performed 10^6 simulated runs of the BBRA algorithm incorporating (1) the channel-sorting process of Table 3.4 and combinations of (2) Guideline 1, (3) Property 3, and (4) Guideline 5. The numbers of users and channels are 5 and 128, respectively, and the normalized rate of each user
Step 1: (Initialization) Use the DPRA algorithm to obtain the initial upper bound B_u^0 and the channel-sorting process in Table 3.4 to rearrange the channel order. Set the initial level to ℵ = 1.

Step 2: Visit the child nodes of the ℵth level according to Guideline 1 and invoke Guideline 2. Set ℵ ← ℵ + 1 if no backtracking is needed; otherwise set ℵ ← ℵ − 1. Property 3 should be used at every node visited to check the possibility of early termination of a candidate path.

Step 3: Go to Step 2 if ℵ < N. If ℵ = N, terminate the searching process if all nodes have been visited or excluded from further consideration; otherwise invoke Guideline 3, set ℵ ← ℵ − 1, and go to Step 2.

Table 3.2: A branch and bound based resource allocation (BBRA) algorithm
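The three steps above amount to a depth-first branch-and-bound search over user assignments, one tree level per (sorted) channel. The following skeleton is only a minimal sketch of that control flow: `node_value(assignment)` stands in for the lower bound that OMPA computes at each node, and `initial_bound` plays the role of the DPRA-supplied B_u^0; both are assumptions of this sketch, not the dissertation's actual routines, and the guideline-specific branching order and result reuse are omitted.

```python
def bbra_sketch(num_channels, num_users, node_value, initial_bound):
    """Minimal depth-first branch-and-bound skeleton mirroring Steps 1-3.

    node_value(assign) must be a lower bound on the cost of any leaf
    extending the partial assignment `assign` (the role OMPA plays);
    initial_bound is the DPRA-style initial upper bound B_u^0.
    """
    best_cost, best_assign = initial_bound, None
    stack = [(0, [])]                    # (level aleph, partial assignment)
    while stack:
        level, assign = stack.pop()
        cost = node_value(assign)
        if cost >= best_cost:            # early termination of a candidate path
            continue
        if level == num_channels:        # a complete leaf improves the bound
            best_cost, best_assign = cost, assign
            continue
        for user in range(num_users):    # branch: assign channel `level` to a user
            stack.append((level + 1, assign + [user]))
    return best_cost, best_assign
```

With additive per-channel costs the lower bound is simply the cost accumulated so far, and the search provably returns the same minimum as exhaustive enumeration while pruning dominated branches.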

| d = 5, N = 128 | DPRA | (1) | (1)+(2) | (1)+(3) | (1)+(4) | (1)+(2)+(3)+(4) |
|---|---|---|---|---|---|---|
| E[nop] | 44.61 | 2587.2 | 773.98 | 578.09 | 116.19 | 88.32 |
| E[nop \| nop < 2×10^5] | 44.61 | 1717.8 | 773.98 | 549.91 | 93.78 | 88.32 |
| Prob[nop > 2×10^5] | 0 | 0.0012 | 0 | 0.00005 | 0.00004 | 0 |
| max{nop} | 81 | 21622894 | 180132 | 1800602 | 981053 | 587 |

Table 3.3: The effects of (1) channel-sorting in Table 3.4, (2) Guideline 1, (3) Property 3/Guideline 4, and (4) Guideline 5 on the computing-complexity reduction of the BBRA algorithm; 10^6 runs are performed to obtain the statistics. The complexity is measured in terms of the number of calls nop to the OMPA algorithm. The complexity of the DPRA algorithm is also included for comparison purposes.

is uniformly distributed in [0, 3]. The results are summarized in Table 3.3 with the complexity measured in terms of the number of calls to the OMPA algorithm.

Channel sorting is the most critical technique: with other conventional channel-sorting methods (e.g., that used by DPRA), the searching complexity often exceeds 10^6. Hence it is always assumed to be part of the initialization step of the BBRA algorithm. The reuse of existing OMPA results (i.e., Guideline 5) also brings about a significant reduction, as it is applicable at every node visit. Proper branching and early termination further accelerate the search process.

### 3.3.4

### Application to Integer Constellation Systems

With minor modifications, our algorithms remain valid and applicable to a similar RA problem with integer-constellation (discrete-rate) constraints. All we have to do is insert an SNR gap, which depends on the constellation size and the BER requirement, into the rate-power equation (2) and replace the OMPA (water-filling) algorithm with a known bit-loading algorithm, e.g., Campello's optimal algorithm, whose complexity is upper-bounded by O(N) [3].
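To make the SNR-gap idea concrete, the sketch below shows a generic greedy bit-loading routine, not Campello's O(N) algorithm: with gap Γ, loading b bits on a subcarrier of gain-to-noise ratio g costs p(b) = Γ(2^b − 1)/g, so each additional bit is placed on the subcarrier where its incremental power is smallest. The interface and parameter names are illustrative assumptions.

```python
def greedy_bit_loading(gnr, total_bits, gap_db=3.0):
    """Generic greedy bit loading with an SNR gap (illustrative sketch,
    not Campello's algorithm).

    gnr: per-subcarrier channel-gain-to-noise ratios.
    The gap Gamma enters the rate-power relation p(b) = Gamma*(2^b - 1)/gnr.
    """
    gamma = 10 ** (gap_db / 10)          # SNR gap in linear scale
    n = len(gnr)
    bits = [0] * n
    total_power = 0.0
    for _ in range(total_bits):
        # incremental power needed to load one more bit on each subcarrier
        inc = [gamma * (2 ** (bits[i] + 1) - 2 ** bits[i]) / gnr[i]
               for i in range(n)]
        k = min(range(n), key=inc.__getitem__)
        bits[k] += 1
        total_power += inc[k]
    return bits, total_power
```

Because the incremental cost 2^b·Γ/g is increasing in b, this greedy rule yields a power-optimal discrete loading for a fixed total bit budget; Campello's method reaches the same allocation with lower complexity.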

A B&B approach was also suggested in [13] to solve a similar problem for integer-constellation systems. Besides lacking the attributes mentioned in the second paragraph of this section, their method differs from ours in at least two major respects. First, their approach implies a tree structure that grows a (dM + 1)-ary sub-tree out of each node, where M is the number of discrete rates allowed, whereas we need only a d-ary sub-tree. In other words, [13] converts both user and rate selections into node selections, with each node representing a fixed user/rate assignment for a given subcarrier, while our tree search involves user selection only. Second, each node value (lower bound) in [13] is obtained by solving a linear programming problem after relaxing three major constraints, namely, (i) the SUPS, (ii) the single-rate-per-subcarrier, and (iii) the discrete-rate constraints. The first two relaxations are directly related to their tree structure and the last relaxation