A QoS-Aware and Energy-Conserving Transcoding Proxy Using On-demand Data Broadcasting



Jiun-Long Huang
Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan, ROC
E-mail: [email protected]

Ming-Syan Chen, Fellow, IEEE
Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, ROC
E-mail: [email protected]

Abstract

Most research on transcoding proxies in mobile computing environments is based on the traditional client-server architecture and does not employ the data broadcast technique. In addition, the issues of QoS provision and energy conservation are not addressed in the prior studies. In view of this, we design in this paper a QoS-aware and energy-conserving transcoding proxy by utilizing the on-demand broadcasting technique. We first propose a QoS-aware and energy-conserving transcoding proxy architecture, abbreviated as QETP, and model it as a queueing network consisting of three queues. By analyzing the queueing network, three lemmas are derived to estimate the load of these queues. We then propose a version decision policy and a service admission control scheme to provide QoS in QETP. The derived lemmas are used to guide the execution of the proposed version decision policy and service admission control scheme to achieve the given QoS requirement. In addition, we also propose a data indexing method to reduce power consumption of clients. To measure the performance of the proposed architecture, three experiments are conducted. Experimental results show that the average access time reduction of the proposed scheme over traditional client-server architecture ranges from 45% to 75%. Experimental results also show that the proposed scheme is more scalable than traditional client-server architecture and is able to effectively control the system load to attain the given QoS requirements. In addition, the proposed scheme is able to greatly reduce average tuning time of clients at the cost of a slight increase (around 5% in our experiments) in average access time.

Key words: Transcoding proxy, QoS, energy-conservation, data broadcast, on-demand broadcast


1 Introduction

In a pervasive computing environment, due to the constraints resulting from power-limited mobile devices and low-bandwidth wireless networks, designing a power conserving mobile information system with high scalability and high bandwidth utilization becomes an important research issue, and hence attracts a significant amount of research attention. In addition, the high diversity in the capabilities of various mobile devices such as display capabilities (e.g., screen size, color depth and supported data formats) and computation power makes the design of mobile information systems more challenging.

This diversity also results in an increasing demand on the capability of context awareness for mobile information systems.

Content adaptation, which is an important technique to realize context awareness, emerges to remedy the problem resulting from the said diversity by offering different mobile users suitable versions of the same object according to the capabilities of the mobile devices, the traffic of the networks and the users’ preferences [20]. Transcoding, which transforms a data object from one version into another, is recognized as a promising technique to realize content adaptation [20][21][23]. A proxy capable of transcoding (referred to as a transcoding proxy) is placed between a client and an information server to coordinate the mismatch between what the server provides and what the client prefers. Since proxy-based approaches are transparent to the content providers and users, this kind of approach is able to simplify the design of servers and clients, and as a result, attracts much research attention.

In recent years, data broadcast [2][3][29] has been employed as an important technique to design a scalable and power conserving mobile information system. However, most research works in transcoding proxies in mobile computing environments are on the basis of the traditional client-server architecture and do not employ the data broadcast technique. Hence, the transcoding proxies are not scalable and the network bandwidth is not well utilized. In addition, most prior studies do not consider the issue of quality of service (abbreviated as QoS) which is crucial in a mobile computing environment.

In addition, as shown in [26], only a modest improvement (20% ∼ 30%) in battery lifetime is expected in the next few years. Hence, energy conservation is raised as a key factor of the design of mobile devices. Since data indexing is recognized as a promising means to reduce power consumption [17], many researchers have studied the design of data indexing algorithms in push-based data broadcasting environments [9][22][28][30]. However, most studies on on-demand data broadcasting focus on the design of scheduling algorithms [1][3], and only a few of them consider the employment of data indexing in on-demand data broadcasting environments [18].

In view of this, we design in this paper a scalable, QoS-aware and energy-conserving transcoding proxy by utilizing the on-demand broadcasting technique. Explicitly, we first propose a QoS-aware and energy-conserving transcoding proxy architecture, abbreviated as QETP, and model it as a queueing network with three queues. By analyzing the queueing network, three lemmas are derived to formulate the average waiting time of these queues. We then devise scheme ODB-QoS-Index to provide QoS in QETP where ODB-QoS-Index stands for “On-demand Data Broadcasting with QoS and data Indexing.”

Scheme ODB-QoS-Index is an online, iterative and adaptive algorithm comprising

1. a version decision policy to determine the suitable version for each data request according to the users’ device profiles and the state of the server,

2. a service admission control scheme to determine whether to grant a service registration or a service handoff according to the state of the server, and

3. a data indexing method to insert data indices into the broadcast program to reduce power consumption of clients.

In each iteration, scheme ODB-QoS-Index estimates the average waiting time of each queue based on the derived results, determines the state of each queue according to the corresponding estimation of average waiting time, and configures the behavior of the version decision policy and the service admission control scheme in accordance with the states of these queues to attain the desired QoS. In addition, scheme ODB-QoS-Index inserts index items into the broadcast program to reduce the clients’ power consumption. To measure the performance of QETP, three experiments are conducted. Experimental results show that the average access time reduction of the proposed scheme over traditional client-server architecture ranges from 45% to 75%. Experimental results also show that scheme ODB-QoS-Index is more scalable than traditional client-server architecture, and is able to achieve the system administrators’ QoS requirements by the devised version decision policy and the service admission control scheme. In addition, scheme ODB-QoS-Index is able to greatly reduce average tuning time at the cost of a slight increase (around 5% in our experiments) in average access time. Access time is defined as the summation of time periods from the moment that mobile clients submit data requests to the moment that mobile clients receive the requested data items. On the other hand, tuning time is defined as the summation of time periods that mobile clients operate in active mode. Access time is widely used to evaluate the efficiency of broadcast systems, while tuning time is used to evaluate power consumption of mobile devices. To the best of our knowledge, there is no prior research on the design of transcoding proxies employing data broadcast. This feature distinguishes this paper from others.

Figure 1: An example on-demand broadcasting system

Figure 2: Employment of data indexing: (a) without data indexing; (b) with data indexing

The rest of this paper is organized as follows. The descriptions of related work and the proposed transcoding proxy architecture, QETP, are given in Section 2. An analytical model and a transcoding model are devised in Section 3. Then, Section 4 describes the proposed version decision policy, service admission control scheme and data indexing method. The performance evaluation is shown in Section 5, and finally, Section 6 concludes this paper.

2 Preliminaries

2.1 On-demand Data Broadcasting

Figure 1 shows an example on-demand broadcasting system. In an on-demand data broadcasting system [1][3][4], a server maintains a data request queue and serves these requests according to the employed scheduling algorithm. When requiring one data item, a mobile client sends a data request to the server.


After receiving a data request, the server first checks whether the data request queue already contains a data request for the same data object. If so, the incoming data request is merged into that data request; this is called request merge. Otherwise, the incoming data request is inserted into the data request queue. Data requests for the same data object can be safely merged since one transmission of the data object in a broadcast channel is able to serve all merged data requests. Therefore, the higher the probability of request merge, the more efficient the system is.

A scheduling algorithm is used to prioritize all data requests in the data request queue, and the server will serve these data requests according to their priorities. To serve a data request, the system retrieves the required data object from the corresponding data server, and then broadcasts this object to all its clients via a dedicated and shared broadcast channel. As a result, the on-demand broadcast system is more scalable and can obtain higher network utilization than traditional client-server architecture.
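The request merge behavior described above can be illustrated with a minimal Python sketch. It assumes FCFS ordering and uses a hypothetical class `RequestQueue` with methods `submit` and `serve_next`; these names and the string identifiers are illustrative and do not come from the cited systems.

```python
from collections import OrderedDict


class RequestQueue:
    """Hypothetical FCFS data request queue with request merge."""

    def __init__(self):
        # Maps object id -> list of waiting client ids. Insertion order is
        # the arrival order of the first request for each object.
        self._pending = OrderedDict()

    def submit(self, client_id, object_id):
        """Add a request; return True if it was merged into an existing one."""
        if object_id in self._pending:
            self._pending[object_id].append(client_id)
            return True
        self._pending[object_id] = [client_id]
        return False

    def serve_next(self):
        """Remove the oldest request; one broadcast serves all merged clients."""
        object_id, clients = self._pending.popitem(last=False)
        return object_id, clients

    def __len__(self):
        return len(self._pending)
```

In this sketch, a second request for an object that is already queued does not lengthen the queue, which is why a higher merge probability translates into fewer broadcasts per request.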

2.2 Related Work

2.2.1 Prior Work Related to On-demand Data Broadcasting

Dykeman et al. pointed out in [10] that traditional FCFS scheduling would produce long average access time for an on-demand broadcast system when the access frequencies of all data items were not uniformly distributed. They proposed several scheduling algorithms and concluded that LWF could provide the best performance among the proposed algorithms. Aksoy et al. pointed out in [3] that although being able to produce the shortest average access time, LWF is not efficient when the number of data requests is large. To address this problem, they proposed algorithm RxW which is able to schedule the received data requests efficiently by employing a pruning technique. Experimental results showed that the performance (i.e., average access time) of RxW is close to that of LWF. Unfortunately, algorithm RxW is designed under the premise that each data item is of the same size. Hence, it is not suitable for variable-sized data items. In [1], Acharya et al. addressed the broadcast scheduling problem in environments with variable-size data items. They defined a new metric, stretch, as the ratio of the response time of a request to its service time. Based on stretch, they proposed a scheduling algorithm, called LTSF, to minimize the stretch. Wu et al. argued that algorithm LTSF is not optimal in terms of overall stretch [27]. In addition, algorithm LTSF is not scalable in a large-scale environment. Therefore, they proposed a scheduling algorithm to optimize the system performance in terms of stretch. Moreover, the proposed scheduling algorithm is more scalable than LTSF, and hence, is suitable for practical use.

Figure 3: Index structure (Bucket_k consists of an index segment IS_k with index items I_k(1), I_k(2), ..., I_k(d), followed by a data segment DS_k with data items D_k(1), D_k(2), ..., D_k(d))

However, most studies on on-demand data broadcasting focus on the design of scheduling algorithms [1][3], and only a few of them consider the employment of data indexing in on-demand data broadcasting environments [18]. Figure 2a and Figure 2b show examples in which a mobile client issues a data request at time t on broadcast programs without and with data indexing, respectively. In Figure 2a and Figure 2b, the time periods marked as ‘A’ and ‘D’ indicate the periods during which the mobile device is in active and doze mode, respectively. Since the sizes of index items are much smaller than those of data items, employing data indexing is able to greatly reduce the average tuning time at the cost of a slight increase in the average access time.

In [18], Lee et al. proposed a data indexing method in an on-demand data broadcasting environment. As shown in Figure 3, the proposed broadcast program is partitioned into a series of buckets and each bucket contains an index segment and a data segment. The number of index items in an index segment is equal to the number of data items in the corresponding data segment in the same bucket.

In bucket B_k, the i-th index item (i.e., I_k(i)) contains (1) the identifier and the version number of the corresponding data item in bucket B_k (i.e., D_k(i)), (2) the time offset at which D_k(i) will be broadcast and (3) the size of D_k(i). The number of index items within an index segment is called the degree of the broadcast program. In [18], the degree of all buckets is fixed, and the experimental results suggest setting the degree of broadcast programs to two for better performance.
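As an illustration of the bucket layout, the following Python sketch builds the index segment for one bucket. The names `DataItem`, `IndexItem` and `build_bucket` are hypothetical, and the sketch assumes that offsets are measured in the same units as sizes, counted from the start of the bucket, and that every index item has the same fixed size; none of these details is specified in [18] as summarized here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataItem:
    object_id: str
    version: int
    size: int  # size of the data item (unit is an assumption)


@dataclass
class IndexItem:
    object_id: str  # identifier of the corresponding data item
    version: int    # version number of the corresponding data item
    offset: int     # offset at which the data item will be broadcast
    size: int       # size of the corresponding data item


def build_bucket(items: List[DataItem],
                 index_item_size: int = 16) -> Tuple[List[IndexItem], List[DataItem]]:
    """Return (index segment, data segment) for one bucket.

    The degree of the bucket is len(items): one index item per data item.
    """
    # The data segment starts right after the whole index segment.
    offset = len(items) * index_item_size
    index_segment = []
    for item in items:
        index_segment.append(
            IndexItem(item.object_id, item.version, offset, item.size))
        offset += item.size
    return index_segment, list(items)
```

A client that reads the index segment can compute from `offset` how long it may stay in doze mode before the desired data item arrives.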


2.2.2 Prior Work Related to Transcoding Proxy

Han et al. proposed in [13] an image transcoding proxy which is able to control the data retrieval time to meet users’ requirements. The proposed transcoding proxy can adaptively adjust the sizes of the objects transmitted to users by using an aggressive lossy compression method. They also presented an analytical framework for determining whether to transcode and how much to transcode an image, and a process used by the transcoding proxy to adapt its image coding to meet an upper bound on the delay tolerated by the end user.

In [7], Cardellini et al. analyzed how network proxies can work collaboratively in content transcoding and caching. They proposed a distributed algorithm to distribute the computation load caused by transcoding throughout a collaborative proxy system. They also proposed two extended strategies to cache data objects. In [8], Chang et al. explored the aggregate effect when caching multiple versions of the same Web object in the transcoding proxy. They argued that the aggregate profit of caching multiple versions of an object is not simply equal to the sum of the profits of caching individual versions, but rather, depends on the transcoding relationships among them. They devised the notion of a weighted transcoding graph and formulated a generalized profit function. Based on the weighted transcoding graph and the generalized profit function, an innovative cache replacement algorithm for transcoding proxies was proposed, and the proposed cache replacement algorithm was shown to perform well in terms of the delay saving ratios and cache hit ratios.

Hsiao et al. proposed the architecture of versatile transcoding proxy in [14]. Based on the concept of the agent system, the proposed architecture can accept and execute the transcoding preference script provided by the client or the server to transform the corresponding data or protocol according to the user’s specification. Fine granularity control is achieved by building a weighted transcoding graph which depicts the transcoding relationship among transcodable versions dynamically. Based on the weighted transcoding graph, the transcoding proxy performs cache replacement according to the content in the caching candidate set, which is generated by the concept of dynamic programming.

In an earlier study [15], we proposed a QoS-aware transcoding proxy architecture that uses on-demand broadcast to transmit the requested data objects. However, the issue of energy conservation was not considered. Therefore, for energy conservation, we in this paper extend the prior architecture to support data indexing techniques. In addition, we also revise the version decision policy and the service admission control scheme proposed in [15] for better performance.

Figure 4: The architecture of QETP

2.3 System Architecture

Figure 4 shows the proposed architecture of QETP. In a cellular environment, the whole service area of a mobile environment is divided into a number of cells. Two dedicated channels, one control channel and one broadcast channel, are provided in each cell. A control channel is used to transmit control messages such as registration messages, data requests, acknowledgements, and so on. On the other hand, a broadcast channel is used by the transcoding proxy to disseminate data objects to its clients. According to their locations, the components of QETP are of the following two types: front-end and back-end.

A front-end, which comprises a service manager and a scheduler, is allocated to each cell. These two components are described below.

• Service Manager: A service manager is in charge of all service-related operations such as service registration, service termination, service admission control and so on. Each service manager owns a profile database storing the users’ profiles and the profiles of these users’ devices.


• Scheduler: A scheduler is a software component which handles the data requests of the corresponding cell. After receiving a data request, the scheduler will first determine a suitable version for this data request according to the user’s device profile and the network state. Then, the scheduler will check whether the received data request can be merged into an existing data request in the data request queue. Different from the traditional on-demand broadcasting architecture described in Section 2.1, request merge occurs only when there exists another data request in the data request queue asking for the same version of the same data object as the received data request. Otherwise, the scheduler will insert the received data request into the data request queue.

In addition, a scheduling algorithm is employed to determine the service order of the data requests in the data request queue. While serving a data request, the scheduler will send this request to the cache manager and the cache manager will respond with the content of the required data object. The scheduler then broadcasts the required data object via the broadcast channel, and serves the next data request in the data request queue. Moreover, the scheduler will broadcast index items through the broadcast channel to reduce the power consumption of mobile clients.

A back-end, which comprises a cache manager and a transcoder, behaves like a traditional transcoding proxy. These two components are described below.

• Cache Manager: After receiving a data request from a scheduler, the cache manager is responsible for returning the required version of the required data object to the scheduler. Suppose that the cache manager receives a data request of the j-th version of data object D_i. If the j-th version of D_i is cached, the cache manager will return the cached data object to the scheduler immediately. If the j-th version of D_i is not cached, the cache manager will check whether there exists another version of D_i which can be transcoded into the j-th version of D_i. If yes, the cache manager will ask the transcoder to generate the j-th version of D_i. Otherwise, the cache manager will request the original version of the requested data object from the data server, ask the transcoder to transform the returned data object into the required version, and then transmit the result of transcoding to the scheduler.


• Transcoder: A transcoder is in charge of the transformation of data objects among different versions according to the received transformation requests generated by the cache manager.
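The lookup order followed by the cache manager (cached version, then a transcodable cached version, then the original from the data server) can be sketched as follows. This is an illustrative sketch only: the class `CacheManager` and the three callables passed to it are hypothetical, and storing each transcoding result in the cache is an assumption that the description above does not state.

```python
class CacheManager:
    """Hypothetical sketch of the cache manager's lookup order."""

    def __init__(self, fetch_original, transcode, can_transcode):
        # fetch_original(object_id) -> (content, version) from the data server
        # transcode(content, src_version, dst_version) -> content
        # can_transcode(src_version, dst_version) -> bool
        self._fetch_original = fetch_original
        self._transcode = transcode
        self._can_transcode = can_transcode
        self._cache = {}  # (object_id, version) -> content

    def get(self, object_id, version):
        key = (object_id, version)
        # Case 1: the requested version is cached.
        if key in self._cache:
            return self._cache[key]
        # Case 2: another cached version can be transcoded into it.
        for (oid, src), content in list(self._cache.items()):
            if oid == object_id and self._can_transcode(src, version):
                result = self._transcode(content, src, version)
                self._cache[key] = result  # assumption: results are cached
                return result
        # Case 3: fetch the original version and transcode it.
        original, src = self._fetch_original(object_id)
        result = self._transcode(original, src, version)
        self._cache[key] = result  # assumption: results are cached
        return result
```

The data server is contacted only in the third case, so in this sketch every cache hit or transcodable hit avoids a round trip to the server.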

Since the design of the back-end is similar to the systems proposed in some prior works [7][8][13][25], we focus in this paper on the design of the front-end.

3 Analytical and Transcoding Models

3.1 Analytical Model

In this subsection, we derive the worst case of the average access time¹ of QETP, and use the derived results to propose a version decision policy and a service admission control scheme in Section 4. To facilitate the following discussion, we first make the following assumptions.

1. The employed scheduling scheme of the scheduler is FCFS (standing for first come, first serve).

2. No request merge occurs in the data request queue of the scheduler.

3. One transmission of a data object in the broadcast channel is received by exactly one client.

4. The messages of registration, de-registration and handoff are negligible.

Assumptions 2 and 3 hold when the users’ interests are highly diverse, and hence the benefit of on-demand broadcast diminishes. We make these two assumptions since we focus on the worst case of the transcoding proxy. Assumption 4 is made since we focus on the situation that the number of data requests is much higher than the number of control messages (i.e., registration, de-registration, handoff and service termination). These assumptions will be relaxed in our simulation model. For better readability, a list of the symbols used is shown in Table 1.

We model QETP as a queueing network as shown in Figure 5. Queue 2 is a physical queue which is located in the scheduler. In contrast, Queue 1 and Queue 3 are logical queues which are only used to model the control and broadcast channels in order to derive the average waiting time of a data request on the control and broadcast channels, respectively. Suppose that the data requests submitted by a mobile user i follow a Poisson process with rate $\lambda_i$, and $N_{User}$ is the number of mobile users in the cell. To facilitate the following discussion, we number the mobile users in the cell as user 1, 2, ..., $N_{User}$. Due to the characteristic of the Poisson process, the aggregate data requests of all mobile users in the cell follow a Poisson process with rate $\lambda_{Ctrl.} = \sum_{i=1}^{N_{User}} \lambda_i$. Denote the sizes of data requests and request acknowledgements as $s_{Ctrl.}$ and $s_{Ack.}$, respectively. Also let $B_{Ctrl.}$ be the bandwidth of the control channel, and let the waiting time of the control channel for a data request (denoted as $W_{Ctrl.}$) be the time interval between the user sending a data request and the user receiving the acknowledgement.

¹ In this paper, we use access time and waiting time interchangeably.

Table 1: Description of symbols

Symbol              Description
$P_i$               i-th device profile
$D_j(k)$            k-th version of data item $D_j$
$N_{User}$          Number of users in the cell
$\lambda_{Ctrl.}$   Aggregate request rate in the cell
$\mu_{Ctrl.}$       Service rate of the control channel
$\mu_{Sche.}$       Service rate of the cache
$\mu_{BCast}$       Service rate of the broadcast channel
$\sigma_{Sche.}$    Standard deviation of the service time of the cache
$B_{Ctrl.}$         Bandwidth of the control channel
$B_{BCast}$         Bandwidth of the broadcast channel

Figure 5: The analytical model of the proposed transcoding proxy (Queue 1: control channel, M/M/1; Queue 2: scheduler, M/G/1; Queue 3: broadcast channel, G/M/1)

Then, we have the following lemma.


Lemma 1: The average waiting time of the control channel is

$$W_{Ctrl.} = \frac{1}{\dfrac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}} - \lambda_{Ctrl.}}.$$

Proof: Similar to [19], we assume that the time to transmit a data request and a request acknowledgement over the control channel follows an exponential distribution with mean $1/\mu_{Ctrl.}$. Hence, the control channel can be modeled as an M/M/1 queue. Then, the average service rate of the control channel is

$$\mu_{Ctrl.} = \frac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}}.$$

Omitting the equation manipulation, which can be found in [12], the approximated average waiting time for each mobile device from submitting a data request to receiving the corresponding request acknowledgement is

$$W_{Ctrl.} = \frac{1}{\mu_{Ctrl.} - \lambda_{Ctrl.}} = \frac{1}{\dfrac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}} - \lambda_{Ctrl.}}. \tag{1}$$

Q.E.D.

Let the waiting time of the scheduler for a data request (denoted as $W_{Sche.}$) be, from the scheduler’s perspective, the time interval from the arrival of the data request to the time that the requested data object has been obtained. Since the service time of a cache manager is affected by several factors such as cache status of the required data objects, the employed replacement scheme, the characteristics of the input jobs, and so on, the service time of the cache manager cannot be modeled by a particular mathematical distribution. Therefore, we model the service time of the cache manager as an arbitrary distribution with mean $1/\mu_{Sche.}$ and variance $\sigma_{Sche.}^2$. Let $\rho_{Sche.} = \lambda_{Ctrl.}/\mu_{Sche.}$ be the load of the scheduler.

We then have the following lemma.

Lemma 2: The average waiting time of the scheduler is

$$W_{Sche.} = \frac{1}{\mu_{Sche.}} + \frac{\dfrac{\rho_{Sche.}}{\mu_{Sche.}} + \lambda_{Ctrl.}\,\sigma_{Sche.}^2}{2(1 - \rho_{Sche.})}.$$

Proof: With assumptions 1 and 2 and the characteristic of M/M/1 queues, the input process seen by the data request queue of the scheduler is also a Poisson process with rate $\lambda_{Ctrl.}$. When receiving a data request, the scheduler determines the most suitable version of the requested data object according to the profile of the mobile device and network status, and then inserts the corresponding job (including data object id and the most suitable version number) into the data request queue. To serve a data request, the scheduler passes the job to the cache manager, and the cache manager will retrieve the specified version of the data object requested by the job and return the retrieved data object to the scheduler. Then, the scheduler disseminates the returned data object to its clients via the broadcast channel.

With assumption 2, the processing of the scheduler can be modeled as an M/G/1 queue. Then, as shown in [12], the expected system size in steady state is

$$L_{Sche.} = \rho_{Sche.} + \frac{\rho_{Sche.}^2 + \lambda_{Ctrl.}^2\,\sigma_{Sche.}^2}{2(1 - \rho_{Sche.})}.$$

By Little’s formula, the average waiting time of this queue is

$$W_{Sche.} = \frac{L_{Sche.}}{\lambda_{Ctrl.}} = \frac{1}{\mu_{Sche.}} + \frac{\dfrac{\rho_{Sche.}}{\mu_{Sche.}} + \lambda_{Ctrl.}\,\sigma_{Sche.}^2}{2(1 - \rho_{Sche.})}. \tag{2}$$

Q.E.D.

Let the waiting time of the broadcast channel for a data request be the time interval from the time that the requested data object has been obtained by the scheduler to the time that the user has received it. Then, we have the following lemma.

Lemma 3: The average waiting time of the broadcast channel is

$$W_{BCast} = \frac{1}{\mu_{BCast}(1 - r_0)},$$

where $r_0$ is the root of $z = A^{*}[\mu_{BCast}(1 - z)]$ with value larger than zero and less than one.

Proof: Similar to the proof of Lemma 1, we assume that the time to transmit a data object over the broadcast channel follows an exponential distribution with mean $1/\mu_{BCast}$. Since the broadcast channel is a dedicated downlink channel, similar to [19], we have

$$\frac{1}{\mu_{BCast}} = \frac{\text{Average size of the incoming data objects}}{B_{BCast}}. \tag{3}$$

As shown in Figure 5, the input process of the broadcast channel is the output process of the scheduler. Since the service time of the scheduler (i.e., Queue 2) follows an arbitrary distribution, the output process of the scheduler does not follow a particular mathematical distribution. Suppose that the interarrival time of the input process follows an arbitrary distribution with cumulative distribution function $A(t)$. The broadcast channel can then be modeled as a G/M/1 queue. Let $A^{*}(z)$ be the Laplace-Stieltjes transform of $A(t)$. Omitting the mathematical manipulation, which can be found in [12], the average waiting time of the broadcast channel (denoted as $W_{BCast}$) is

$$W_{BCast} = \frac{1}{\mu_{BCast}(1 - r_0)}, \tag{4}$$

where $r_0$ is the root of the following equation with value larger than zero and less than one.

$$z = A^{*}[\mu_{BCast}(1 - z)] \tag{5}$$

Q.E.D.

Finally, the average waiting time of the whole system (denoted as $W_{Sys.}$) is equal to the summation of the average waiting time of the control channel, the scheduler and the broadcast channel. Then, with Lemmas 1, 2 and 3, $W_{Sys.}$ can be formulated as

$$W_{Sys.} = W_{Ctrl.} + W_{Sche.} + W_{BCast}. \tag{6}$$
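To make the three lemmas concrete, the following Python sketch evaluates Equations (1), (2) and (4). The function names are hypothetical. Two choices in the sketch are assumptions rather than part of the analysis above: the root r₀ of Equation (5) is found by fixed-point iteration starting from zero, and the example interarrival distribution is exponential (so that its Laplace-Stieltjes transform has a closed form), whereas the model itself allows an arbitrary distribution.

```python
def w_ctrl(b_ctrl, s_ctrl, s_ack, lam):
    """Equation (1): M/M/1 average waiting time of the control channel."""
    mu = b_ctrl / (s_ctrl + s_ack)
    if lam >= mu:
        raise ValueError("control channel is overloaded (lambda >= mu)")
    return 1.0 / (mu - lam)


def w_sche(mu, lam, var):
    """Equation (2): M/G/1 average waiting time of the scheduler."""
    rho = lam / mu
    if rho >= 1.0:
        raise ValueError("scheduler is overloaded (rho >= 1)")
    return 1.0 / mu + (rho / mu + lam * var) / (2.0 * (1.0 - rho))


def w_bcast(mu, lst, tol=1e-12, max_iter=100000):
    """Equation (4): G/M/1 average waiting time of the broadcast channel.

    lst(s) is the Laplace-Stieltjes transform A*(s) of the interarrival
    time distribution. The root r0 of z = A*(mu * (1 - z)) is approximated
    by fixed-point iteration from z = 0 (an assumed numerical method).
    """
    z = 0.0
    for _ in range(max_iter):
        z_next = lst(mu * (1.0 - z))
        if abs(z_next - z) < tol:
            z = z_next
            break
        z = z_next
    return 1.0 / (mu * (1.0 - z))


def exponential_lst(lam):
    """A*(s) = lam / (lam + s) for exponential interarrival times (example only)."""
    return lambda s: lam / (lam + s)


def w_sys(w1, w2, w3):
    """Equation (6): total average waiting time."""
    return w1 + w2 + w3
```

With exponential interarrival times the root of Equation (5) reduces to the ratio of the arrival rate to $\mu_{BCast}$, which gives a convenient sanity check for the iteration.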

3.2 Transcoding Model

Suppose that the mobile devices are classified into several categories based on their capabilities, and the capabilities of each category are described by one device profile. Let P_i be the i-th device profile.

Without loss of generality, we order the device profiles according to their capabilities in ascending order. That is, the capability of P_i is better than that of P_j when i > j. We also let D_i(j) be the j-th version of data object D_i. Again, we order all versions of a data object according to their quality in ascending order, which means that the quality of D_i(j) is better than that of D_i(k) when j > k. For each data object, we assume that the data size of a version with higher quality is larger than that of another version with lower quality.

Figure 6: Example device profiles (profiles P₁, P₂ and P₃ over versions 1 to 6)

To facilitate the following discussion, the concept of viewable version set is defined below.

Definition 1: A viewable version set of a device profile P_i and a data object D_j (denoted as VVS(i, j)) is the set of versions of D_j which are able to be displayed by mobile devices with profile P_i.

Then, we have the following example.

Example 1: Consider the example shown in Figure 6. Mobile devices are classified into three categories: notebook, PDA and smart phone, and their capabilities are described in device profiles P₃, P₂ and P₁, respectively. In addition, there are six versions of data object D_j. VVS(3, j), VVS(2, j) and VVS(1, j) are {3, 4, 5, 6}, {3, 4} and {1, 2}, respectively. We have VVS(2, j) ⊂ VVS(3, j) since devices with profile P₃ (e.g., notebooks) are capable of displaying all versions of D_j viewable by devices with profile P₂ (e.g., PDAs). On the other hand, we have VVS(3, j) ∩ VVS(1, j) = ∅ and VVS(2, j) ∩ VVS(1, j) = ∅ since devices with profile P₁ (e.g., smart phones) employ special data formats (e.g., WML and WBMP) that are not supported by devices with profiles P₂ and P₃.

Let the function BEST(i, j) = k (respectively, WORST(i, j) = k) represent that the best (respectively, worst) viewable version of data object D_j for a mobile device with device profile P_i is version k. In practice, we have BEST(i, j) ≥ BEST(l, j) and WORST(i, j) ≥ WORST(l, j) when i > l. We also have BEST(i, j) = max {VVS(i, j)} and WORST(i, j) = min {VVS(i, j)}.

Example 2: Consider the example shown in Figure 6. The best viewable versions of P₃, P₂ and P₁ are D_j(6), D_j(4) and D_j(2), respectively. As a result, we have BEST(3, j) = 6, BEST(2, j) = 4 and BEST(1, j) = 2. In addition, we also have WORST(3, j) = 3, WORST(2, j) = 3 and WORST(1, j) = 1.
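The two examples can be reproduced with a few lines of Python. The dictionary `VVS` below encodes the viewable version sets of Example 1 for a single data object D_j; the names `VVS`, `best` and `worst` are illustrative and the object index j is left implicit.

```python
# Viewable version sets of Example 1, keyed by device profile index i.
VVS = {
    3: {3, 4, 5, 6},  # P3, e.g., notebooks
    2: {3, 4},        # P2, e.g., PDAs
    1: {1, 2},        # P1, e.g., smart phones
}


def best(profile):
    """BEST(i, j): the highest-quality version viewable under profile i."""
    return max(VVS[profile])


def worst(profile):
    """WORST(i, j): the lowest-quality version viewable under profile i."""
    return min(VVS[profile])
```

Evaluating `best` and `worst` for the three profiles yields the values listed in Example 2, and Python's set operators can be used to check the subset and disjointness relations of Example 1.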

Figure 7: The flowchart of scheme ODB-QoS-Index (start; average waiting time estimation; version decision policy configuration; service admission control scheme configuration; next iteration)

Figure 8: The relationship between load and average access time of a queue (axes: system load ρ, with marks ρ₁, ρ₂ and 1, versus average access time W; regions labeled LIGHT, FAIR and HEAVY)

When a user registers for the service, the user’s mobile device will transmit the identifications of the user and the corresponding device profile to the server. Suppose that the device profile of the mobile device is P_i. Then, when the mobile user requests D_j, the server will return a suitable version of D_j, say the k-th version of D_j where k ∈ VVS(i, j), according to the result of the underlying version decision policy.

4 Design of Scheme ODB-QoS-Index

An overview of scheme ODB-QoS-Index is given in Section 4.1. The proposed version decision policy and admission control scheme are described in Section 4.2 and Section 4.3, respectively. Finally, the description of the proposed data indexing method is given in Section 4.4.

4.1 Overview

In this paper, we take the average waiting time of the system as the QoS metric. Before executing scheme ODB-QoS-Index, system administrators should specify a QoS requirement by setting two thresholds of average access time, W₁ and W₂, where W₁ < W₂. The meanings of these two thresholds are as follows. The users are guaranteed to receive the best viewable versions of the requested data objects when the average waiting time is smaller than W₁. On the other hand, scheme ODB-QoS-Index will try its best to prevent the average waiting time from being larger than W₂.


Scheme ODB-QoS-Index is an online, iterative and adaptive algorithm which comprises a version decision policy, a service admission control scheme and a data indexing method. The flowchart of scheme ODB-QoS-Index is shown in Figure 7. Scheme ODB-QoS-Index is executed periodically, and the following three steps are executed in each iteration. First, in the average waiting time estimation step, scheme ODB-QoS-Index measures the average waiting time of each queue according to the analytical results derived in Section 3. Since only Queue 2 is physical, only the average waiting time of Queue 2 (i.e., $W_{Sche.}$) can be directly observed. In view of this, we propose an approximation algorithm to estimate the average waiting times of Queue 1 and Queue 3 (i.e., $W_{Ctrl.}$ and $W_{BCast}$). For better readability, the proposed approximation algorithm is described in Appendix A. Then, scheme ODB-QoS-Index measures the load of each queue based on the estimated average waiting time, and determines the current state of each queue according to the load of each queue. Finally, scheme ODB-QoS-Index configures the version decision policy and the service admission control scheme according to the state of each queue. In addition, a data indexing method is employed by the scheduler to insert index items into the broadcast program to reduce power consumption of mobile clients. The details of scheme ODB-QoS-Index are described in the following subsections.

4.2 Version Decision Policy

4.2.1 Overview

Figure 8 shows the relationship between the average waiting time and the load of a queue. It is intuitive that when the load is larger than or equal to one, the system is not stable since the average waiting time does not converge and grows without bound. In addition, when the load is smaller than one, the average waiting time increases as the load increases, and it rises sharply as the load approaches one.
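The shape of this curve can be illustrated with the textbook M/M/1 queue, whose mean time in system is 1/(µ − λ) for λ < µ. This is only a sketch under that assumption: the queues of QETP analyzed in Section 3 need not be M/M/1, and the function name below is hypothetical.

```python
def mm1_time_in_system(arrival_rate, service_rate):
    """Mean time in system of an M/M/1 queue (illustrative assumption only).

    Returns infinity when the load rho = arrival_rate / service_rate is >= 1,
    because the queue is then unstable.
    """
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    service_rate = 1.0
    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        w = mm1_time_in_system(rho * service_rate, service_rate)
        print(f"rho = {rho:.2f}  ->  W = {w:.1f}")
```

Equal steps in load cost far more waiting time near ρ = 1 than at light load, which is why the thresholds ρ₁ and ρ₂ are placed below one.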

With the above observations, the rationale of our scheduling algorithm is to keep the system loads of the scheduler (i.e., Queue 2 in Figure 5) and the broadcast channel (i.e., Queue 3 in Figure 5) smaller than the predetermined thresholds at the cost of degrading the quality of requested data objects. As a consequence, when the load of the scheduler or the load of the broadcast channel is high, for each data request, the system will return a version whose quality is lower than that of the best viewable version. This strategy has the following two effects:

1. Decrease the average waiting time of the broadcast channel ($1/\mu_{BCast}$). Since the data size of a data object with lower quality is usually smaller than that of the same data object with higher quality, transmitting data objects with lower quality is able to reduce the load of the broadcast channel ($\rho_{BCast}$).

2. Increase the occurrence probability of request merge. Consider the device profiles shown in Figure 6, and two data requests of D_j for device profiles P₂ and P₃, respectively. These two data requests will not be merged together when the load of the scheduler or the broadcast channel is light since the system will return the best viewable versions of D_j for P₂ and P₃, respectively. When the load is heavy, the system decides to return the third version of D_j. Hence, these two data requests can be merged together, and the arrival rates of the input processes of the cache and the broadcast channel decrease. As a result, this strategy is able to reduce the load of the cache ($\rho_{Sche.}$) and the broadcast channel ($\rho_{BCast}$).

The proposed version decision policy consists of three phases: state determination phase, candidate version selection phase and version decision phase. First, in the state determination phase, the server determines the states of the scheduler and the broadcast channel according to their loads. Then, in the candidate version selection phase, several versions, called candidate versions, are selected according to the states of the scheduler and the broadcast channel. Finally, in the version decision phase, the server decides the resultant version from the candidate versions according to the content of the request queue and the objects stored in the cache.

4.2.2 State Determination Phase

Two thresholds, $\rho_1^{Sche.}$ and $\rho_2^{Sche.}$ (respectively, $\rho_1^{BCast}$ and $\rho_2^{BCast}$), are specified to divide the load of the scheduler (respectively, the broadcast channel) into three states: LIGHT, FAIR and HEAVY. Figure 9 shows the state transition diagram of the scheduler. The state transition scenarios are as follows. When the previous state is LIGHT, the current state will transit to FAIR if $\rho_{Sche.} > (1+\alpha) \times \rho_1^{Sche.}$. Otherwise, the current state will still be LIGHT. When the previous state is FAIR, the current state will transit to LIGHT when $\rho_{Sche.} < (1-\alpha) \times \rho_1^{Sche.}$ and transit to HEAVY when $\rho_{Sche.} > (1+\alpha) \times \rho_2^{Sche.}$. Otherwise, the current state will still be FAIR. When the previous state is HEAVY, the current state will transit to FAIR if $\rho_{Sche.} < (1-\alpha) \times \rho_2^{Sche.}$. Otherwise, the current state will still be HEAVY. The factor α, where 0 < α < 1, is used to avoid state oscillation. We assume that $(1+\alpha) \times \rho_2^{Sche.} < 1$ without loss of generality.

To facilitate fine-grained control, system administrators can divide FAIR state into several sub-states. Suppose that there are n sub-states of FAIR state. The interval $(\rho_1^{Sche.}, \rho_2^{Sche.})$ is then divided into n partitions by n − 1 thresholds, $\rho^{Sche.}(1), \rho^{Sche.}(2), \cdots, \rho^{Sche.}(n-1)$, where

$$\rho^{Sche.}(k) = \rho_1^{Sche.} + k \times \frac{\rho_2^{Sche.} - \rho_1^{Sche.}}{n}.$$

The transition between these sub-states is similar to that between LIGHT, FAIR and HEAVY states. The state transition diagram and transition scenarios of the broadcast channel are as shown in Figure 9 with $\rho_1^{BCast}$ and $\rho_2^{BCast}$ substituted for $\rho_1^{Sche.}$ and $\rho_2^{Sche.}$, respectively. The determination of the values of $\rho_1^{Sche.}$, $\rho_2^{Sche.}$, $\rho_1^{BCast}$ and $\rho_2^{BCast}$ is described in Appendix B.

Figure 9: State transition diagram (states LIGHT, FAIR and HEAVY, with FAIR optionally refined into sub-states FAIR₁, FAIR₂, · · · , FAIRₙ; a transition to the next heavier state is taken when the load exceeds (1 + α) times the threshold separating the two states, a transition to the next lighter state is taken when the load falls below (1 − α) times that threshold, and the state is unchanged otherwise)
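The transition rules with the hysteresis factor α can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the integer encoding of states (0 for LIGHT, 1 to n for the FAIR sub-states, n + 1 for HEAVY), the function names, and the choice to move at most one level per iteration are assumptions made here.

```python
def sub_thresholds(rho1, rho2, n):
    """Thresholds rho(1), ..., rho(n-1) that split (rho1, rho2) into n partitions."""
    return [rho1 + k * (rho2 - rho1) / n for k in range(1, n)]


def next_level(level, rho, rho1, rho2, n=1, alpha=0.1):
    """Return the new state level given the previous level and the measured load.

    Levels: 0 = LIGHT, 1..n = FAIR sub-states, n + 1 = HEAVY (hypothetical encoding).
    bounds[k] is the threshold separating level k from level k + 1.
    """
    bounds = [rho1] + sub_thresholds(rho1, rho2, n) + [rho2]
    # Move to the next heavier level only if the load clearly exceeds the threshold.
    if level <= n and rho > (1 + alpha) * bounds[level]:
        return level + 1
    # Move to the next lighter level only if the load clearly falls below it.
    if level >= 1 and rho < (1 - alpha) * bounds[level - 1]:
        return level - 1
    return level


if __name__ == "__main__":
    level = 0
    for rho in (0.40, 0.54, 0.56, 0.52, 0.90, 0.75, 0.70, 0.44):
        level = next_level(level, rho, rho1=0.5, rho2=0.8)
        print(f"rho = {rho:.2f} -> level {level}")
```

With α = 0.1 and a LIGHT/FAIR threshold of 0.5, a load hovering between 0.45 and 0.55 never changes the state, which is the oscillation the factor α is meant to suppress.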

We also define the aggregate state of the scheduler and the broadcast channel as follows. The aggregate state is LIGHT when the states of the scheduler and the broadcast channel are both LIGHT. The aggregate state is HEAVY when the state of at least one of the scheduler and the broadcast channel is HEAVY. Otherwise, the aggregate state is FAIR. In FAIR state, the current sub-state is determined as the heaviest of the current sub-states (i.e., the heaviest load) of the scheduler and the broadcast channel. For each incoming data request, the scheduler will decide a suitable version according to the aggregate state, fill the version information into the data request, and insert it into the data request queue. The scheduler will also inform the mobile client of the decided version by replying with an acknowledgement message.
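If each state is encoded as an integer level (0 for LIGHT, 1 to n for the FAIR sub-states, n + 1 for HEAVY), the aggregate state defined above reduces to taking the heavier of the two levels. The encoding and the helper names below are assumptions used only for illustration.

```python
def aggregate_level(sche_level, bcast_level):
    """Aggregate state of the scheduler and the broadcast channel.

    LIGHT (0) only if both are LIGHT, HEAVY (n + 1) if either is HEAVY,
    otherwise the heaviest FAIR sub-state of the two.
    """
    return max(sche_level, bcast_level)


def level_name(level, n):
    """Human-readable name of a level for n FAIR sub-states."""
    if level == 0:
        return "LIGHT"
    if level == n + 1:
        return "HEAVY"
    return f"FAIR{level}"


if __name__ == "__main__":
    n = 3
    for sche, bcast in ((0, 0), (0, 2), (1, 3), (4, 0)):
        agg = aggregate_level(sche, bcast)
        print(level_name(sche, n), level_name(bcast, n), "->", level_name(agg, n))
```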


4.2.3 Candidate Version Selection Phase

Let degradation and maxDegradation indicate the suggested and maximal degrees of degradation, respectively. The value of maxDegradation is determined by

$$maxDegradation = \max_{\forall P_k, D_j} \{ BEST(k, j) - WORST(k, j) \}.$$

In the candidate version selection phase, the server will determine a proper value of degradation according to the state of the server, and versions BEST(k, j), BEST(k, j) − 1, · · · , BEST(k, j) − degradation are called candidate versions. The procedure of the candidate version selection phase is described below.

• Case I: Aggregate state is LIGHT.

The scheduler operates in the traditional on-demand broadcast mode when the aggregate state is LIGHT. Hence, the server guarantees that each client will receive the best viewable versions of the requested data objects. That is, the system will return the BEST(i, j)-th version of D_j when a user requests D_j by a mobile device belonging to device profile P_i. Therefore, the value of degradation is set to zero.

• Case II: Aggregate state is FAIR.

In FAIR state, the quality of the received data objects may be degraded. Suppose that FAIR state consists of n sub-states. Then, the value of degradation is set to $\lceil k \times maxDegradation / (n+1) \rceil$ when the server is in the k-th sub-state of FAIR state.

• Case III: Aggregate state is HEAVY.

When the aggregate state is HEAVY, the server will suggest returning the WORST(i, j)-th version of D_j when a user requests D_j by a mobile device belonging to device profile P_i. Therefore, the value of degradation is set to maxDegradation.
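The three cases can be sketched as one function of the aggregate state. The integer level encoding (0 for LIGHT, k for the k-th FAIR sub-state, n + 1 for HEAVY) and all names are assumptions for illustration. The sketch also assumes that candidate versions falling outside the viewable version set are discarded, which the text does not state explicitly but which follows from the requirement that the returned version be viewable.

```python
import math


def max_degradation(vvs):
    """maxDegradation: the largest BEST - WORST gap over all (profile, object) pairs."""
    return max(max(versions) - min(versions) for versions in vvs.values())


def degradation(level, n, max_deg):
    """Degradation for aggregate level 0 (LIGHT), 1..n (FAIR sub-states), n + 1 (HEAVY)."""
    if level == 0:          # Case I
        return 0
    if level == n + 1:      # Case III
        return max_deg
    return math.ceil(level * max_deg / (n + 1))  # Case II


def candidate_versions(viewable, deg):
    """BEST, BEST - 1, ..., BEST - deg, best first, restricted to viewable versions."""
    top = max(viewable)
    return [v for v in range(top, top - deg - 1, -1) if v in viewable]


# Viewable version sets from Example 1, keyed by (profile, object).
VVS = {(3, "j"): {3, 4, 5, 6}, (2, "j"): {3, 4}, (1, "j"): {1, 2}}

if __name__ == "__main__":
    max_deg = max_degradation(VVS)
    for level in range(5):  # n = 3 FAIR sub-states
        deg = degradation(level, 3, max_deg)
        print(level, deg, candidate_versions(VVS[(3, "j")], deg))
```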

4.2.4 Version Decision Phase

In this phase, the server should pick a proper one from the candidate versions (i.e., BEST(i, j), BEST(i, j) − 1, · · · , BEST(i, j) − degradation). Suppose that the incoming request is for D_j and comes from a mobile device with device profile P_i. The steps of the decision are as follows.

• Step I: In this step, the server checks the data requests in the request queue. If the request queue contains a data request for D_j, say Req, with version v, where BEST(i, j) − degradation ≤ v ≤ BEST(i, j), version v is selected since the incoming request can be merged into Req without increasing the load of the server. The server will perform Step II if there is no such data request in the request queue.

• Step II: In this step, the server checks the objects stored in the cache. If there is an object D_j(v), where BEST(i, j) − degradation ≤ v ≤ BEST(i, j), stored in the cache, version v is selected so that the server need neither retrieve D_j from its data server nor perform transcoding. The server will perform Step III if there is no such object in the cache.

• Step III: Select the version v which is covered by the most profiles among versions BEST(i, j), BEST(i, j) − 1, · · · , BEST(i, j) − degradation. Although the server load cannot be reduced by this decision, the probability that successive requests can be merged will increase.
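The three steps can be sketched as a single function. The data layout (requests and cached objects as `(object, version)` pairs) and all names are hypothetical, and two tie-breaking choices are assumptions not specified in the text: when several candidates match in a step, the one with the highest quality is preferred, and ties in profile coverage are also broken toward higher quality.

```python
def coverage(vvs_by_profile, version):
    """Number of device profiles whose viewable version set contains `version`."""
    return sum(1 for versions in vvs_by_profile.values() if version in versions)


def decide_version(obj, candidates, request_queue, cache, vvs_by_profile):
    """Pick one version of `obj` from `candidates` (ordered best first).

    request_queue: iterable of (object, version) pairs of pending requests.
    cache: iterable of (object, version) pairs currently cached.
    """
    # Step I: merge into a pending request for the same object and version.
    queued = {v for (o, v) in request_queue if o == obj}
    for v in candidates:
        if v in queued:
            return v
    # Step II: reuse a cached version, avoiding retrieval and transcoding.
    cached = {v for (o, v) in cache if o == obj}
    for v in candidates:
        if v in cached:
            return v
    # Step III: choose the version viewable by the most profiles.
    return max(candidates, key=lambda v: coverage(vvs_by_profile, v))


if __name__ == "__main__":
    vvs = {3: {3, 4, 5, 6}, 2: {3, 4}, 1: {1, 2}}  # Example 1
    candidates = [6, 5, 4, 3]                      # profile P3, degradation = 3
    print(decide_version("j", candidates, [("j", 5)], {("j", 3)}, vvs))
    print(decide_version("j", candidates, [], {("j", 3)}, vvs))
    print(decide_version("j", candidates, [], set(), vvs))
```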

4.3 Service Admission Control Scheme

The proposed service admission control scheme consists of two phases: state determination phase and admission control phase. To perform service admission control, the server first determines the state of the control channel in state determination phase, and then determines whether to grant a service registration or a service handoff in admission control phase. The procedures of these two phases are described in the following subsections.

4.3.1 State Determination Phase

The proposed service admission control scheme is employed in each service manager to determine whether to grant a service registration or a service handoff by considering the number of users in service, the network status, and so on. The rate at which service registrations are blocked is called the service blocking rate (abbreviated as SBR), while the rate at which service handoffs are forced to terminate is called the service dropping rate (abbreviated as SDR). The rationale of our service admission control scheme is to keep the system load of the control channel (i.e., Queue 1 in Figure 5) smaller than the predetermined thresholds at the cost of increasing SBR and SDR. To achieve this, two thresholds, $\rho_1^{Ctrl.}$ and $\rho_2^{Ctrl.}$ where $\rho_1^{Ctrl.} < \rho_2^{Ctrl.} < 1$, are specified to divide the load of the control channel into three states: LIGHT, FAIR and HEAVY. The state transition diagram and transition scenarios of the service manager are as shown in Figure 9 with $\rho_1^{Ctrl.}$ and $\rho_2^{Ctrl.}$ substituted for $\rho_1^{Sche.}$ and $\rho_2^{Sche.}$, respectively. Similarly, the determination of $\rho_1^{Ctrl.}$ and $\rho_2^{Ctrl.}$ is described in Appendix B.

4.3.2 Admission Control Phase

Although the proposed version decision policy can reduce the loads of the scheduler and the broadcast channel, its effect is limited since it depends on several factors such as the locality of data requests, the cache size and so on. As a consequence, in addition to the load of the control channel, the service admission control scheme should also take the loads of the scheduler and the broadcast channel into consideration. The procedure of the admission control phase is as follows.

When the load of the control channel is HEAVY, the server will block all service registrations and drop all service handoffs in order to relieve the server load. When the load of the control channel is FAIR or LIGHT, the server will determine the values of two probabilities, Prob_Block and Prob_Drop. Then, a service registration will be blocked with probability Prob_Block, while a service handoff will be dropped with probability Prob_Drop. It is the system administrators’ responsibility to specify how the values of Prob_Block and Prob_Drop are determined. Let curState_Ctrl. be the current state of the control channel, and let curState_Agg. be the aggregate state of the scheduler and the broadcast channel. Note that SBR should be sacrificed first since mobile users can tolerate a blocked service registration better than a service handoff that is forced to terminate (i.e., dropped). Therefore, in each combination of curState_Ctrl. and curState_Agg., Prob_Block should be larger than or equal to Prob_Drop. An example setting for determining Prob_Block and Prob_Drop in an environment with three sub-states in FAIR state is given in Table 2.

Consider the case that the server decides to reject a service registration or a service handoff since the server cannot afford the additional load. If the owner of the service registration or the service handoff, say user i, has interests similar to those of other users using the service, granting this service registration or service handoff will not increase the server load since all of user i’s requests are expected to be merged into other users’ requests. Hence, to decrease SBR and SDR, the server should grant user i’s service registration or service handoff. From the above example, we observe that we can aggressively grant a service registration or a service handoff as long as the owner and other users are of common interest.

Table 2: An example setting for determining Prob_Block and Prob_Drop (each entry is Prob_Block/Prob_Drop)

                    curState_Agg.
curState_Ctrl.      LIGHT    FAIR₁     FAIR₂     FAIR₃       HEAVY
LIGHT               0/0      0/0       0.33/0    0.66/0.15   1/0.3
FAIR                0/0      0.25/0    0.5/0     0.75/0.3    1/0.6

To measure the similarity of interest between user i and other users using the service, we define the similarity factor as the probability that a user’s request will be merged into another request. When receiving a data request, the server will check whether the data request is merged into another request and update the user’s similarity factor stored in the user’s profile. The system administrators have to specify a threshold β, 0 ≤ β ≤ 1, so that a service registration or a service handoff will be granted (even if the server cannot afford it) as long as the value of the owner’s similarity factor is larger than or equal to β.
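The admission control phase, with the example probabilities of Table 2 and the similarity-factor override, might be sketched as follows. The function signature, the string state names and the injectable random source are assumptions for illustration. The sketch also assumes that the similarity check takes precedence even when the control channel is HEAVY, which is one reading of "even if the server cannot afford it".

```python
import random

# Table 2: curState_Ctrl. -> curState_Agg. -> (Prob_Block, Prob_Drop).
TABLE2 = {
    "LIGHT": {"LIGHT": (0, 0), "FAIR1": (0, 0), "FAIR2": (0.33, 0),
              "FAIR3": (0.66, 0.15), "HEAVY": (1, 0.3)},
    "FAIR": {"LIGHT": (0, 0), "FAIR1": (0.25, 0), "FAIR2": (0.5, 0),
             "FAIR3": (0.75, 0.3), "HEAVY": (1, 0.6)},
}


def admit(kind, ctrl_state, agg_state, similarity, beta, rng=random.random):
    """Return True if a 'registration' or 'handoff' is granted.

    similarity: the owner's similarity factor; beta: administrator threshold.
    rng: callable returning a float in [0, 1), injectable for testing.
    """
    # Users whose requests are likely to be merged are granted regardless of load.
    if similarity >= beta:
        return True
    # A HEAVY control channel blocks all registrations and drops all handoffs.
    if ctrl_state == "HEAVY":
        return False
    prob_block, prob_drop = TABLE2[ctrl_state][agg_state]
    prob_reject = prob_block if kind == "registration" else prob_drop
    return rng() >= prob_reject


if __name__ == "__main__":
    print(admit("registration", "FAIR", "FAIR2", similarity=0.2, beta=0.8))
    print(admit("handoff", "FAIR", "FAIR2", similarity=0.2, beta=0.8))
```

In every cell of the table Prob_Block is at least Prob_Drop, so registrations are always sacrificed before handoffs.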

4.4 Data Indexing

As shown in [18], setting the degree of broadcast programs to a smaller value will make mobile devices meet index segments more quickly, thus reducing energy consumption. However, this is true only when turning WNIs on and off does not consume energy. As pointed out in [24], in reality turning WNIs on and off consumes some time and energy, and the transition times of a WNI from active mode to doze mode and from doze mode to active mode are both on the order of tens of milliseconds.

Consider the two organizations of index and data items shown in Figure 10. Note that the time periods marked as ‘A’ and ‘D’ indicate the time periods in which the mobile device is in active and doze mode, respectively, while the time periods marked as ‘F’ and ‘N’ indicate the time periods in which the mobile device is turning off and turning on the wireless network interfaces (abbreviated as WNIs), respectively. Suppose that a mobile device tunes to the broadcast channel at time t_Start and finishes the retrieval of the desired data item at time t_End. As observed in Figure 10, when the value of degree of broadcast programs decreases, mobile devices will switch back and forth between active and doze modes (i.e., turn on and turn off WNIs) more frequently, and therefore may consume more energy. As a result, the value of degree of broadcast programs should be set to a proper value to minimize energy consumption of mobile devices.

Figure 10: Example organizations of index and data items. (a) Example broadcast program with degree four: I₁ I₂ I₃ I₄ D₁ D₂ D₃ D₄; from t_Start to t_End (retrieval of D₃) the device goes through the periods A, F, D, N, A. (b) Example broadcast program with degree one: I₁ D₁ I₂ D₂ I₃ D₃ I₄ D₄; from t_Start to t_End (retrieval of D₃) the device goes through the periods A, F, D, N, A, F, D, N, A.

In view of this, we adopt an adaptive data indexing method [16] which is able to dynamically adjust the degree of broadcast programs according to system workload. The employed data indexing method consists of two phases, statistics collection phase and degree adjustment phase, and switches back and forth between these two phases periodically. In the statistics collection phase, the system collects the arrival time, finish time and other statistical information of each served data request. Then, in the subsequent degree adjustment phase, the server determines a proper value of degree of broadcast programs according to the collected information. In the interest of space, we omit the description of the determination of the value of degree of broadcast programs. Interested readers can refer to [16] for details.

After determining the current value of degree of broadcast programs, the server then generates the broadcast program accordingly. Since the data items are of different sizes, we use the parameter budget, which is defined as the maximal length of the data segments of all buckets, to control the length of each bucket. Initially, the bucket is empty and the scheduler fetches as many data items as possible from the cache under the constraint that the summation of the sizes of the fetched data items is smaller than or equal to budget. In addition, the scheduler marks the fetched data items as LOCKED. Then, the scheduler inserts the corresponding index items in front of these data items. Finally, the scheduler broadcasts the index and data items in the bucket sequentially. An index item or a data item is removed from the bucket once it has been broadcast. In addition, the state of a data item which has been broadcast is marked as UNLOCKED. The above procedure repeats until the bucket becomes empty. To employ data indexing, the cache replacement policy should also be modified to consider only data items in UNLOCKED state as the candidates to be replaced.
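The bucket construction and broadcast loop described above can be sketched as follows. The dictionary layout of a data item and the function names are hypothetical, and the sketch assumes that items are fetched in scheduling order and that fetching stops at the first item that would exceed budget; the text only requires that the total size not exceed budget.

```python
def fill_bucket(pending, budget):
    """Fetch data items (in scheduling order) whose total size is <= budget.

    pending: list of dicts with keys 'id', 'size' and 'state'.
    Returns the broadcast program of the bucket: index items first, then data items.
    """
    fetched, used = [], 0
    for item in pending:
        if used + item["size"] > budget:
            break
        item["state"] = "LOCKED"  # protect the item from cache replacement
        fetched.append(item)
        used += item["size"]
    index_items = [("index", it["id"]) for it in fetched]
    data_items = [("data", it["id"]) for it in fetched]
    return index_items + data_items


def broadcast_bucket(program, items_by_id):
    """Broadcast the bucket sequentially, unlocking each data item once sent."""
    sent = []
    while program:
        kind, item_id = program.pop(0)  # removed from the bucket once broadcast
        sent.append((kind, item_id))
        if kind == "data":
            items_by_id[item_id]["state"] = "UNLOCKED"
    return sent


def replacement_candidates(cache_items):
    """Only UNLOCKED items may be evicted by the cache replacement policy."""
    return [it["id"] for it in cache_items if it["state"] == "UNLOCKED"]


if __name__ == "__main__":
    cache = [{"id": "a", "size": 4, "state": "UNLOCKED"},
             {"id": "b", "size": 3, "state": "UNLOCKED"},
             {"id": "c", "size": 5, "state": "UNLOCKED"}]
    program = fill_bucket(cache, budget=8)
    print(program, replacement_candidates(cache))
    print(broadcast_bucket(program, {it["id"]: it for it in cache}))
```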

4.5 Remarks

Currently, the proposed version decision policy and service admission control scheme are designed with the goal of reducing the overall average waiting time and average tuning time. Therefore, if two users each submit a data request for the same data object at the same time, their priorities and version numbers will be the same.

It is possible to implement differentiated QoS control in the proposed architecture. For example, we can add a classifier in front of the scheduler to classify the received data requests according to some administrator-specified rules. Hence, the version decision policy is able to assign their version numbers according to their classes. In addition, when processing a service registration or a service handoff, the server first classifies the service according to the user’s profile, and then takes action according to the user’s class. Consider the case that the server receives two service registrations. Suppose that one is submitted by a VIP user, and the other is submitted by a normal user. The latter will be rejected if the server can accept only one service registration.

5 Performance Evaluation

To evaluate the performance of scheme ODB-QoS-Index, we build an event-driven simulator with SIM [5]. In order to measure the reduction of power consumption of scheme ODB-QoS-Index, we also implement scheme ODB-QoS which only employs the proposed version decision policy and service admission control scheme. Both scheme ODB-QoS-Index and scheme ODB-QoS are executed period-