A QoS-aware and energy-conserving transcoding proxy using on-demand data broadcasting


Jiun-Long Huang and Ming-Syan Chen, Fellow, IEEE

Abstract—Most research works in transcoding proxies in mobile computing environments are on the basis of the traditional client-server architecture and do not employ the data broadcast technique. In addition, the issues of QoS provision and energy conservation are also not addressed in the prior studies. In view of this, we design in this paper a QoS-aware and energy-conserving transcoding proxy by utilizing the on-demand broadcasting technique. We first propose a QoS-aware and energy-conserving transcoding proxy architecture, abbreviated as QETP, and model it as a queuing network consisting of three queues. By analyzing the queuing network, three lemmas are derived to estimate the load of these queues. We then propose a version decision policy and a service admission control scheme to provide QoS in QETP. The derived lemmas are used to guide the execution of the proposed version decision policy and service admission control scheme to achieve the given QoS requirement. In addition, we also propose a data indexing method to reduce the power consumption of clients. To measure the performance of the proposed architecture, three experiments are conducted. Experimental results show that the average access time reduction of the proposed scheme over the traditional client-server architecture ranges from 45 percent to 75 percent. Experimental results also show that the proposed scheme is more scalable than the traditional client-server architecture and is able to effectively control the system load to attain the given QoS requirements. In addition, the proposed scheme is able to greatly reduce the average tuning time of clients at the cost of a slight increase (around 5 percent in our experiments) in average access time.

Index Terms—Transcoding proxy, QoS, energy-conservation, data broadcast, on-demand broadcast.


1 INTRODUCTION

In a pervasive computing environment, due to the constraints resulting from power-limited mobile devices and low-bandwidth wireless networks, designing a power conserving mobile information system with high scalability and high bandwidth utilization becomes an important research issue and, hence, attracts a significant amount of research attention. In addition, the high diversity in the capabilities of various mobile devices such as display capabilities (e.g., screen size, color depth, and supported data formats) and computation power makes the design of mobile information systems more challenging. This diversity also results in an increasing demand on the capability of context awareness for mobile information systems.

Content adaptation, which is an important technique to realize context awareness, emerges to remedy the problem resulting from the said diversity by offering different mobile users suitable versions of the same object according to the capabilities of the mobile devices, the traffic of the networks, and the users’ preferences [20]. Transcoding, which transforms a data object from one version into another, is recognized as a promising technique to realize content adaptation [20], [21], [23]. A proxy capable of transcoding (referred to as a transcoding proxy) is placed between a client and an information server to coordinate the mismatch between what the server provides and what the client prefers. Since proxy-based approaches are transparent to the content providers and users, this kind of approach is able to simplify the design of servers and clients and, as a result, attracts much research attention.

In recent years, data broadcast [2], [3], [29] has been employed as an important technique to design a scalable and power conserving mobile information system. However, most research works in transcoding proxies in mobile computing environments are on the basis of the traditional client-server architecture and do not employ the data broadcast technique. Hence, the transcoding proxies are not scalable and the network bandwidth is not well utilized. In addition, most prior studies do not consider the issue of quality of service (abbreviated as QoS), which is crucial in a mobile computing environment.

In addition, as shown in [26], only a modest improvement (20 to 30 percent) in battery lifetime is expected in the next few years. Hence, energy conservation is raised as a key factor of the design of mobile devices. Since data indexing is recognized as a promising means to reduce power consumption [17], many researchers have studied the design of data indexing algorithms in push-based data broadcasting environments [9], [22], [28], [30]. However, most studies on on-demand data broadcasting focus on the design of scheduling algorithms [1], [3], and only a few of them consider the employment of data indexing in on-demand data broadcasting environments [18].

. J.-L. Huang is with the Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan, ROC. E-mail: [email protected].

. M.-S. Chen is with the Electrical Engineering Department, National Taiwan University, Taipei, Taiwan, ROC. E-mail: [email protected].

Manuscript received 22 Oct. 2005; revised 21 Apr. 2006; accepted 13 Nov. 2006; published online 7 Feb. 2007.

For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TMC-0307-1005. Digital Object Identifier no. 10.1109/TMC.2007.1038.

In view of this, we design in this paper a scalable, QoS-aware and energy-conserving transcoding proxy by utilizing the on-demand broadcasting technique. Explicitly, we first propose a QoS-aware and energy-conserving transcoding proxy architecture, abbreviated as QETP, and model it as a queuing network with three queues. By analyzing the queuing network, three lemmas are derived to formulate the average waiting time of these queues. We then devise scheme ODB-QoS-Index to provide QoS in QETP, where ODB-QoS-Index stands for “On-demand Data Broadcasting with QoS and data Indexing.” Scheme ODB-QoS-Index is an online, iterative, and adaptive algorithm comprising

1. a version decision policy to determine the suitable version for each data request according to the users’ device profiles and the state of the server,

2. a service admission control scheme to determine whether to grant a service registration or a service handoff according to the state of the server, and

3. a data indexing method to insert data indices into the broadcast program to reduce the power consumption of clients.

In each iteration, scheme ODB-QoS-Index estimates the average waiting time of each queue based on the derived results, determines the state of each queue according to the corresponding estimation of average waiting time, and configures the behavior of the version decision policy and the service admission control scheme in accordance with the states of these queues to attain the desired QoS. In addition, scheme ODB-QoS-Index inserts index items into the broadcast program to reduce the clients’ power consumption. To measure the performance of QETP, three experiments are conducted. Experimental results show that the average access time reduction of the proposed scheme over the traditional client-server architecture ranges from 45 percent to 75 percent. Experimental results also show that scheme ODB-QoS-Index is more scalable than the traditional client-server architecture and is able to achieve the system administrators’ QoS requirements by the devised version decision policy and the service admission control scheme. In addition, scheme ODB-QoS-Index is able to greatly reduce the average tuning time at the cost of a slight increase (around 5 percent in our experiments) in average access time. Access time is defined as the summation of time periods from the moment that mobile clients submit data requests to the moment that mobile clients receive the requested data items. On the other hand, tuning time is defined as the summation of time periods that mobile clients operate in active mode. The access time is widely used to evaluate the efficiency of broadcast systems, while the tuning time is used to evaluate the power consumption of mobile devices. To the best of our knowledge, there is no prior research on the design of transcoding proxies employing data broadcast. This feature distinguishes this paper from others.

The rest of this paper is organized as follows: The descriptions of related work and the proposed transcoding proxy architecture, QETP, are given in Section 2. An analytical model and a transcoding model are devised in Section 3. Then, Section 4 describes the proposed version decision policy, service admission control scheme, and data indexing method. The performance evaluation is shown in Section 5 and, finally, Section 6 concludes this paper.

2 PRELIMINARIES

2.1 On-Demand Data Broadcasting

Fig. 1 shows an example on-demand broadcasting system. In an on-demand data broadcasting system [1], [3], [4], a server maintains a data request queue and serves these requests according to the employed scheduling algorithm. When requiring one data item, a mobile client sends a data request to the server. After receiving a data request, the server first checks whether there exists another data request in the data request queue with the same required data object. If so, the newly arrived data request is merged into that data request; otherwise, it is inserted into the data request queue. This phenomenon is called request merge. Data requests with the same requested data object can be safely merged since one transmission of the data object in a broadcast channel is able to serve all merged data requests. Therefore, the higher the occurrence probability of request merge is, the more efficient the system is.
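The request merge step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name `RequestQueue`, its method names, and the choice of FCFS ordering are assumptions made only for this example.

```python
from collections import OrderedDict


class RequestQueue:
    """Hypothetical FCFS data request queue with request merge."""

    def __init__(self):
        # Maps object id -> list of client ids waiting for that object.
        # Insertion order of the keys gives the FCFS service order.
        self._pending = OrderedDict()

    def submit(self, client_id, object_id):
        """Add a request; return True if it was merged into a pending one."""
        if object_id in self._pending:
            # Same data object already requested: merge.
            self._pending[object_id].append(client_id)
            return True
        # No pending request for this object: enqueue a new entry.
        self._pending[object_id] = [client_id]
        return False

    def serve_next(self):
        """Remove the oldest entry; one broadcast serves all merged clients."""
        return self._pending.popitem(last=False)

    def __len__(self):
        return len(self._pending)
```

For instance, if clients c1 and c3 both request object A while it is still pending, the queue holds a single entry for A, and one broadcast of A serves both clients.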

A scheduling algorithm is used to prioritize all data requests in the data request queue, and the server will serve these data requests according to their priorities. To serve a data request, the system retrieves the required data object from the corresponding data server and then broadcasts this object to all its clients via a dedicated and shared broadcast channel. As a result, the on-demand broadcast system is more scalable and can obtain higher network utilization than the traditional client-server architecture.

2.2 Related Work

2.2.1 Prior Work Related to On-Demand Data Broadcasting

Dykeman et al. pointed out in [10] that traditional FCFS scheduling would produce a long average access time for an on-demand broadcast system when the access frequencies of all data items were not uniformly distributed. They proposed several scheduling algorithms and concluded that LWF could provide the best performance among the proposed algorithms. Aksoy and Franklin pointed out in [3] that, although it is able to produce the shortest average access time, LWF is not efficient when the number of data requests is large. To address this problem, they proposed algorithm RxW, which is able to schedule the received data requests efficiently by employing a pruning technique. Experimental results showed that the performance (i.e., average access time) of RxW is close to that of LWF. Unfortunately, the algorithm RxW is designed under the premise that each data item is of the same size. Hence, it is not suitable for variable-sized data items. In [1], Acharya and Muthukrishnan addressed the broadcast scheduling problem in the environments with variable-sized data items. They defined a new metric, stretch, as the ratio of the response time of a request to its service time. Based on stretch, they proposed a scheduling algorithm, called LTSF, to minimize the stretch. Wu and Cao argued that algorithm LTSF is not optimal in terms of overall stretch [27]. In addition, algorithm LTSF is not scalable in a large-scale environment. Therefore, they proposed a scheduling algorithm to optimize the system performance in terms of stretch. Moreover, the proposed scheduling algorithm is more scalable than LTSF and, hence, is suitable for practical use.
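To make the stretch metric concrete, the following sketch computes it directly from the definition quoted above (response time divided by service time). The function name and the sample numbers are hypothetical and chosen only for illustration.

```python
def stretch(arrival_time, completion_time, service_time):
    """Stretch of a request: response time divided by service time."""
    if service_time <= 0:
        raise ValueError("service_time must be positive")
    response_time = completion_time - arrival_time
    return response_time / service_time


# Two requests arriving at time 0 (hypothetical numbers):
# a small object needing 1 time unit of service, completed at time 5,
# and a large object needing 5 time units of service, completed at time 10.
small = stretch(arrival_time=0.0, completion_time=5.0, service_time=1.0)
large = stretch(arrival_time=0.0, completion_time=10.0, service_time=5.0)
```

In this example the small object has the larger stretch even though it finishes earlier, which is why stretch is a more informative metric than raw response time when data items differ in size.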

However, most studies on on-demand data broadcasting focus on the design of scheduling algorithms [1], [3] and only a few of them consider the employment of data indexing in on-demand data broadcasting environments [18]. Fig. 2a and Fig. 2b show examples where a mobile client issues a data request at time t on broadcast programs without and with data indexing, respectively. In Fig. 2a and Fig. 2b, the time periods marked “A” and “D” indicate the time periods where the mobile device is in active and doze mode, respectively. Since the sizes of index items are much smaller than those of data items, employing data indexing is able to greatly reduce the average tuning time at the cost of a slight increase in the average access time.

In [18], Lee et al. proposed a data indexing method in an on-demand data broadcasting environment. As shown in Fig. 3, the proposed broadcast program is partitioned into a series of buckets and each bucket contains an index segment and a data segment. The number of the index items in an index segment is equal to the number of data items in the corresponding data segment in the same bucket. In bucket $B_k$, the $i$th index item (i.e., $I_k(i)$) contains 1) the identifier and the version number of the corresponding data item in bucket $B_k$ (i.e., $D_k(i)$), 2) the time offset that $D_k(i)$ will be broadcast, and 3) the size of $D_k(i)$. The number of index items within an index segment is called the degree of the broadcast program. In [18], the degree of all buckets is fixed, and the experimental results suggest setting the degree of broadcast programs to two for better performance.

2.2.2 Prior Work Related to Transcoding Proxy

Han et al. proposed in [13] an image transcoding proxy which is able to control the data retrieval time to meet users’ requirements. The proposed transcoding proxy can adaptively adjust the sizes of the objects transmitted to users by using an aggressive lossy compression method. They also presented an analytical framework for determining whether to transcode and how much to transcode an image and a process used by the transcoding proxy to adapt its image coding to meet an upper bound on the delay tolerated by the end user.

In [7], Cardellini et al. analyzed how network proxies can work collaboratively in content transcoding and caching. They proposed a distributed algorithm to distribute the computation load caused by transcoding throughout a collaborative proxy system. They also proposed two extended strategies to cache data objects. In [8], Chang and Chen explored the aggregate effect when caching multiple versions of the same Web object in the transcoding proxy. They argued that the aggregate profit of caching multiple versions of an object is not simply equal to the sum of the profits of caching individual versions, but rather, depends on the transcoding relationships among them. They devised the notion of a weighted transcoding graph and formulated a generalized profit function. Based on the weighted transcoding graph and the generalized profit function, an innovative cache replacement algorithm for transcoding proxies was proposed, and the proposed cache replacement algorithm was shown to perform well in terms of the delay saving ratios and cache hit ratios.

Hsiao et al. proposed the architecture of versatile transcoding proxy in [14]. Based on the concept of the agent system, the proposed architecture can accept and execute the transcoding preference script provided by the client or the server to transform the corresponding data or protocol according to the user’s specification. Fine granularity control is achieved by building a weighted transcoding graph which depicts the transcoding relationship among transcodable versions dynamically. Based on the weighted transcoding graph, the transcoding proxy performs cache replacement according to the content in the caching candidate set, which is generated by the concept of dynamic programming.

In the early study [15] of this paper, we proposed a QoS-aware transcoding proxy architecture to use on-demand broadcast to transmit the requested data objects. However, the issue of energy conservation is not considered. Therefore, for energy conservation, we extend in this paper the prior architecture to support data indexing techniques. In addition, we also revise the version decision policy and the service admission control scheme proposed in [15] for better performance.

Fig. 2. Employment of data indexing. (a) Without data indexing. (b) With data indexing.

2.3 System Architecture

Fig. 4 shows the proposed architecture of QETP. In a cellular environment, the whole service area of a mobile environment is divided into a number of cells. Two dedicated channels, one control channel and one broadcast channel, are provided in each cell. A control channel is used to transmit control messages such as registration messages, data requests, acknowledgments, and so on. On the other hand, a broadcast channel is used by the transcoding proxy to disseminate data objects to its clients. According to the locations of these components, QETP comprises the following two types of components: front-end and back-end.

A front-end, which comprises a service manager and a scheduler, is allocated to each cell. These two components are described below.

. Service Manager: A service manager is in charge of all service-related operations such as service registration, service termination, service admission control, and so on. Each service manager owns a profile database storing the users’ profiles and the profiles of these users’ devices.

. Scheduler: A scheduler is a software component which handles the data requests of the corresponding cell. After receiving a data request, the scheduler will first determine a suitable version for this data request according to the user’s device profile and the network state. Then, the scheduler will check whether the received data request can be merged to an existing data request in the data request queue. Different from the traditional on-demand broadcasting architecture described in Section 2.1, request merge occurs only when there exists another data request in the data request queue asking for the same version of the same required data object of the received data request. Otherwise, the scheduler will insert the received data request into the data request queue.

In addition, a scheduling algorithm is employed to determine the service order of the data requests in the data request queue. While serving a data request, the scheduler will send this request to the cache manager and the cache manager will respond with the content of the required data object. The scheduler then broadcasts the required data object via the broadcast channel and serves the next data request in the data request queue. Moreover, the scheduler will broadcast index items through the broadcast channel to reduce the power consumption of mobile clients.

A back-end, which comprises a cache manager and a transcoder, behaves like a traditional transcoding proxy. These two components are described below.

. Cache Manager: After receiving a data request from a scheduler, the cache manager is responsible for returning the required version of the required data object to the scheduler. Suppose that the cache manager receives a data request of the $j$th version of data object $D_i$. If the $j$th version of $D_i$ is cached, the cache manager will return the cached data object to the scheduler immediately. If the $j$th version of $D_i$ is not cached, the cache manager will check whether there exists another version of $D_i$ which can be transcoded into the $j$th version of $D_i$. If yes, the cache manager will ask the transcoder to generate the $j$th version of $D_i$. Otherwise, the cache manager will request the original version of the requested data object from the data server, ask the transcoder to transform the returned data object into the required version, and then transmit the result of transcoding to the scheduler.

. Transcoder: A transcoder is in charge of the transformation of data objects among different versions according to the received transformation requests generated by the cache manager.
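The three cases handled by the cache manager (exact hit, transcodable cached version, fetch from the data server) can be sketched as below. This is only an illustration under stated assumptions: the function `fetch_version` and the callbacks `can_transcode`, `transcode`, and `fetch_original` are hypothetical names, and the sketch does not model cache insertion or replacement, which the text above does not specify.

```python
def fetch_version(obj_id, version, cache, can_transcode, transcode, fetch_original):
    """Return the requested version of a data object (illustrative sketch).

    cache          -- dict mapping (object id, version) -> data
    can_transcode  -- callable(src_version, dst_version) -> bool
    transcode      -- callable(data, src_version, dst_version) -> data
    fetch_original -- callable(obj_id) -> (data, original_version)
    """
    # Case 1: the requested version is cached; return it immediately.
    key = (obj_id, version)
    if key in cache:
        return cache[key]

    # Case 2: another cached version of the same object can be transcoded.
    for (cached_id, cached_version), data in cache.items():
        if cached_id == obj_id and can_transcode(cached_version, version):
            return transcode(data, cached_version, version)

    # Case 3: request the original version from the data server and transcode it.
    original, original_version = fetch_original(obj_id)
    return transcode(original, original_version, version)
```

Passing the transcoder and data server as callbacks keeps the sketch independent of any particular transcoding relationship or server protocol.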

Since the design of the back-end is similar to the systems proposed in some prior works [7], [8], [13], [25], we focus in this paper on the design of the front-end.

3 ANALYTICAL AND TRANSCODING MODELS

3.1 Analytical Model

In this subsection, we derive the worst case of the average access time¹ of QETP and use the derived results to propose a version decision policy and a service admission control scheme in Section 4. To facilitate the following discussion, we first make the following assumptions:

1. The employed scheduling scheme of the scheduler is FCFS (standing for first come, first served).

2. No request merge occurs in the data request queue of the scheduler.

3. One transmission of a data object in the broadcast channel is received by exactly one client.

4. The messages of registration, deregistration, and handoff are negligible.

1. In this paper, we use access time and waiting time interchangeably.

Fig. 4. The architecture of QETP.

Assumptions 2 and 3 occur when the users’ interests are highly diverse and, hence, the effect of on-demand broadcast diminishes. We make these two assumptions since we focus on the worst case of the transcoding proxy. Assumption 4 is made since we focus on the situation where the number of data requests is much higher than the number of control messages (i.e., registration, deregistration, handoff, and service termination). These assumptions will be relaxed in our simulation model. For better readability, a list of used symbols is shown in Table 1.

We model QETP as a queuing network as shown in Fig. 5. Queue 2 is a physical queue which is located in the scheduler. On the contrary, Queue 1 and Queue 3 are logical queues which are only used to model the control and broadcast channels in order to derive the average waiting time of a data request on the control and broadcast channels, respectively. Suppose that the data requests submitted by a mobile user $i$ follow a Poisson process with rate $\lambda_i$ and $N_{User}$ is the number of mobile users in the cell. To facilitate the following discussion, we number the mobile users in the cell as users $1, 2, \ldots, N_{User}$. Due to the characteristic of the Poisson process, the aggregate data requests of all mobile users in the cell follow a Poisson process with rate $\lambda_{Ctrl.} = \sum_{i=1}^{N_{User}} \lambda_i$. Denote the sizes of data requests and request acknowledgments as $s_{Ctrl.}$ and $s_{Ack.}$, respectively. Also, let $B_{Ctrl.}$ be the bandwidth of the control channel and let the waiting time of the control channel for a data request (denoted as $W_{Ctrl.}$) be the time interval between the user sending a data request and the user receiving the acknowledgment. Then, we have the following lemma:

Lemma 1. The average waiting time of the control channel is

$$W_{Ctrl.} = \frac{1}{\frac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}} - \lambda_{Ctrl.}}.$$

Proof. Similar to [19], we assume that the average waiting time to transmit a data request and a request acknowledgment by the control channel is an exponential distribution with mean $1/\mu_{Ctrl.}$. Hence, the control channel can be modeled as an M/M/1 queue. Then, the average service rate of the control channel is

$$\mu_{Ctrl.} = \frac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}}.$$

Omitting the equation manipulation which can be found in [12], the approximated average waiting time for each mobile device from submitting a data request to receiving the corresponding request acknowledgment is

$$W_{Ctrl.} = \frac{1}{\mu_{Ctrl.} - \lambda_{Ctrl.}} = \frac{1}{\frac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}} - \lambda_{Ctrl.}}. \qquad (1)$$

□

Let the waiting time of the scheduler for a data request (denoted as $W_{Sche.}$) be, from the scheduler’s perspective, the time interval from the arrival of the data request to the time that the requested data object has been obtained. Since the service time of a cache manager is affected by several factors, such as the cache status of the required data objects, the employed replacement scheme, the characteristics of the input jobs, and so on, the service time of the cache manager cannot be modeled by a particular mathematical distribution. Therefore, we model the average service time of the cache manager as an arbitrary distribution with mean $1/\mu_{Sche.}$ and variance $\sigma^2_{Sche.}$. Let $\rho_{Sche.} = \lambda_{Ctrl.}/\mu_{Sche.}$ be the load of the scheduler. We then have the following lemma:

TABLE 1 Description of Symbols


Lemma 2. The average waiting time of the scheduler is

$$W_{Sche.} = \frac{1}{\mu_{Sche.}} + \frac{\frac{\rho_{Sche.}}{\mu_{Sche.}} + \lambda_{Ctrl.}\,\sigma^2_{Sche.}}{2(1 - \rho_{Sche.})}.$$

Proof. With Assumptions 1 and 2 and the characteristic of M/M/1 queues, the input process seen by the data request queue of the scheduler is also a Poisson process with rate $\lambda_{Ctrl.}$. When receiving a data request, the scheduler determines the most suitable version of the requested data object according to the profile of the mobile device and network status and then inserts the corresponding job (including data object id and the most suitable version number) into the data request queue. To serve a data request, the scheduler passes the job to the cache manager and the cache manager will retrieve the specified version of the data object requested by the job and return the retrieved data object to the scheduler. Then, the scheduler disseminates the returned data object to its clients via the broadcast channel.

With Assumption 2, the processing of the scheduler can be modeled as an M/G/1 queue. Then, as shown in [12], the expected system size in steady-state is

$$L_{Sche.} = \rho_{Sche.} + \frac{\rho^2_{Sche.} + \lambda^2_{Ctrl.}\,\sigma^2_{Sche.}}{2(1 - \rho_{Sche.})}.$$

By Little’s formula, the average waiting time of this queue is

$$W_{Sche.} = \frac{L_{Sche.}}{\lambda_{Ctrl.}} = \frac{1}{\mu_{Sche.}} + \frac{\frac{\rho_{Sche.}}{\mu_{Sche.}} + \lambda_{Ctrl.}\,\sigma^2_{Sche.}}{2(1 - \rho_{Sche.})}. \qquad (2)$$

□

Let the waiting time of the broadcast channel for a data request be the time interval from the time that the requested data object has been obtained by the scheduler to the time that the user has received it. Then, we have the following lemma:

Lemma 3. The average waiting time of the broadcast channel is

$$W_{BCast} = \frac{1}{\mu_{BCast}(1 - r_0)},$$

where $r_0$ is the root of $z = A^*[\mu_{BCast}(1 - z)]$ with value larger than zero and less than one.

Proof. Similar to the proof of Lemma 1, we assume that the average waiting time of the broadcast channel follows an exponential distribution with mean $1/\mu_{BCast}$. Since the broadcast channel is a dedicated downlink channel, similar to [19], we have

$$\frac{1}{\mu_{BCast}} = \frac{\text{Average size of the incoming data objects}}{B_{BCast}}. \qquad (3)$$

As shown in Fig. 5, the input process of the broadcast channel is the output process of the scheduler. Since the service time of the scheduler (i.e., Queue 2) is an arbitrary distribution, the output process of the scheduler does not follow a particular mathematical distribution. Suppose that the interarrival time of the input process follows an arbitrary distribution with cumulative distribution function $A(t)$. The broadcast channel can be modeled as a G/M/1 queue. Let $A^*(z)$ be the Laplace-Stieltjes transform of $A(t)$. Omitting the mathematical manipulation which can be found in [12], the average waiting time of the broadcast channel (denoted as $W_{BCast}$) is

$$W_{BCast} = \frac{1}{\mu_{BCast}(1 - r_0)}, \qquad (4)$$

where $r_0$ is the root of the following equation with value larger than zero and less than one:

$$z = A^*[\mu_{BCast}(1 - z)]. \qquad (5)$$

□

Finally, the average waiting time of the whole system (denoted as $W_{Sys.}$) is equal to the summation of the average waiting time of the control channel, the scheduler, and the broadcast channel. Then, with Lemmas 1, 2, and 3, $W_{Sys.}$ can be formulated as

$$W_{Sys.} = W_{Ctrl.} + W_{Sche.} + W_{BCast}. \qquad (6)$$
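As a numerical companion to Lemmas 1, 2, and 3, the sketch below evaluates the three waiting times and their sum. It is an illustration under assumptions, not the authors' code: the function names are hypothetical, the caller must supply the Laplace-Stieltjes transform of the interarrival distribution as a Python callable, and the root of the G/M/1 equation is found by simple fixed-point iteration starting from zero (a numerical choice made here, not something prescribed by the text).

```python
def w_ctrl(b_ctrl, s_ctrl, s_ack, lam_ctrl):
    """Lemma 1 (M/M/1): average waiting time of the control channel."""
    mu_ctrl = b_ctrl / (s_ctrl + s_ack)
    if lam_ctrl >= mu_ctrl:
        raise ValueError("control channel is overloaded (load >= 1)")
    return 1.0 / (mu_ctrl - lam_ctrl)


def w_sche(mu_sche, var_sche, lam_ctrl):
    """Lemma 2 (M/G/1): average waiting time of the scheduler."""
    rho = lam_ctrl / mu_sche
    if rho >= 1.0:
        raise ValueError("scheduler is overloaded (load >= 1)")
    return 1.0 / mu_sche + (rho / mu_sche + lam_ctrl * var_sche) / (2.0 * (1.0 - rho))


def gm1_root(lst, mu, tol=1e-12, max_iter=100000):
    """Find r0 in (0, 1) with r0 = lst(mu * (1 - r0)) by fixed-point iteration."""
    z = 0.0
    for _ in range(max_iter):
        z_next = lst(mu * (1.0 - z))
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


def w_bcast(lst, mu_bcast):
    """Lemma 3 (G/M/1): average waiting time of the broadcast channel."""
    r0 = gm1_root(lst, mu_bcast)
    if r0 >= 1.0 - 1e-9:
        raise ValueError("broadcast channel is overloaded (no root below one)")
    return 1.0 / (mu_bcast * (1.0 - r0))


def w_sys(w_control, w_scheduler, w_broadcast):
    """Equation (6): total average waiting time."""
    return w_control + w_scheduler + w_broadcast
```

A quick sanity check: when interarrival times are exponential with rate $\lambda$, the transform is $\lambda/(\lambda + s)$, the root is $r_0 = \lambda/\mu_{BCast}$, and Lemma 3 reduces to the M/M/1 result $1/(\mu_{BCast} - \lambda)$. Likewise, Lemma 2 reduces to $1/(\mu_{Sche.} - \lambda_{Ctrl.})$ when the service-time variance equals $1/\mu^2_{Sche.}$.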

3.2 Transcoding Model

Suppose that the mobile devices are classified into several categories based on their capabilities, and the capabilities of each category are described by one device profile. Let $P_i$ be the $i$th device profile. Without loss of generality, we order the device profiles according to their capabilities in ascending order. That is, the capability of $P_i$ is better than that of $P_j$ when $i > j$. We also let $D_i(j)$ be the $j$th version of data object $D_i$. Again, we order all versions of a data object according to their quality in ascending order, which means that the quality of $D_i(j)$ is better than that of $D_i(k)$ when $j > k$. For each data object, we assume that the data size of a version with higher quality is larger than that of another version with lower quality.

To facilitate the following discussion, the concept of viewable version set is defined below:

Definition 1. A viewable version set of a device profile $P_i$ and a data object $D_j$ (denoted as $VVS(i, j)$) is a set of versions of $D_j$ which are able to be displayed by mobile devices with profile $P_i$.

Then, we have the following example:

Example 1. Consider the example shown in Fig. 6. Mobile devices are classified into three categories: notebook, PDA, and smart phone, and their capabilities are described in device profiles $P_3$, $P_2$, and $P_1$, respectively. In addition, there are six versions of data object $D_j$. $VVS(3, j)$, $VVS(2, j)$, and $VVS(1, j)$ are {3, 4, 5, 6}, {3, 4}, and {1, 2}, respectively. We have $VVS(2, j) \subset VVS(3, j)$ since devices with profile $P_3$ (e.g., notebooks) are capable of displaying all versions of $D_j$ viewable by devices with profile $P_2$ (e.g., PDAs). On the other hand, we have $VVS(3, j) \cap VVS(1, j) = \emptyset$ and $VVS(2, j) \cap VVS(1, j) = \emptyset$ since devices with profile $P_1$ (e.g., smart phones) employ special data formats (e.g., WML and WBMP) that are not supported by devices with profiles $P_2$ and $P_3$.

Let the function $BEST(i, j) = k$ (respectively, $WORST(i, j) = k$) represent that the best (respectively, worst) viewable version of data object $D_j$ for a mobile device with device profile $P_i$ is version $k$. In practice, we have $BEST(i, j) \ge BEST(l, j)$ and $WORST(i, j) \ge WORST(l, j)$ when $i > l$. We also have $BEST(i, j) = \max\{VVS(i, j)\}$ and $WORST(i, j) = \min\{VVS(i, j)\}$.

Example 2. Consider the example shown in Fig. 6. The best viewable versions of $P_3$, $P_2$, and $P_1$ are $D_j(6)$, $D_j(4)$, and $D_j(2)$, respectively. As a result, we have $BEST(3, j) = 6$, $BEST(2, j) = 4$, and $BEST(1, j) = 2$. In addition, we also have $WORST(3, j) = 3$, $WORST(2, j) = 3$, and $WORST(1, j) = 1$.

When a user registers the service, the user’s mobile device will transmit the identifications of the user and the corresponding device profile to the server. Suppose that the device profile of the mobile device is $P_i$. Then, when the mobile user requests $D_j$, the server will return a suitable version of $D_j$, say, the $k$th version of $D_j$, where $k \in VVS(i, j)$, according to the result of the underlying version decision policy.
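The viewable version sets of Example 1 and the values of Example 2 can be reproduced with a few lines of code. The dictionary layout and the function names `best` and `worst` are illustrative choices, not part of the paper.

```python
# Viewable version sets for one data object D_j, keyed by device profile
# index i (values taken from Example 1: notebook = 3, PDA = 2, smart phone = 1).
VVS = {
    3: {3, 4, 5, 6},
    2: {3, 4},
    1: {1, 2},
}


def best(profile):
    """BEST(i, j): the highest-quality version viewable by profile i."""
    return max(VVS[profile])


def worst(profile):
    """WORST(i, j): the lowest-quality version viewable by profile i."""
    return min(VVS[profile])
```

A version decision policy then only needs to pick some version between `worst(i)` and `best(i)` that also belongs to `VVS[i]`.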

4 DESIGN OF SCHEME ODB-QoS-INDEX

An overview of scheme ODB-QoS-Index is given in Section 4.1. The proposed version decision policy and admission control scheme are described in Section 4.2 and Section 4.3, respectively. Finally, the description of the proposed data indexing method is given in Section 4.4.

4.1 Overview

In this paper, we take the average waiting time of the system as the QoS metric. Before executing scheme ODB-QoS-Index, system administrators should specify a QoS requirement by setting two thresholds of average access time, $W_1$ and $W_2$, where $W_1 < W_2$. The meanings of these two thresholds are as follows: The users are guaranteed to receive the best viewable versions of the requested data objects when the average waiting time is smaller than $W_1$. On the other hand, scheme ODB-QoS-Index will try its best to prevent the average waiting time from being larger than $W_2$.

Scheme ODB-QoS-Index is an online, iterative, and adaptive algorithm which comprises a version decision policy, a service admission control scheme, and a data indexing method. The flowchart of scheme ODB-QoS-Index is shown in Fig. 7. Scheme ODB-QoS-Index is executed periodically, and the following three steps are executed in each iteration. First, in the average waiting time estimation step, scheme ODB-QoS-Index measures the average waiting time of each queue according to the analytical results derived in Section 3. Since only Queue 2 is physical, only the average waiting time of Queue 2 (i.e., $W_{Sche.}$) can be directly observed. In view of this, we propose an approximation algorithm to estimate the average waiting times of Queue 1 and Queue 3 (i.e., $W_{Ctrl.}$ and $W_{BCast}$). For better readability, the proposed approximation algorithm is described in Appendix A. Then, scheme ODB-QoS-Index measures the load of each queue based on the estimated average waiting time and determines the current state of each queue according to the load of each queue. Finally, scheme ODB-QoS-Index configures the version decision policy and the service admission control scheme according to the state of each queue. In addition, a data indexing method is employed by the scheduler to insert index items into the broadcast program to reduce power consumption of mobile clients. The details of scheme ODB-QoS-Index are described in the following subsections.

4.2 Version Decision Policy

4.2.1 Overview

Fig. 7. The flowchart of scheme ODB-QoS-Index.

Fig. 8. The relationship between load and average access time of a queue.

Fig. 8 shows the relationship between the average waiting time and the load of a queue. It is intuitive that, when the load is larger than or equal to one, the system is not stable since the average waiting time does not converge and will approach infinity. In addition, when the load is smaller than one, the average waiting time increases as the load increases, and the increment will increase drastically when the load approaches one.
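To make the shape of the curve in Fig. 8 concrete, the following minimal sketch assumes an M/M/1 queue. This model is an assumption made here purely for illustration; the queues analyzed in Section 3 need not be M/M/1. Under that assumption, the mean time a request waits in the queue is $\rho / (\mu (1 - \rho))$, where $\rho$ is the load and $\mu$ the service rate. The function name and the default service rate are hypothetical.

```python
def mm1_waiting_time(load: float, service_rate: float = 1.0) -> float:
    """Mean queueing delay of an M/M/1 queue (assumed model, not taken from the paper)."""
    if load < 0:
        raise ValueError("load must be non-negative")
    if load >= 1.0:
        # The queue is unstable: the average waiting time does not converge.
        return float("inf")
    return load / (service_rate * (1.0 - load))


if __name__ == "__main__":
    # The waiting time grows slowly at first and then sharply as the load nears one.
    for load in (0.5, 0.8, 0.9, 0.99):
        print(f"load={load:.2f}  waiting time={mm1_waiting_time(load):.1f}")
```

Even in this simple model, raising the load from 0.8 to 0.9 adds more waiting time than raising it from 0.5 to 0.8, which is why the scheme tries to keep each queue's load below a threshold.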

With the above observations, the rationale of our scheduling algorithm is to keep the system loads of the scheduler (i.e., Queue 2 in Fig. 5) and the broadcast channel (i.e., Queue 3 in Fig. 5) smaller than the predetermined thresholds at the cost of degrading the quality of requested data objects. As a consequence, when the load of the scheduler or the load of the broadcast channel is high, for each data request, the system will return the version of quality worse than the best viewable version. This strategy has the following two effects:

1. Decrease the average waiting time of the broadcast channel $(1/\mu_{BCast})$. Since the data size of a data object with lower quality is usually smaller than that of the same data object with higher quality, transmitting data objects with lower quality is able to reduce the load of the broadcast channel $(\rho_{BCast})$.

2. Increase the occurrence probability of request merge. Consider the device profiles shown in Fig. 6 and two data requests of $D_j$ for device profiles $P_2$ and $P_3$, respectively. These two data requests will not be merged together when the load of the scheduler or the broadcast channel is light since the system will return the best viewable versions of $D_j$ for $P_2$ and $P_3$, respectively. When the load is heavy, the system decides to return the third version of $D_j$. Hence, these two data requests can be merged together, and the arrival rates of the input processes of the cache and the broadcast channel decrease. As a result, this strategy is able to reduce the load of the cache $(\rho_{Sche.})$ and the broadcast channel $(\rho_{BCast})$.

The proposed version decision policy consists of three phases: the state determination phase, the candidate version selection phase, and the version decision phase. First, in the state determination phase, the server determines the states of the scheduler and the broadcast channel according to the loads of the scheduler and the broadcast channel. Then, in the candidate version selection phase, several versions, called candidate versions, are selected according to the states of the scheduler and the broadcast channel. Finally, the server decides the resultant version from the candidate versions according to the content of the request queue and the objects stored in the cache.

4.2.2 State Determination Phase

Two thresholds, $\rho^{Sche.}_1$ and $\rho^{Sche.}_2$ (respectively, $\rho^{BCast}_1$ and $\rho^{BCast}_2$), are specified to divide the load of the scheduler (respectively, the broadcast channel) into three states: LIGHT, FAIR, and HEAVY. Fig. 9 shows the state transition diagram of the scheduler. The state transition scenarios are as follows: When the previous state is LIGHT, the current state will transit to FAIR if $\rho_{Sche.} > (1 + \delta)\,\rho^{Sche.}_1$. Otherwise, the current state will still be LIGHT. When the previous state is FAIR, the current state will transit to LIGHT when $\rho_{Sche.} < (1 - \delta)\,\rho^{Sche.}_1$ and transit to HEAVY when $\rho_{Sche.} > (1 + \delta)\,\rho^{Sche.}_2$. Otherwise, the current state will still be FAIR. When the previous state is HEAVY, the current state will transit to FAIR if $\rho_{Sche.} < (1 - \delta)\,\rho^{Sche.}_2$. Otherwise, the current state will still be HEAVY. The factor $\delta$, where $0 < \delta < 1$, is used to avoid state oscillation. We assume that $(1 + \delta)\,\rho^{Sche.}_2 < 1$ without loss of generality. To facilitate fine-grained control, system administrators can divide the FAIR state into several substates. Suppose that there are $n$ substates of the FAIR state. The interval $(\rho^{Sche.}_1, \rho^{Sche.}_2)$ is then divided into $n$ partitions by $n - 1$ thresholds, $\rho^{Sche.}_{(1)}, \rho^{Sche.}_{(2)}, \ldots, \rho^{Sche.}_{(n-1)}$, where

$$\rho^{Sche.}_{(k)} = \rho^{Sche.}_1 + k \times \frac{\rho^{Sche.}_2 - \rho^{Sche.}_1}{n}.$$

The transition between these substates is similar to that between the LIGHT, FAIR, and HEAVY states. The state transition diagram and transition scenarios of the broadcast channel are as shown in Fig. 9 by substituting $\rho^{BCast}_1$ and $\rho^{BCast}_2$ for $\rho^{Sche.}_1$ and $\rho^{Sche.}_2$, respectively. The determination of the values of $\rho^{Sche.}_1$, $\rho^{Sche.}_2$, $\rho^{BCast}_1$, and $\rho^{BCast}_2$ is described in Appendix B.
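The transition rules of this phase can be sketched as a small state machine. The sketch below is only an illustration: the names `State`, `next_state`, and `substate_thresholds`, as well as the parameter names `thr1`, `thr2`, and `delta` (standing for the two load thresholds and the oscillation-avoidance factor), are hypothetical and do not come from the paper.

```python
from enum import IntEnum


class State(IntEnum):
    LIGHT = 0
    FAIR = 1
    HEAVY = 2


def next_state(prev: State, load: float, thr1: float, thr2: float, delta: float) -> State:
    """Apply one transition of the LIGHT/FAIR/HEAVY diagram.

    A state is left only when the load crosses a threshold by the margin
    delta, which is what prevents oscillation around a threshold.
    """
    if prev == State.LIGHT:
        return State.FAIR if load > (1 + delta) * thr1 else State.LIGHT
    if prev == State.FAIR:
        if load < (1 - delta) * thr1:
            return State.LIGHT
        if load > (1 + delta) * thr2:
            return State.HEAVY
        return State.FAIR
    # prev is HEAVY
    return State.FAIR if load < (1 - delta) * thr2 else State.HEAVY


def substate_thresholds(thr1: float, thr2: float, n: int) -> list:
    """Return the n - 1 interior thresholds that split (thr1, thr2) into n equal partitions."""
    step = (thr2 - thr1) / n
    return [thr1 + k * step for k in range(1, n)]


if __name__ == "__main__":
    # Hypothetical thresholds 0.6 and 0.8 with a 5 percent margin.
    state = State.LIGHT
    for load in (0.62, 0.64, 0.58, 0.85, 0.77, 0.75):
        state = next_state(state, load, thr1=0.6, thr2=0.8, delta=0.05)
        print(load, state.name)
```

Note that a load of 0.62 does not leave LIGHT and a load of 0.58 does not leave FAIR in this run, since neither crosses the first threshold by the required margin.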

We also define the aggregate state of the scheduler and the broadcast channel as follows: The aggregate state is LIGHT when the states of the scheduler and the broadcast channel are both LIGHT. The aggregate state is HEAVY when at least one of the scheduler and the broadcast channel is HEAVY. Otherwise, the aggregate state is FAIR. In the FAIR state, the current substate is determined to be the heaviest of the current substates (i.e., the heaviest load) of the scheduler and the broadcast channel. For each incoming data request, the scheduler will decide a suitable version, fill the version information into the data request according to the aggregate state, and insert it into the data request queue. The scheduler will also inform the mobile client of the decided version by replying with an acknowledgment message.
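One compact way to express the aggregate state is to encode every state as an integer level, so that the aggregate is simply the heavier of the two levels. The encoding below (0 for LIGHT, 1 to n for the FAIR substates, n + 1 for HEAVY) and the function names are assumptions introduced for this sketch only.

```python
def aggregate_level(sched_level: int, bcast_level: int, n_substates: int) -> int:
    """Combine the scheduler and broadcast-channel levels into the aggregate level.

    Levels: 0 = LIGHT, 1..n_substates = FAIR substates, n_substates + 1 = HEAVY.
    Taking the maximum yields LIGHT only if both are LIGHT, HEAVY if at least
    one is HEAVY, and otherwise the heaviest FAIR substate.
    """
    heavy = n_substates + 1
    for level in (sched_level, bcast_level):
        if not 0 <= level <= heavy:
            raise ValueError(f"level {level} outside 0..{heavy}")
    return max(sched_level, bcast_level)


def level_name(level: int, n_substates: int) -> str:
    """Human-readable name of a level under the encoding above."""
    if level == 0:
        return "LIGHT"
    if level == n_substates + 1:
        return "HEAVY"
    return f"FAIR({level})"


if __name__ == "__main__":
    n = 3
    print(level_name(aggregate_level(0, 2, n), n))
    print(level_name(aggregate_level(1, 4, n), n))
```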

4.2.3 Candidate Version Selection Phase

Let $\mathit{degradation}$ and $\mathit{maxDegradation}$ indicate the suggested and maximal degrees of degradation, respectively. The value of $\mathit{maxDegradation}$ is determined by

$$\mathit{maxDegradation} = \max_{\forall P_k, D_j} \{\mathit{BEST}(k, j) - \mathit{WORST}(k, j)\}.$$

In the candidate version selection phase, the server will determine a proper value of $\mathit{degradation}$ according to the state of the server, and versions $\mathit{BEST}(k, j), \mathit{BEST}(k, j) - 1, \ldots, \mathit{BEST}(k, j) - \mathit{degradation}$ are called candidate versions. The procedure of the candidate version selection phase is described below:

• Case 1: The aggregate state is LIGHT. The scheduler operates in the traditional on-demand broadcast mode when the aggregate state is LIGHT. Hence, the server guarantees that each client will receive the best viewable versions of the requested data objects. That is, the system will return the $\mathit{BEST}(i, j)$th version of $D_j$ when a user requests $D_j$ by a mobile device belonging to device profile $P_i$. Therefore, the value of $\mathit{degradation}$ is set to zero.

• Case 2: The aggregate state is FAIR. In the FAIR state, the quality of the received data objects may be degraded. Suppose that the FAIR state consists of $n$ substates. Then, the value of $\mathit{degradation}$ is set to $\left\lceil \frac{k \times \mathit{maxDegradation}}{n + 1} \right\rceil$ when the server is in the $k$th substate of the FAIR state.

• Case 3: The aggregate state is HEAVY. When the aggregate state is HEAVY, the server will suggest returning the $\mathit{WORST}(i, j)$th version of $D_j$ when a user requests $D_j$ by a mobile device belonging to device profile $P_i$. Therefore, the value of $\mathit{degradation}$ is set to $\mathit{maxDegradation}$.

4.2.4 Version Decision Phase
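Before a version can be picked, the value of degradation from the candidate version selection phase (Section 4.2.3) must be known. The sketch below computes it for the three cases. The integer level encoding (0 for LIGHT, 1 to n for FAIR substates, n + 1 for HEAVY), the dictionary layout, and the BEST and WORST values in the usage example are all hypothetical and serve only as an illustration.

```python
import math


def max_degradation(best: dict, worst: dict) -> int:
    """Largest BEST - WORST gap over all (profile, object) pairs."""
    return max(best[key] - worst[key] for key in best)


def degradation(level: int, n_substates: int, max_deg: int) -> int:
    """Suggested degradation for an aggregate level.

    level: 0 = LIGHT, 1..n_substates = FAIR substate, n_substates + 1 = HEAVY.
    """
    if level == 0:                      # Case 1: best viewable version
        return 0
    if level == n_substates + 1:        # Case 3: worst viewable version
        return max_deg
    # Case 2: k-th FAIR substate
    return math.ceil(level * max_deg / (n_substates + 1))


if __name__ == "__main__":
    # Hypothetical BEST/WORST values keyed by (profile, object).
    best = {(1, "j"): 3, (2, "j"): 6, (3, "j"): 8}
    worst = {(1, "j"): 1, (2, "j"): 3, (3, "j"): 3}
    max_deg = max_degradation(best, worst)
    for level in range(0, 5):
        print(level, degradation(level, n_substates=3, max_deg=max_deg))
```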

In this phase, the server should pick a proper one from the candidate versions, i.e., $\mathit{BEST}(i, j), \mathit{BEST}(i, j) - 1, \ldots, \mathit{BEST}(i, j) - \mathit{degradation}$. Suppose that the incoming request is for $D_j$. The steps of the decision are as follows:

• Step 1: In this step, the server checks the data requests in the request queue. If, in the request queue, there is a data request for $D_j$, say $Req$, with version $v$, $\mathit{BEST}(i, j) \geq v \geq \mathit{BEST}(i, j) - \mathit{degradation}$, version $v$ is selected since this incoming request can be merged into $Req$ without increasing the load of the server. The server will perform Step 2 if there is no such data request in the request queue.

• Step 2: In this step, the server checks the objects stored in the cache. If there is an object $D_j(v)$, $\mathit{BEST}(i, j) \geq v \geq \mathit{BEST}(i, j) - \mathit{degradation}$, stored in the cache, version $v$ is selected so that the server need neither retrieve $D_j$ from its data server nor perform transcoding. The server will perform Step 3 if there is no such object in the cache.

• Step 3: Select the version $v$ which is covered by the most profiles among versions $\mathit{BEST}(i, j), \mathit{BEST}(i, j) - 1, \ldots, \mathit{BEST}(i, j) - \mathit{degradation}$. Although the server load cannot be reduced by this decision, the probability that successive requests can perform request merge will increase.
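The three steps above can be sketched as a single function. Several details are assumptions made for this illustration and are not stated in the text: the candidate range is clamped so that it never falls below the worst viewable version, ties are broken in favor of higher quality, and the number of profiles covering each version is passed in as a precomputed dictionary. All names are hypothetical.

```python
def decide_version(best: int, worst: int, degradation: int,
                   queued_versions: set, cached_versions: set,
                   coverage: dict) -> int:
    """Pick the version returned for one request.

    best, worst:      BEST and WORST viewable versions for this profile/object
    queued_versions:  versions of the same object already in the request queue
    cached_versions:  versions of the same object currently in the cache
    coverage:         version -> number of device profiles that can view it
    """
    lowest = max(worst, best - degradation)      # assumption: stay viewable
    candidates = range(best, lowest - 1, -1)     # highest quality first

    # Step 1: merge into a pending request if one matches.
    for v in candidates:
        if v in queued_versions:
            return v
    # Step 2: reuse a cached version to avoid fetching and transcoding.
    for v in candidates:
        if v in cached_versions:
            return v
    # Step 3: choose the version viewable by the most profiles.
    return max(candidates, key=lambda v: coverage.get(v, 0))


if __name__ == "__main__":
    coverage = {5: 1, 4: 2, 3: 3}                # hypothetical counts
    print(decide_version(5, 2, 2, {3}, {4}, coverage))
    print(decide_version(5, 2, 2, set(), {4}, coverage))
    print(decide_version(5, 2, 2, set(), set(), coverage))
```

Checking the request queue before the cache reflects the order of the steps: a merge adds no load at all, whereas a cache hit still consumes broadcast bandwidth.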

4.3 Service Admission Control Scheme

The proposed service admission control scheme consists of two phases: the state determination phase and the admission control phase. To perform service admission control, the server first determines the state of the control channel in the state determination phase and then determines whether to grant a service registration or a service handoff in the admission control phase. The procedures of these two phases are described in the following subsections.

4.3.1 State Determination Phase

The proposed service admission control scheme is employed in each service manager to determine whether to grant a service registration or a service handoff by considering the number of users in service, the network status, and so on. The rate at which service registrations are blocked is called the service blocking rate (abbreviated as SBR), while the rate at which service handoffs are forced to terminate is called the service dropping rate (abbreviated as SDR). The rationale of our service admission control scheme is to keep the system load of the control channel (i.e., Queue 1 in Fig. 5) smaller than the predetermined thresholds at the cost of increasing SBR and SDR. To achieve this, two thresholds, $\rho^{Ctrl.}_1$ and $\rho^{Ctrl.}_2$, where $\rho^{Ctrl.}_1 < \rho^{Ctrl.}_2 < 1$, are specified to divide the load of the control channel into three states: LIGHT, FAIR, and HEAVY. The state transition diagram and transition scenario of the service manager are shown in Fig. 9 by substituting $\rho^{Ctrl.}_1$ and $\rho^{Ctrl.}_2$ for $\rho^{Sche.}_1$ and $\rho^{Sche.}_2$, respectively. Similarly, the determination of $\rho^{Ctrl.}_1$ and $\rho^{Ctrl.}_2$ is described in Appendix B.

4.3.2 Admission Control Phase

Although the proposed version decision policy can reduce the loads of the scheduler and the broadcast channel, the effect of the proposed version decision policy is limited since it depends on several factors such as the locality of data requests, the cache size, and so on. As a consequence, in addition to the load of the control channel, the service admission control scheme should also take the loads of the scheduler and the broadcast channel into consideration. The procedure in the admission control phase is as below.

When the load of the control channel is HEAVY, the server will block all service registrations and drop all service handoffs in order to relieve the server load. When the load of the control channel is FAIR or LIGHT, the server will determine the values of two probabilities, $\mathit{Prob}_{Block}$ and $\mathit{Prob}_{Drop}$: a service registration will be blocked with probability $\mathit{Prob}_{Block}$, while a service handoff will be dropped with probability $\mathit{Prob}_{Drop}$. It is the system administrators' responsibility to specify how to determine the values of $\mathit{Prob}_{Block}$ and $\mathit{Prob}_{Drop}$. Let $\mathit{curState}_{Ctrl.}$ be the current state of the control channel, and let $\mathit{curState}_{Agg.}$ be the aggregate state of the scheduler and the broadcast channel. Note that SBR should be sacrificed first since mobile users tolerate a blocked service registration more readily than a service handoff that is forced to terminate (i.e., dropped). Therefore, in each combination of $\mathit{curState}_{Ctrl.}$ and $\mathit{curState}_{Agg.}$, $\mathit{Prob}_{Block}$ should be larger than or equal to $\mathit{Prob}_{Drop}$. An example setting for determining $\mathit{Prob}_{Block}$ and $\mathit{Prob}_{Drop}$ in an environment with three substates in the FAIR state is given in Table 2.
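The admission control phase can be sketched as a table lookup followed by a random draw. The table below is hypothetical: its probabilities are invented for illustration, it ignores FAIR substates for brevity, and it is not the setting of Table 2. The function and parameter names are likewise assumptions.

```python
import random

# Hypothetical setting: (control-channel state, aggregate state) -> (prob_block, prob_drop).
# Every entry keeps prob_block >= prob_drop so that SBR is sacrificed before SDR.
EXAMPLE_TABLE = {
    ("LIGHT", "LIGHT"): (0.0, 0.0),
    ("LIGHT", "FAIR"): (0.2, 0.0),
    ("LIGHT", "HEAVY"): (0.6, 0.3),
    ("FAIR", "LIGHT"): (0.2, 0.0),
    ("FAIR", "FAIR"): (0.4, 0.1),
    ("FAIR", "HEAVY"): (0.8, 0.5),
}


def admit(kind: str, ctrl_state: str, agg_state: str,
          table: dict = EXAMPLE_TABLE, rng=random.random) -> bool:
    """Return True if a service registration or handoff is granted."""
    if kind not in ("registration", "handoff"):
        raise ValueError("kind must be 'registration' or 'handoff'")
    if ctrl_state == "HEAVY":
        return False                      # block and drop everything
    prob_block, prob_drop = table[(ctrl_state, agg_state)]
    if prob_block < prob_drop:
        raise ValueError("prob_block must be >= prob_drop")
    reject_prob = prob_block if kind == "registration" else prob_drop
    # rng() is uniform on [0, 1), so the request is rejected with reject_prob.
    return rng() >= reject_prob


if __name__ == "__main__":
    granted = sum(admit("registration", "FAIR", "FAIR") for _ in range(10_000))
    print(f"granted {granted} of 10000 registrations")
```

Passing `rng` as a parameter is a design choice made here so that the decision can be reproduced deterministically in tests.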

Consider the case that the server decides to reject a service registration or a service handoff since the server's load cannot afford it. If the owner of the service registration or the service handoff, say, user $i$, shares the interests of other users using the service, granting this service registration or service handoff will not increase the server load since all of user $i$'s requests are expected to be merged into other users' requests. Hence, to decrease SBR and SDR, the server should grant user $i$'s service registration or service handoff. From the above example, we observe that we can aggressively grant a service registration or a service handoff as long as the owner and other users are of common interest.

To measure the similarity of interest of user $i$ and other users using the service, we define the similarity factor as the probability that a user's request will be merged into another request. When receiving a data request, the server will check whether the data request is merged into another request and update the user's similarity factor stored in the user's profile. The system administrators have to specify a threshold between zero and one so that a service registration or a service handoff will be granted (even if the server cannot afford it) as long as the value of the owner's similarity factor is larger than or equal to this threshold.
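A minimal sketch of how the similarity factor might be maintained follows. The text defines the factor as a probability but does not say how it is estimated; the sketch assumes the simplest estimator, the fraction of a user's past requests that were merged, and assumes a user with no history has a factor of zero. The class and function names are hypothetical.

```python
class UserProfile:
    """Per-user counters used to estimate the similarity factor (assumed estimator)."""

    def __init__(self) -> None:
        self.requests = 0
        self.merged = 0

    def record_request(self, was_merged: bool) -> None:
        """Update the counters after the server has handled one data request."""
        self.requests += 1
        if was_merged:
            self.merged += 1

    @property
    def similarity(self) -> float:
        """Fraction of requests merged so far; 0.0 when there is no history."""
        return self.merged / self.requests if self.requests else 0.0


def grant_despite_load(profile: UserProfile, threshold: float) -> bool:
    """Grant a registration or handoff that would otherwise be rejected."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return profile.similarity >= threshold


if __name__ == "__main__":
    profile = UserProfile()
    for merged in (True, True, False, True):
        profile.record_request(merged)
    print(profile.similarity, grant_despite_load(profile, 0.7))
```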

4.4 Data Indexing

As shown in [18], setting the degree of broadcast programs to a smaller value will make mobile devices meet index segments more quickly, thus reducing energy consumption. However, it is true only in the cases that turning on and turning off WNIs do not consume energy. As pointed out in [24], in reality, turning on and turning off the WNIs consumes some time and energy, and the transition times of a WNI from active mode to doze mode and from doze mode to active mode are both on the order of tens of milliseconds.

Consider two organizations of index and data items shown in Fig. 10. Note that the time periods marked "A" and "D" indicate the time periods where the mobile device is in active and doze mode, respectively, while the time periods marked "F" and "N" indicate the time periods where the mobile device is turning off and turning on the wireless network interfaces (abbreviated as WNIs). Suppose that a mobile device tunes to the broadcast channel at time $t_{Start}$ and finishes the retrieval of the desired data item at time $t_{End}$. As observed in Fig. 10, when the value of degree of broadcast programs decreases, mobile devices will switch back and forth between active and doze modes (i.e., turn on and turn off WNIs) more frequently, and therefore, may consume more energy. As a result, the value of degree of broadcast programs should be set to a proper value to minimize energy consumption of mobile devices.

In view of this, we adopt an adaptive data indexing method [16] which is able to dynamically adjust the degree of broadcast programs according to system workload. The employed data indexing method consists of two phases, the statistics collection phase and the degree adjustment phase, and switches back and forth between these two phases periodically. In the statistics collection phase, the system collects the arrival time, finish time, and other statistical information of each served data request. Then, in the subsequent degree adjustment phase, the server determines a proper value of degree of broadcast programs according to the collected information. In the interest of space, we omit the description of the determination of the value of degree of broadcast programs. Interested readers can refer to [16] for details.

TABLE 2

An Example Setting for Determining $\mathit{Prob}_{Block}$ and $\mathit{Prob}_{Drop}$

Fig. 10. Example organizations of index and data items. (a) Example broadcast program with degree four. (b) Example broadcast program with degree one.

After determining the current value of degree of broadcast programs, the server then generates the broadcast program accordingly. Since the data items are of different sizes, we use the parameter $\mathit{budget}$, which is defined as the maximal length of the data segments of all buckets, to control the length of each bucket. Initially, the bucket is empty and the scheduler fetches as many data items as possible from the cache under the constraint that the summation of the sizes of the fetched data items is smaller than or equal to $\mathit{budget}$. In addition, the scheduler marks the fetched data items as LOCKED. Then, the scheduler inserts the corresponding index items in front of these data items. Finally, the scheduler broadcasts the index and data items in the bucket sequentially. An index item or a data item is removed from the bucket once it has been broadcast. In addition, the state of a data item which has been broadcast is marked as UNLOCKED. The above procedure repeats until the bucket becomes empty. To employ data indexing, the cache replacement policy should also be modified to consider only data items in UNLOCKED states as the candidates to be replaced.
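The bucket construction described above can be sketched as follows. The sketch makes assumptions that the text leaves open: items are considered in the order given (standing in for the scheduler's order), an item that does not fit is skipped rather than ending the fill, and an item is represented as a dictionary with hypothetical keys `id`, `size`, and `locked`.

```python
def fill_bucket(cache_items: list, budget: float) -> list:
    """Fetch unlocked items, in order, while their total size stays within budget."""
    bucket, used = [], 0.0
    for item in cache_items:
        if item["locked"]:
            continue                       # already scheduled for broadcast
        if used + item["size"] <= budget:
            item["locked"] = True          # shield it from cache replacement
            bucket.append(item)
            used += item["size"]
    return bucket


def broadcast_bucket(bucket: list, send) -> None:
    """Broadcast the index items first, then the data items, emptying the bucket."""
    for item in bucket:
        send(("index", item["id"]))
    for item in bucket:
        send(("data", item["id"]))
        item["locked"] = False             # replaceable again once broadcast
    bucket.clear()


if __name__ == "__main__":
    cache = [
        {"id": "a", "size": 40, "locked": False},
        {"id": "b", "size": 70, "locked": False},
        {"id": "c", "size": 30, "locked": False},
    ]
    bucket = fill_bucket(cache, budget=100)
    broadcast_bucket(bucket, send=print)
```

Placing all index items ahead of the data items is what lets a client read the index once and doze until its item, or the next bucket, arrives.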

4.5 Remarks

Currently, the proposed version decision policy and service admission control scheme are designed with the goal of reducing the overall average waiting time and average tuning time. Therefore, if two users submit two data requests (each user submits one request) for the same data object at the same time, their priorities and version numbers will be the same.

It is possible to implement differentiated QoS control in the proposed architecture. For example, we can add a classifier in front of the scheduler to classify the received data requests according to some administrator-specified rules. Hence, the version decision policy is able to assign their version numbers according to their classes. In addition, when processing a service registration or a service handoff, the server first classifies the service according to the user’s profile and then takes action according to the user’s class. Consider the case that the server receives two service registrations. Suppose that one is submitted by a VIP user and the other is submitted by a normal user. The latter will be rejected if the server can accept only one service registration.

5 PERFORMANCE EVALUATION

To evaluate the performance of scheme ODB-QoS-Index, we build an event-driven simulator with SIM [5]. In order to measure the reduction of power consumption of scheme ODB-QoS-Index, we also implement scheme ODB-QoS, which only employs the proposed version decision policy and service admission control scheme. Both scheme ODB-QoS-Index and scheme ODB-QoS are executed periodically with a period of two minutes, and the simulation is run for 12 hours. Scheme CS (standing for traditional Client-Server) and scheme ODB (standing for On-Demand Broadcasting) are also implemented for comparison purposes. The average access time and tuning time are employed as the performance metrics of experiments. In addition, the average value of degradation, SBR, and SDR are taken as the metrics of the cost of scheme ODB-QoS-Index. The average value of degradation is used to measure the degree of quality degradation of the received data objects.

5.1 Simulation Model

We set the cell topology to be a $4 \times 4$ wrapped-around mesh. Scheme AE [8] is employed as the cache replacement policy since it outperforms the other replacement policies for transcoding proxies. Each cell provides one control channel and one download channel with network bandwidth 10 KByte/sec and 100 KByte/sec, respectively. Analogous to [8], we assume that there are 4,000 data objects and the sizes follow a lognormal distribution with a mean of 18 KBytes. The sizes of a control message (e.g., data request message and acknowledgment message) and an index item are both set to be 1 KByte. The access probabilities of data objects are assumed to follow a Zipf distribution, which is widely adopted as a model for real Web traces [6]. The parameter of the Zipf distribution is set to be 1.1 with reference to the analyses of real Web traces [6]. Since small objects are much more frequently accessed than large ones [11], we assume that there is a negative correlation between the object size and its access probability [8]. The default capacity of the cache is set to be $0.01 \times \sum \text{object size}$ and the fetch delays of data objects follow an exponential distribution with a mean of two seconds [8]. The values of $W_1$ and $W_2$ (i.e., the QoS requirement) are set to be six seconds and 15 seconds, respectively.
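To give a feel for how skewed the assumed workload is, the sketch below builds Zipf access probabilities for 4,000 objects with parameter 1.1. It assumes the common form in which the probability of the object of rank $r$ is proportional to $1/r^{\theta}$; the text does not spell out the exact form used in the simulator, and the function name is hypothetical.

```python
def zipf_probabilities(n_objects: int, theta: float) -> list:
    """Access probabilities with p(rank) proportional to 1 / rank**theta (assumed form)."""
    weights = [1.0 / (rank ** theta) for rank in range(1, n_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    probs = zipf_probabilities(4000, 1.1)
    hottest = sum(probs[:40])   # share of requests for the hottest 1 percent of objects
    print(f"hottest 1% of objects receive {hottest:.1%} of requests")
```

A workload this skewed is what makes both caching and request merge effective: many concurrent requests target the same few objects.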

In the client model, as in [7] and [8], we assume that the mobile clients are classified into five device profiles, and the distribution of these five device profiles is modeled as a device vector of $\langle 15\%, 20\%, 30\%, 20\%, 15\% \rangle$. Without loss of generality, we also assume that all objects could be transcoded into 10 versions and the sizes of the 10 versions (from version 1 to version 10) are assumed to be 10 percent, 20 percent, 30 percent, ..., and 100 percent of the original object sizes [8]. The viewable version set of each device profile is shown in Table 4. With reference to [8], we assume that a more detailed version can be transcoded into a less detailed one and the transcoding delay is determined as the quotient of the object size to the transcoding rate. The transcoding rate is set to be 30 KBytes/sec [7]. The number of users in the network is set to be 1,000. The cell residence time, service holding time, and service establishing time for each user follow exponential distributions with means of 40 minutes, 15 minutes, and one hour, respectively. We also assume that the data requests of each user follow a Poisson process with $1/\lambda = 60$ seconds. The values of the parameters used are listed in Table 3 for better readability.
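The size and delay model above can be worked through with a few lines of arithmetic. The sketch below uses the stated rates (30 KBytes/sec transcoding, 100 KByte/sec download channel) and the stated version sizes; the function names are hypothetical, and it assumes that the transcoding delay is computed from the size of the version being transcoded, which the text does not state explicitly.

```python
TRANSCODING_RATE_KB_S = 30.0      # from the simulation model
DOWNLOAD_BANDWIDTH_KB_S = 100.0   # download channel bandwidth per cell


def version_size(original_kb: float, version: int) -> float:
    """Size of version 1..10, i.e., 10%..100% of the original object size."""
    if not 1 <= version <= 10:
        raise ValueError("version must be between 1 and 10")
    return original_kb * version / 10.0


def transcoding_delay(source_kb: float) -> float:
    """Delay in seconds: object size divided by the transcoding rate."""
    return source_kb / TRANSCODING_RATE_KB_S


def transmission_time(size_kb: float) -> float:
    """Seconds needed to broadcast an object of the given size."""
    return size_kb / DOWNLOAD_BANDWIDTH_KB_S


if __name__ == "__main__":
    original = 18.0                                  # mean object size in KBytes
    half = version_size(original, 5)
    print(f"version 5 size: {half} KB")
    print(f"transcoding delay: {transcoding_delay(original):.2f} s")
    print(f"broadcast time, version 10: {transmission_time(original):.2f} s")
    print(f"broadcast time, version 5:  {transmission_time(half):.2f} s")
```

For an object of the mean size, degrading from version 10 to version 5 halves the broadcast time, which illustrates how quality degradation relieves the load of the broadcast channel.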

TABLE 3

TABLE 4

Device Profiles and Viewable Version Sets

5.2 The Effects of Cache Size

In this experiment, we investigate the effect of varying the cache size on the average waiting time, average tuning time, SBR, SDR, and average value of degradation. Fig. 11 shows the experimental results with the cache size varied. The cache size is set to be $\mathit{CacheSizeRatio} \times \sum \text{object size}$. The value of CacheSizeRatio ranges from 0.001 to 0.1. As shown in Fig. 11a, the average waiting time of all schemes decreases as the value of CacheSizeRatio increases. This is because a large cache is able to effectively reduce the average waiting time by storing data objects with high access probabilities.

Consider the average waiting time of scheme ODB and scheme CS. The average waiting time reduction of scheme ODB over scheme CS increases from 30 percent to 60 percent as the value of CacheSizeRatio decreases from 0.1 to 0.001. Since scheme ODB can effectively reduce the number of requests from the cache’s perspective by request merge, the system load of scheme ODB is lighter than that of scheme CS. Hence, scheme ODB outperforms scheme CS especially when the cache size is small (i.e., high system load). Although scheme ODB can minimize average waiting time, the performance of scheme ODB does not satisfy the system administrators’ expectation since the average waiting time is larger than the value of W2.

To fulfill the system administrators' requirement when system load is high, scheme ODB-QoS and scheme ODB-QoS-Index will reduce the quality of the requested data objects and reject some service registrations and service handoffs. Reducing the quality of the requested data objects will increase the probabilities of request merge and, hence, reduce the number of data requests from the cache's perspective. In addition, when the system load is still high, scheme ODB-QoS and scheme ODB-QoS-Index will block service registrations to limit the number of users in service. If blocking service registrations still cannot reduce the average waiting time to the administrators' requirement, the service manager will then reject service handoffs. As shown in Fig. 11a, the average waiting time of scheme ODB-QoS and scheme ODB-QoS-Index is still smaller than $W_2$ as the value of CacheSizeRatio decreases. This result shows that scheme ODB-QoS and scheme ODB-QoS-Index are able to control the average waiting time to satisfy the specified QoS requirement. In addition, since scheme ODB-QoS-Index inserts index items into the broadcast program, the average waiting time of scheme ODB-QoS-Index is longer than that of scheme ODB-QoS. Due to the small size of index items, the increment in average waiting time of scheme ODB-QoS-Index over scheme ODB-QoS is quite small (around 5 percent in this experiment).

Fig. 11b shows the average tuning time of all schemes. Without employing data indexing, the average tuning time and the average waiting time of all schemes except scheme ODB-QoS-Index are the same. In scheme ODB-QoS-Index, when the current bucket does not contain the desired data items, mobile clients can go to doze mode to save power consumption and wake up on the starting point of the next bucket. Therefore, as shown in Fig. 11b, scheme ODB-QoS-Index is able to greatly reduce the tuning time (around 93 percent in this experiment), showing the advantage of data indexing.

Although scheme ODB-QoS and scheme ODB-QoS-Index outperform scheme ODB and scheme CS, scheme ODB-QoS and scheme ODB-QoS-Index incur overhead in SBR, SDR, and the degradation of the quality of received data items. Fig. 11c and Fig. 11d show the degradation of the quality of received data items and the produced SBR and SDR, respectively, of scheme ODB-QoS and scheme ODB-QoS-Index with the value of CacheSizeRatio varied. The SBR and SDR produced by scheme CS and scheme ODB are omitted in this and the following experiments since both schemes always grant service registrations and service handoffs (i.e., both SBR and SDR are always zero).

When the cache size is large enough (i.e., $\mathit{CacheSizeRatio} \geq 0.03$ in this experiment), most hot data items are cached and the average waiting time is under the predetermined threshold. Hence, the average value of degradation is around 0.6 and the quality of the received data items is quite good. In the same condition, SDR is equal to zero and SBR is only a little bit larger than zero. When the cache size becomes small ($\mathit{CacheSizeRatio} = 0.01$ in this experiment), the average value of degradation increases significantly to keep the average waiting time between the predetermined thresholds. When the cache size becomes smaller ($\mathit{CacheSizeRatio} \leq 0.003$ in this experiment), only increasing the value of degradation is not able to effectively relieve the increase of the average waiting time. Hence, the system will block some service registrations to keep the average waiting time under the predetermined threshold. Service registrations are rejected before service handoffs since users tolerate a blocked service registration more readily than a dropped service handoff. When the value of CacheSizeRatio is very small, some service handoffs are dropped since only blocking service registrations is not able to keep the average waiting time under the threshold. With the above mechanisms, scheme ODB-QoS and scheme ODB-QoS-Index are able to keep the average waiting time in the predetermined range.

5.3 The Effects of the Number of Users

Fig. 12 shows the experimental results with the number of users varied. The number of users is varied from 400 to 1,400. From Fig. 12a, we observe that when the number of users is small (400 in this experiment), the system load is light and the average waiting times of all schemes are close. When the number of users increases, the average waiting time of scheme CS and scheme ODB also increases. In addition, the increment of the average waiting time of scheme CS and scheme ODB increases as the number of users increases, especially when the number of users is larger than 1,200. Since a large number of users implies high arrival frequencies of data requests, the system load becomes heavy and the average waiting time increases drastically. In this experiment, when the number of users is 1,400, the average waiting time of scheme CS does not converge as time advances since the system load is larger than one. This situation agrees with the observation in Section 4.2. This experimental result also shows that the average waiting time reduction of scheme ODB over scheme CS increases from 47.11 percent to 74.2 percent as the number of users increases from 400 to 1,400. Scheme ODB is more scalable than scheme CS due to the employment of on-demand data broadcast.

Consider scheme ODB-QoS and scheme ODB-QoS-Index. When the number of users is small (400 in this experiment), scheme ODB, scheme ODB-QoS, and scheme ODB-QoS-Index have similar behavior. This is because, when the average waiting time of scheme ODB-QoS is smaller than $W_1$, scheme ODB-QoS degenerates to scheme ODB and guarantees that each user will receive the best viewable versions of the requested data objects. In addition, although it inserts some index items into the broadcast program, scheme ODB-QoS-Index is still able to perform almost as well as scheme ODB-QoS since the size of index items is much smaller than that of data items. Moreover, as shown in Fig. 12b, employing data indexing is able to greatly reduce the average tuning time. In this experiment, the tuning time reduction of scheme ODB-QoS-Index over scheme ODB-QoS is around 92 percent.

As shown in Fig. 12c, when the number of users increases to 800, the average value of degradation increases in order to keep the average waiting time satisfying the QoS requirement. As shown in Fig. 12d, when the number of users increases to 1,000, the system blocks some service registrations to satisfy the QoS requirement (i.e., in the interval $(W_1, W_2)$). Similarly, some service handoffs are dropped when the number of users is larger than 1,200. By controlling the quality of received data objects and the number of users in service, scheme ODB-QoS and scheme ODB-QoS-Index are able to keep the average waiting time satisfying the QoS requirement even when the offered system load is heavy.

5.4 The Effects of Skewness of Access Probabilities

Fig. 13 shows the experimental results with the skewness of access probabilities varied. The degree of skewness is measured by the value of the Zipf parameter, which is varied from 1 to 1.4 in this experiment. The larger the Zipf parameter is, the higher the degree of skewness is. As shown in Fig. 13a, the average waiting time of all schemes increases as the value of the Zipf parameter decreases. This is because the degree of request locality is high when the access frequencies are highly skewed (i.e., large Zipf parameter). Therefore, with the same cache size, the cache hit ratio is high and is able to effectively reduce the average access time. Moreover, on-demand data broadcasting-based schemes (i.e., scheme ODB, scheme ODB-QoS, and scheme ODB-QoS-Index) outperform scheme CS in average waiting time since they take advantage of the locality of data requests by request merge. We also observe that the increment of the average waiting time of scheme CS and scheme ODB increases drastically when the value of the Zipf parameter decreases (i.e., to one in this experiment). The reason is that the effect of cache and request merge decreases as the degree of skewness decreases. Hence, the system load becomes heavy when the degree of skewness is low and, therefore, the increment of the average waiting time increases. This result conforms to the observation in Section 4.2. In this experiment, the average waiting time reduction of scheme ODB over scheme CS ranges from 36.9 percent to 65 percent. In addition, as shown in Fig. 13b, employing data indexing is able to greatly reduce the average tuning time. In this experiment, the tuning time reduction of scheme ODB-QoS-Index over scheme ODB-QoS is around 90 percent.


As shown in Fig. 13c, the average value of degradation is small when access probabilities are highly skewed. We also observe from Fig. 13d that, when the skewness of access frequencies is high (Zipf parameter = 1.4 in this experiment), scheme ODB-QoS degenerates to scheme ODB since the average waiting time of scheme ODB is smaller than W1.

When the access probabilities are not skewed enough, the system cannot fulfill the QoS requirement and therefore increases the value of degradation. When the Zipf parameter is around 1.2, some service registrations are blocked (SBR > 0) to satisfy the QoS requirement. Moreover, when the Zipf parameter is smaller than 1.1, some service handoffs are also dropped. With the above mechanisms, scheme ODB-QoS and scheme ODB-QoS-Index are able to keep the average waiting time in the predetermined range.

6 CONCLUSION

We explored in this paper the effect of an on-demand broadcasting technique in the design of a QoS-aware and energy-conserving transcoding proxy. We first proposed a QoS-aware and energy-conserving transcoding proxy architecture, QETP, and modeled it as a queuing network. By analyzing the queuing network, several theoretical results were derived to formulate the system average waiting time. We then proposed a version decision policy and a service admission control scheme to provide QoS in QETP. The derived results were used to guide the execution of the proposed version decision policy and service admission control scheme to fulfill the given QoS requirement. In addition, we also proposed a data indexing method to reduce the power consumption of clients. To measure the performance of QETP, several experiments were conducted. Experimental results showed that the proposed scheme is more scalable than traditional client-server systems and can effectively achieve the desired QoS. In addition, the proposed scheme was able to greatly reduce power consumption of clients at the cost of a slight increase in average access time.

APPENDIX

AVERAGE WAITING TIME ESTIMATION

Although the system average waiting time can be formulated by (6) and Lemmas 1, 2, and 3, not all components can be directly obtained in practice since Queue 1 and Queue 3 are logical queues. To overcome this problem, we propose an approximation method for each unavailable parameter to estimate the system average waiting time.

Consider the queuing network shown in Fig. 5. The input process of Queue 1 cannot be directly observed by the transcoding proxy. However, since the control channel (i.e., Queue 1) is an M/M/1 queue, the output process of Queue 1 is identical to the input process² of the corresponding scheduler. Hence, the input process of Queue 1 can be observed by the scheduler, and the average waiting time of the control channel can be obtained by (1). In addition, since the average and variance of the service time of Queue 2 can be observed by the scheduler, the average waiting time of the scheduler can be derived by (2).
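As a rough illustration of how such observed quantities turn into waiting-time estimates, the sketch below uses the textbook M/M/1 mean queueing delay and the Pollaczek-Khinchine formula for an M/G/1 queue. It is an assumption that (1) and (2) take these standard forms and that "waiting time" excludes the service time itself; the function names and the numeric inputs are hypothetical.

```python
# Sketch under stated assumptions: standard M/M/1 and M/G/1 (Pollaczek-Khinchine)
# mean queueing delays. Whether these match (1) and (2) exactly is an assumption.

def mm1_waiting_time(arrival_rate, service_rate):
    """Mean time spent waiting in queue (excluding service) in an M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return arrival_rate / (service_rate * (service_rate - arrival_rate))


def mg1_waiting_time(arrival_rate, mean_service, var_service):
    """Pollaczek-Khinchine mean queueing delay for an M/G/1 queue."""
    utilization = arrival_rate * mean_service
    if utilization >= 1.0:
        raise ValueError("unstable queue: utilization must be below 1")
    second_moment = var_service + mean_service ** 2  # E[S^2] = Var(S) + E[S]^2
    return arrival_rate * second_moment / (2.0 * (1.0 - utilization))


# Hypothetical measurements taken at the scheduler.
observed_arrival_rate = 8.0      # requests per second seen leaving Queue 1
control_service_rate = 10.0      # requests per second on the control channel
scheduler_mean_service = 0.05    # seconds
scheduler_var_service = 0.001    # seconds squared

w_control = mm1_waiting_time(observed_arrival_rate, control_service_rate)
w_scheduler = mg1_waiting_time(observed_arrival_rate,
                               scheduler_mean_service,
                               scheduler_var_service)
print(f"control channel: {w_control:.4f} s, scheduler: {w_scheduler:.4f} s")
```

Only the arrival rate and the first two moments of the service time are needed, which is why observing the average and variance at the scheduler is sufficient for the second estimate.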

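To derive the average waiting time of the broadcast channel (i.e., Queue 3), the cumulative distribution function of the interarrival time of the input process (i.e., $A(t)$) is required. However, deriving the exact $A(t)$ is impractical since $A(t)$ is continuous. Hence, we adopt the following approach to estimate $A(t)$: Consider the $m$th execution of scheme ODB-QoS-Index. The average and the variance of the interarrival time of Queue 3 between the $(m-1)$th and $m$th executions (i.e., $\frac{1}{\lambda_{BCast}}$ and $\sigma^2_{BCast}$, respectively) can be obtained. We then partition the interarrival time into the following $k$ intervals:

$$
\begin{aligned}
I_1 &= \left[\frac{1}{\lambda_{BCast}} - \frac{k}{2}\sigma_{BCast},\ \frac{1}{\lambda_{BCast}} - \frac{k-2}{2}\sigma_{BCast}\right),\\
&\ \,\vdots\\
I_{\frac{k-1}{2}} &= \left[\frac{1}{\lambda_{BCast}} - \frac{3}{2}\sigma_{BCast},\ \frac{1}{\lambda_{BCast}} - \frac{1}{2}\sigma_{BCast}\right),\\
I_{\frac{k+1}{2}} &= \left[\frac{1}{\lambda_{BCast}} - \frac{1}{2}\sigma_{BCast},\ \frac{1}{\lambda_{BCast}} + \frac{1}{2}\sigma_{BCast}\right),\\
I_{\frac{k+3}{2}} &= \left[\frac{1}{\lambda_{BCast}} + \frac{1}{2}\sigma_{BCast},\ \frac{1}{\lambda_{BCast}} + \frac{3}{2}\sigma_{BCast}\right),\\
&\ \,\vdots\\
I_k &= \left[\frac{1}{\lambda_{BCast}} + \frac{k-2}{2}\sigma_{BCast},\ \frac{1}{\lambda_{BCast}} + \frac{k}{2}\sigma_{BCast}\right),
\end{aligned}
$$

where $k$ is a positive odd number and $k > 1$. Note that a larger $k$ yields a more accurate estimate of $A(t)$ but also implies larger memory consumption. We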

Fig. 13. The effects of the Zipf parameters. (a) Average waiting time. (b) Average tuning time. (c) Average degradation. (d) SBR/SDR.