A QoS-Aware and Energy-Conserving Transcoding Proxy Using On-demand Data Broadcasting
Jiun-Long Huang, Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan, ROC. E-mail: [email protected]
Ming-Syan Chen, Fellow, IEEE, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, ROC. E-mail: [email protected]
Abstract
Most research on transcoding proxies in mobile computing environments is based on the traditional client-server architecture and does not employ the data broadcast technique. In addition, the issues of QoS provision and energy conservation are not addressed in the prior studies. In view of this, we design in this paper a QoS-aware and energy-conserving transcoding proxy by utilizing the on-demand broadcasting technique. We first propose a QoS-aware and energy-conserving transcoding proxy architecture, abbreviated as QETP, and model it as a queueing network consisting of three queues. By analyzing the queueing network, three lemmas are derived to estimate the load of these queues. We then propose a version decision policy and a service admission control scheme to provide QoS in QETP. The derived lemmas are used to guide the execution of the proposed version decision policy and service admission control scheme to achieve the given QoS requirement. In addition, we also propose a data indexing method to reduce power consumption of clients. To measure the performance of the proposed architecture, three experiments are conducted. Experimental results show that the average access time reduction of the proposed scheme over traditional client-server architecture ranges from 45% to 75%. Experimental results also show that the proposed scheme is more scalable than traditional client-server architecture and is able to effectively control the system load to attain the given QoS requirements. In addition, the proposed scheme is able to greatly reduce average tuning time of clients at the cost of a slight increase (around 5% in our experiments) in average access time.
Key words: Transcoding proxy, QoS, energy-conservation, data broadcast, on-demand broadcast
1 Introduction
In a pervasive computing environment, due to the constraints resulting from power-limited mobile devices and low-bandwidth wireless networks, designing a power conserving mobile information system with high scalability and high bandwidth utilization becomes an important research issue, and hence attracts a significant amount of research attention. In addition, the high diversity in the capabilities of various mobile devices such as display capabilities (e.g., screen size, color depth and supported data formats) and computation power makes the design of mobile information systems more challenging.
This diversity also results in an increasing demand on the capability of context awareness for mobile information systems.
Content adaptation, which is an important technique to realize context awareness, emerges to remedy the problem resulting from the said diversity by offering different mobile users suitable versions of the same object according to the capabilities of the mobile devices, the traffic of the networks and the users’ preferences [20]. Transcoding, which transforms a data object from one version into another, is recognized as a promising technique to realize content adaptation [20][21][23]. A proxy capable of transcoding (referred to as a transcoding proxy) is placed between a client and an information server to coordinate the mismatch between what the server provides and what the client prefers. Since proxy-based approaches are transparent to the content providers and users, this kind of approach is able to simplify the design of servers and clients, and as a result, attracts much research attention.
In recent years, data broadcast [2][3][29] has been employed as an important technique to design a scalable and power conserving mobile information system. However, most research works in transcoding proxies in mobile computing environments are on the basis of the traditional client-server architecture and do not employ the data broadcast technique. Hence, the transcoding proxies are not scalable and the network bandwidth is not well utilized. In addition, most prior studies do not consider the issue of quality of service (abbreviated as QoS) which is crucial in a mobile computing environment.
In addition, as shown in [26], only a modest improvement (20% to 30%) in battery lifetime is expected in the next few years. Hence, energy conservation has become a key factor in the design of mobile devices. Since data indexing is recognized as a promising means to reduce power consumption [17], many researchers have studied the design of data indexing algorithms in push-based data broadcasting environments [9][22][28][30]. However, most studies on on-demand data broadcasting focus on the design of scheduling algorithms [1][3], and only a few of them consider the employment of data indexing in on-demand data broadcasting environments [18].
In view of this, we design in this paper a scalable, QoS-aware and energy-conserving transcoding proxy by utilizing the on-demand broadcasting technique. Explicitly, we first propose a QoS-aware and energy-conserving transcoding proxy architecture, abbreviated as QETP, and model it as a queueing network with three queues. By analyzing the queueing network, three lemmas are derived to formulate the average waiting time of these queues. We then devise scheme ODB-QoS-Index to provide QoS in QETP where ODB-QoS-Index stands for “On-demand Data Broadcasting with QoS and data Indexing.”
Scheme ODB-QoS-Index is an online, iterative and adaptive algorithm comprising
1. a version decision policy to determine the suitable version for each data request according to the users’ device profiles and the state of the server,
2. a service admission control scheme to determine whether to grant a service registration or a service handoff according to the state of the server, and
3. a data indexing method to insert data indices into the broadcast program to reduce power consumption of clients.
In each iteration, scheme ODB-QoS-Index estimates the average waiting time of each queue based on the derived results, determines the state of each queue according to the corresponding estimation of average waiting time, and configures the behavior of the version decision policy and the service admission control scheme in accordance with the states of these queues to attain the desired QoS. In addition, scheme ODB-QoS-Index inserts index items into the broadcast program to reduce the clients’ power consumption. To measure the performance of QETP, three experiments are conducted. Experimental results show that the average access time reduction of the proposed scheme over traditional client-server architecture ranges from 45% to 75%. Experimental results also show that scheme ODB-QoS-Index is more scalable than traditional client-server architecture, and is able to achieve the system administrators’ QoS requirements by the devised version decision policy and the service admission control scheme. In addition, scheme ODB-QoS-Index is able to greatly reduce average tuning time at the cost of a slight increase (around 5% in our experiments) in average access time. Access time is defined as the summation of time periods from the moment that mobile clients submit data requests to the moment that mobile clients receive the requested data items. On the other hand, tuning time is defined as the summation of time periods that mobile clients operate in active mode. Access time is widely used to evaluate the efficiency of broadcast systems, while tuning time is used to evaluate power consumption of mobile devices. To the best of our knowledge, there is no prior research on the design of transcoding proxies employing data broadcast. This feature distinguishes this paper from others.
Figure 1: An example on-demand broadcasting system (mobile clients such as a notebook, a PDA and a tablet PC send data requests to the data request queue of the mobile information system and receive data objects)
Figure 2: Employment of data indexing. (a) Without data indexing; (b) With data indexing. Each panel shows a broadcast program on a time axis with a data request issued at time t; the periods marked ‘A’ and ‘D’ are active and doze periods.
The rest of this paper is organized as follows. The descriptions of related work and the proposed transcoding proxy architecture, QETP, are given in Section 2. An analytical model and a transcoding model are devised in Section 3. Then, Section 4 describes the proposed version decision policy, service admission control scheme and data indexing method. The performance evaluation is shown in Section 5, and finally, Section 6 concludes this paper.
2 Preliminaries
2.1 On-demand Data Broadcasting
Figure 1 shows an example on-demand broadcasting system. In an on-demand data broadcasting system [1][3][4], a server maintains a data request queue and serves these requests according to the employed scheduling algorithm. When requiring one data item, a mobile client sends a data request to the server.
After receiving a data request, the server first checks whether the data request queue already contains a data request for the same data object. If so, the incoming data request is merged into that data request; this is called request merge. Otherwise, the incoming data request is inserted into the data request queue. Data requests for the same data object can be safely merged since one transmission of the data object in a broadcast channel is able to serve all merged data requests. Therefore, the higher the probability of request merge is, the more efficient the system is.
A scheduling algorithm is used to prioritize all data requests in the data request queue, and the server will serve these data requests according to their priorities. To serve a data request, the system retrieves the required data object from the corresponding data server, and then broadcasts this object to all its clients via a dedicated and shared broadcast channel. As a result, the on-demand broadcast system is more scalable and can obtain higher network utilization than traditional client-server architecture.
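The request-merge behaviour described above can be illustrated with a minimal sketch. It assumes FCFS ordering for simplicity (the text leaves the scheduling algorithm open), and the class and method names are hypothetical, chosen only for this illustration.

```python
from collections import OrderedDict


class OnDemandBroadcastServer:
    """Illustrative FCFS data request queue with request merge."""

    def __init__(self):
        # Maps object id -> set of waiting client ids.
        # Insertion order is the arrival order of the first request per object.
        self.queue = OrderedDict()

    def submit(self, client_id, object_id):
        """Add a data request; return True if it was merged into a pending one."""
        merged = object_id in self.queue
        self.queue.setdefault(object_id, set()).add(client_id)
        return merged

    def broadcast_next(self):
        """Serve the oldest pending object; one broadcast serves all merged requests."""
        if not self.queue:
            return None
        object_id, clients = self.queue.popitem(last=False)
        return object_id, clients


server = OnDemandBroadcastServer()
server.submit("notebook", "D1")
server.submit("pda", "D2")
merged = server.submit("tablet", "D1")  # same object as the notebook's request
first = server.broadcast_next()         # one transmission of D1 serves both clients
```

In this sketch a single broadcast of D1 satisfies both the notebook and the tablet, which is why a higher merge probability means fewer transmissions for the same number of requests.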
2.2 Related Work
2.2.1 Prior Work Related to On-demand Data Broadcasting
Dykeman et al. pointed out in [10] that traditional FCFS scheduling would produce long average access time for an on-demand broadcast system when the access frequencies of all data items were not uniformly distributed. They proposed several scheduling algorithms and concluded that LWF could provide the best performance among the proposed algorithms. Aksoy et al. pointed out in [3] that although being able to produce the shortest average access time, LWF is not efficient when the number of data requests is large. To address this problem, they proposed algorithm RxW which is able to schedule the received data requests efficiently by employing a pruning technique. Experimental results showed that the performance (i.e., average access time) of RxW is close to that of LWF. Unfortunately, algorithm RxW is designed under the premise that each data item is of the same size. Hence, it is not suitable for variable-sized data items. In [1], Acharya et al. addressed the broadcast scheduling problem in environments with variable-size data items. They defined a new metric, stretch, as the ratio of the response time of a request to its service time. Based on stretch, they proposed a scheduling algorithm, called LTSF, to minimize the stretch. Wu et al. argued that algorithm LTSF is not optimal in terms of overall stretch [27]. In addition, algorithm LTSF is not scalable in a large-scale environment. Therefore, they proposed a scheduling algorithm to optimize the system performance in terms of stretch. Moreover, the proposed scheduling algorithm is more scalable than LTSF, and hence, is suitable for practical use.
Figure 3: Index structure (bucket k consists of an index segment ISk holding index items Ik(1), Ik(2), ..., Ik(d), followed by a data segment DSk holding data items Dk(1), Dk(2), ..., Dk(d))
However, most studies on on-demand data broadcasting focus on the design of scheduling algorithms [1][3], and only a few of them consider the employment of data indexing in on-demand data broadcasting environments [18]. Figure 2a and Figure 2b show examples in which a mobile client issues a data request at time t on broadcast programs without and with data indexing, respectively. In Figure 2a and Figure 2b, the time periods marked as ‘A’ and ‘D’ are the periods in which the mobile device is in active and doze mode, respectively. Since the sizes of index items are much smaller than those of data items, employing data indexing is able to greatly reduce the average tuning time at the cost of a slight increase in the average access time.
In [18], Lee et al. proposed a data indexing method in an on-demand data broadcasting environment.
As shown in Figure 3, the proposed broadcast program is partitioned into a series of buckets and each bucket contains an index segment and a data segment. The number of the index items in an index segment is equal to the number of data items in the corresponding data segment in the same bucket.
In bucket Bk, the i-th index item (i.e., Ik(i)) contains (1) the identifier and the version number of the corresponding data item in bucket Bk (i.e., Dk(i)), (2) the time offset at which Dk(i) will be broadcast and (3) the size of Dk(i). The number of index items within an index segment is called the degree of the broadcast program. In [18], the degree of all buckets is fixed, and the experimental results suggest setting the degree of broadcast programs to two for better performance.
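As a rough illustration of this bucket layout, the sketch below builds the index segment for a list of data items. It is not the construction used in [18]: the names are hypothetical, offsets are assumed to be measured in size units from the start of the bucket, and every index item is assumed to occupy one unit.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    object_id: str
    version: int
    size: int  # size in arbitrary units (assumption)


@dataclass
class IndexItem:
    object_id: str
    version: int
    offset: int  # units from the bucket start to the data item (assumption)
    size: int


INDEX_ITEM_SIZE = 1  # hypothetical size of one index item


def build_bucket(data_items):
    """Return (index_segment, data_segment) for one bucket.

    The index segment has one index item per data item, so the degree of
    the bucket equals len(data_items).
    """
    index_segment = []
    # The data segment starts right after the index segment.
    offset = INDEX_ITEM_SIZE * len(data_items)
    for item in data_items:
        index_segment.append(
            IndexItem(item.object_id, item.version, offset, item.size)
        )
        offset += item.size
    return index_segment, list(data_items)


# A bucket of degree two, the setting suggested by the experiments in [18].
index_segment, data_segment = build_bucket(
    [DataItem("D1", 2, 10), DataItem("D2", 1, 20)]
)
```

A client that reads the index segment can then stay in doze mode until the offset of the item it needs, which is the source of the tuning time reduction.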
2.2.2 Prior Work Related to Transcoding Proxy
Han et al. proposed in [13] an image transcoding proxy which is able to control the data retrieval time to meet users’ requirements. The proposed transcoding proxy can adaptively adjust the sizes of the objects transmitted to users by using an aggressive lossy compression method. They also presented an analytical framework for determining whether to transcode and how much to transcode an image, and a process used by the transcoding proxy to adapt its image coding to meet an upper bound on the delay tolerated by the end user.
In [7], Cardellini et al. analyzed how network proxies can work collaboratively in content transcoding and caching. They proposed a distributed algorithm to distribute the computation load caused by transcoding throughout a collaborative proxy system. They also proposed two extended strategies to cache data objects. In [8], Chang et al. explored the aggregate effect when caching multiple versions of the same Web object in the transcoding proxy. They argued that the aggregate profit of caching multiple versions of an object is not simply equal to the sum of the profits of caching individual versions, but rather, depends on the transcoding relationships among them. They devised the notion of a weighted transcoding graph and formulated a generalized profit function. Based on the weighted transcoding graph and the generalized profit function, an innovative cache replacement algorithm for transcoding proxies was proposed, and the proposed cache replacement algorithm was shown to perform well in terms of the delay saving ratios and cache hit ratios.
Hsiao et al. proposed the architecture of a versatile transcoding proxy in [14]. Based on the concept of the agent system, the proposed architecture can accept and execute the transcoding preference script provided by the client or the server to transform the corresponding data or protocol according to the user’s specification. Fine granularity control is achieved by building a weighted transcoding graph which depicts the transcoding relationship among transcodable versions dynamically. Based on the weighted transcoding graph, the transcoding proxy performs cache replacement according to the content in the caching candidate set, which is generated by the concept of dynamic programming.
In an earlier study [15], we proposed a QoS-aware transcoding proxy architecture that uses on-demand broadcast to transmit the requested data objects. However, the issue of energy conservation was not considered. Therefore, for energy conservation, we extend in this paper the prior architecture to support data indexing techniques. In addition, we also revise the version decision policy and the service admission control scheme proposed in [15] for better performance.
Figure 4: The architecture of QETP (each cell of the service area is served by a front end consisting of a service manager and a scheduler; the back end consists of a cache manager, a transcoder and storage, and reaches the servers through the Internet)
2.3 System Architecture
Figure 4 shows the proposed architecture of QETP. In a cellular environment, the whole service area of a mobile environment is divided into a number of cells. Two dedicated channels, one control channel and one broadcast channel, are provided in each cell. A control channel is used to transmit control messages such as registration messages, data requests, acknowledgements, and so on. On the other hand, a broadcast channel is used by the transcoding proxy to disseminate data objects to its clients. According to the locations of its components, QETP comprises the following two types of components: front-end and back-end.
A front-end, which comprises a service manager and a scheduler, is allocated to each cell. These two components are described below.
• Service Manager: A service manager is in charge of all service-related operations such as service registration, service termination, service admission control and so on. Each service manager owns a profile database storing the users’ profiles and the profiles of these users’ devices.
• Scheduler: A scheduler is a software component which handles the data requests of the corresponding cell. After receiving a data request, the scheduler will first determine a suitable version for this data request according to the user’s device profile and the network state. Then, the scheduler will check whether the received data request can be merged into an existing data request in the data request queue. Different from the traditional on-demand broadcasting architecture described in Section 2.1, request merge occurs only when the data request queue contains another data request for the same version of the same data object as the received data request. Otherwise, the scheduler will insert the received data request into the data request queue.
In addition, a scheduling algorithm is employed to determine the service order of the data requests in the data request queue. While serving a data request, the scheduler will send this request to the cache manager and the cache manager will respond with the content of the required data object. The scheduler then broadcasts the required data object via the broadcast channel, and serves the next data request in the data request queue. Moreover, the scheduler will broadcast index items through the broadcast channel to reduce the power consumption of mobile clients.
A back-end, which comprises a cache manager and a transcoder, behaves like a traditional transcoding proxy. These two components are described below.
• Cache Manager: After receiving a data request from a scheduler, the cache manager is responsible for returning the required version of the required data object to the scheduler. Suppose that the cache manager receives a data request for the j-th version of data object Di. If the j-th version of Di is cached, the cache manager will return the cached data object to the scheduler immediately. If the j-th version of Di is not cached, the cache manager will check whether there exists another cached version of Di which can be transcoded into the j-th version of Di. If yes, the cache manager will ask the transcoder to generate the j-th version of Di. Otherwise, the cache manager will request the original version of the requested data object from the data server, ask the transcoder to transform the returned data object into the required version, and then transmit the result of transcoding to the scheduler.
• Transcoder: A transcoder is in charge of the transformation of data objects among different versions according to the received transformation requests generated by the cache manager.
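The three cases handled by the cache manager can be sketched as follows. This is only an illustration of the lookup order described above: the function names, the string payloads, the rule that a higher version can be transcoded into a lower one, and the assumption that the original is version 6 are all hypothetical stand-ins, and the sketch does not model cache insertion or replacement.

```python
def serve_request(object_id, version, cache, can_transcode, transcode, fetch_original):
    """Return the requested version of an object, following the three cases."""
    key = (object_id, version)
    # Case 1: the requested version is cached.
    if key in cache:
        return cache[key]
    # Case 2: another cached version can be transcoded into the requested one.
    for (cached_id, cached_version), content in cache.items():
        if cached_id == object_id and can_transcode(cached_version, version):
            return transcode(content, cached_version, version)
    # Case 3: fetch the original from the data server and transcode it.
    original, original_version = fetch_original(object_id)
    return transcode(original, original_version, version)


def can_transcode(src_version, dst_version):
    # Assumption: a higher-quality version can be reduced to a lower one.
    return src_version > dst_version


def transcode(content, src_version, dst_version):
    # Stand-in for real transcoding: relabel the payload with the new version.
    name = content.split("@")[0]
    return f"{name}@v{dst_version}"


fetched = []  # records which objects were requested from the data server


def fetch_original(object_id):
    # Stand-in for the data server; the original is assumed to be version 6.
    fetched.append(object_id)
    return f"{object_id}@v6", 6


cache = {("D1", 4): "D1@v4"}
hit = serve_request("D1", 4, cache, can_transcode, transcode, fetch_original)
from_cache = serve_request("D1", 3, cache, can_transcode, transcode, fetch_original)
from_server = serve_request("D2", 2, cache, can_transcode, transcode, fetch_original)
```

Only the third request reaches the data server in this example; the first two are answered from the cache, directly or after transcoding.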
Since the design of the back-end is similar to the systems proposed in some prior works [7][8][13][25], we focus in this paper on the design of the front-end.
3 Analytical and Transcoding Models
3.1 Analytical Model
In this subsection, we derive the worst case of the average access time¹ of QETP, and use the derived results to propose a version decision policy and a service admission control scheme in Section 4. To facilitate the following discussion, we first make the following assumptions.
1. The employed scheduling scheme of the scheduler is FCFS (standing for first come, first serve).
2. No request merge occurs in the data request queue of the scheduler.
3. One transmission of a data object in the broadcast channel is received by exactly one client.
4. The messages of registration, de-registration and handoff are negligible.
Assumptions 2 and 3 occur when the users’ interests are highly diverse, and hence the effect of on- demand broadcast diminishes. We make these two assumptions since we focus on the worst case of the transcoding proxy. Assumption 4 is made since we focus on the situation that the number of data requests is much higher than the number of control messages (i.e., registration, de-registration, handoff and service termination). These assumptions will be relaxed in our simulation model. For better readability, a list of used symbols is shown in Table 1.
¹ In this paper, we use access time and waiting time interchangeably.

Table 1: Description of symbols
Symbol                 Description
$P_i$                  i-th device profile
$D_j(k)$               k-th version of data item $D_j$
$N_{User}$             Number of users in the cell
$\lambda_{Ctrl.}$      Aggregate request rate in the cell
$\mu_{Ctrl.}$          Service rate of the control channel
$\mu_{Sche.}$          Service rate of the cache
$\mu_{BCast}$          Service rate of the broadcast channel
$\sigma_{Sche.}$       Standard deviation of the service time of the cache
$B_{Ctrl.}$            Bandwidth of the control channel
$B_{BCast}$            Bandwidth of the broadcast channel

Figure 5: The analytical model of the proposed transcoding proxy (Queue 1: control channel, M/M/1; Queue 2: scheduler, M/G/1; Queue 3: broadcast channel, G/M/1)

We model QETP as a queueing network as shown in Figure 5. Queue 2 is a physical queue which is located in the scheduler. In contrast, Queue 1 and Queue 3 are logical queues which are only used to model the control and broadcast channels in order to derive the average waiting time of a data request on the control and broadcast channels, respectively. Suppose that the data requests submitted by a mobile user $i$ follow a Poisson process with rate $\lambda_i$, and $N_{User}$ is the number of mobile users in the cell. To facilitate the following discussion, we number the mobile users in the cell as user 1, 2, ..., $N_{User}$. Since the superposition of independent Poisson processes is again a Poisson process, the aggregate data requests of all mobile users in the cell follow a Poisson process with rate $\lambda_{Ctrl.} = \sum_{i=1}^{N_{User}} \lambda_i$. Denote the sizes of data requests and request acknowledgements as $s_{Ctrl.}$ and $s_{Ack.}$, respectively. Also let $B_{Ctrl.}$ be the bandwidth of the control channel, and let the waiting time of the control channel for a data request (denoted as $W_{Ctrl.}$) be the time interval between the user sending a data request and the user receiving the acknowledgement.
Then, we have the following lemma.
Lemma 1: The average waiting time of the control channel is

$$W_{Ctrl.} = \frac{1}{\dfrac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}} - \lambda_{Ctrl.}}.$$
Proof: Similar to [19], we assume that the time needed to transmit a data request and a request acknowledgement over the control channel follows an exponential distribution with mean $1/\mu_{Ctrl.}$. Hence, the control channel can be modeled as an M/M/1 queue. Then, the average service rate of the control channel is

$$\mu_{Ctrl.} = \frac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}}.$$

Omitting the equation manipulation, which can be found in [12], the approximate average waiting time for each mobile device from submitting a data request to receiving the corresponding request acknowledgement is

$$W_{Ctrl.} = \frac{1}{\mu_{Ctrl.} - \lambda_{Ctrl.}} = \frac{1}{\dfrac{B_{Ctrl.}}{s_{Ctrl.} + s_{Ack.}} - \lambda_{Ctrl.}}. \tag{1}$$
Q.E.D.
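To make Lemma 1 concrete, the following sketch evaluates Equation (1) for one set of inputs. The function name and the numeric values (bandwidth in bits per second, message sizes in bits, request rate in requests per second) are hypothetical and only serve the illustration.

```python
def control_channel_wait(bandwidth, request_size, ack_size, arrival_rate):
    """Average waiting time of the control channel, modeled as an M/M/1 queue."""
    # Service rate: request/acknowledgement pairs the channel can carry per second.
    service_rate = bandwidth / (request_size + ack_size)
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


# Hypothetical setting: 10 kbit/s channel, 80-bit requests, 20-bit acknowledgements,
# and an aggregate rate of 60 requests per second.
w_ctrl = control_channel_wait(
    bandwidth=10_000, request_size=80, ack_size=20, arrival_rate=60
)
```

Here the service rate is 100 requests per second, so the average waiting time is 1/(100 - 60) = 0.025 seconds.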
Let the waiting time of the scheduler for a data request (denoted as $W_{Sche.}$) be, from the scheduler’s perspective, the time interval from the arrival of the data request to the time that the requested data object has been obtained. Since the service time of a cache manager is affected by several factors such as cache status of the required data objects, the employed replacement scheme, the characteristics of the input jobs, and so on, the service time of the cache manager cannot be modeled by a particular mathematical distribution. Therefore, we model the service time of the cache manager as an arbitrary distribution with mean $1/\mu_{Sche.}$ and variance $\sigma_{Sche.}^2$. Let $\rho_{Sche.} = \lambda_{Ctrl.}/\mu_{Sche.}$ be the load of the scheduler.
We then have the following lemma.
Lemma 2: The average waiting time of the scheduler is

$$W_{Sche.} = \frac{1}{\mu_{Sche.}} + \frac{\dfrac{\rho_{Sche.}}{\mu_{Sche.}} + \lambda_{Ctrl.}\,\sigma_{Sche.}^2}{2(1 - \rho_{Sche.})}.$$
Proof: With assumptions 1, 2 and the characteristic of M/M/1 queues, the input process seen by the data request queue of the scheduler is also a Poisson process with rate λCtrl.. When receiving a data
request, the scheduler determines the most suitable version of the requested data object according to the profile of the mobile device and network status, and then inserts the corresponding job (including data object id and the most suitable version number) into the data request queue. To serve a data request, the scheduler passes the job to the cache manager, and the cache manager will retrieve the specified version of the data object requested by the job and return the retrieved data object to the scheduler. Then, the scheduler disseminates the returned data object to its clients via the broadcast channel.
With assumption 2, the processing of the scheduler can be modeled as an M/G/1 queue. Then, as shown in [12], the expected system size in steady state is

$$L_{Sche.} = \rho_{Sche.} + \frac{\rho_{Sche.}^2 + \lambda_{Ctrl.}^2\,\sigma_{Sche.}^2}{2(1 - \rho_{Sche.})}.$$

By Little’s formula, the average waiting time of this queue is

$$W_{Sche.} = \frac{L_{Sche.}}{\lambda_{Ctrl.}} = \frac{1}{\mu_{Sche.}} + \frac{\dfrac{\rho_{Sche.}}{\mu_{Sche.}} + \lambda_{Ctrl.}\,\sigma_{Sche.}^2}{2(1 - \rho_{Sche.})}. \tag{2}$$
Q.E.D.
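Equation (2) can be checked numerically with a short sketch. The function name and the input values are hypothetical. As a sanity check, when the service time is exponential (variance $1/\mu_{Sche.}^2$) the result should reduce to the M/M/1 value $1/(\mu_{Sche.} - \lambda_{Ctrl.})$.

```python
def scheduler_wait(arrival_rate, service_rate, service_time_variance):
    """Average waiting time of the scheduler, modeled as an M/G/1 queue."""
    rho = arrival_rate / service_rate  # load of the scheduler
    if rho >= 1.0:
        raise ValueError("unstable queue: load must be below one")
    queueing_part = (rho / service_rate + arrival_rate * service_time_variance) / (
        2.0 * (1.0 - rho)
    )
    return 1.0 / service_rate + queueing_part


# Exponential service time (variance 1/mu^2): matches the M/M/1 result 1/(mu - lambda).
w_exponential = scheduler_wait(60.0, 100.0, 1.0 / 100.0**2)

# Deterministic service time (variance 0): the same load gives a shorter wait.
w_deterministic = scheduler_wait(60.0, 100.0, 0.0)
```

With these inputs the exponential case gives 0.025 seconds and the deterministic case gives 0.0175 seconds, which shows how the variance of the cache manager's service time affects the waiting time at a fixed load.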
Let the waiting time of the broadcast channel for a data request be the time interval from the time that the requested data object has been obtained by the scheduler to the time that the user has received it. Then, we have the following lemma.
Lemma 3: The average waiting time of the broadcast channel is

$$W_{BCast} = \frac{1}{\mu_{BCast}(1 - r_0)},$$

where $r_0$ is the root of $z = A^*[\mu_{BCast}(1 - z)]$ with value larger than zero and less than one.
Proof: Similar to the proof of Lemma 1, we assume that the time needed to transmit a data object over the broadcast channel follows an exponential distribution with mean $1/\mu_{BCast}$. Since the broadcast channel is a dedicated downlink channel, similar to [19], we have

$$\frac{1}{\mu_{BCast}} = \frac{\text{Average size of the incoming data objects}}{B_{BCast}}. \tag{3}$$
As shown in Figure 5, the input process of the broadcast channel is the output process of the scheduler.
Since the service time of the scheduler (i.e., Queue 2) is an arbitrary distribution, the output process of the scheduler does not follow a particular mathematical distribution. Suppose that the interarrival time of the input process follows an arbitrary distribution with cumulative distribution function A(t). The broadcast channel can be modeled as a G/M/1 queue. Let A∗(z) be the Laplace-Stieltjes transform of A(t). Omitting the mathematical manipulation which can be found in [12], the average waiting time of the broadcast channel (denoted as WBCast) is
$$W_{BCast} = \frac{1}{\mu_{BCast}(1 - r_0)}, \tag{4}$$

where $r_0$ is the root of the following equation with value larger than zero and less than one.

$$z = A^*[\mu_{BCast}(1 - z)] \tag{5}$$
Q.E.D.
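When the Laplace-Stieltjes transform $A^*$ is available in closed form, the root $r_0$ of Equation (5) can be found numerically. The sketch below uses plain fixed-point iteration started at zero, which is one common way to solve this equation and is not necessarily what the authors use. The function names and the two example interarrival distributions are assumptions made for illustration.

```python
import math


def gm1_root(lst, service_rate, tol=1e-12, max_iter=10_000):
    """Find the root of z = A*(mu * (1 - z)) in (0, 1) by fixed-point iteration."""
    z = 0.0  # starting below the root; z = 1 is always a trivial root
    for _ in range(max_iter):
        z_next = lst(service_rate * (1.0 - z))
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


def broadcast_wait(lst, service_rate):
    """Average waiting time of the broadcast channel per Equation (4)."""
    r0 = gm1_root(lst, service_rate)
    return 1.0 / (service_rate * (1.0 - r0))


ARRIVAL_RATE = 60.0   # hypothetical arrivals per second
SERVICE_RATE = 100.0  # hypothetical broadcasts per second


def exponential_lst(s):
    # Transform of exponential interarrival times (Poisson arrivals).
    return ARRIVAL_RATE / (ARRIVAL_RATE + s)


def deterministic_lst(s):
    # Transform of constant interarrival times of length 1 / ARRIVAL_RATE.
    return math.exp(-s / ARRIVAL_RATE)


w_poisson = broadcast_wait(exponential_lst, SERVICE_RATE)
r0_deterministic = gm1_root(deterministic_lst, SERVICE_RATE)
w_deterministic = broadcast_wait(deterministic_lst, SERVICE_RATE)
```

For Poisson arrivals the root is $\lambda/\mu = 0.6$, so the result coincides with the M/M/1 value of 0.025 seconds. In the proxy itself the interarrival distribution of the broadcast channel is not known in closed form, so this sketch only illustrates the lemma.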
Finally, the average waiting time of the whole system (denoted as $W_{Sys.}$) is equal to the sum of the average waiting times of the control channel, the scheduler and the broadcast channel. Then, with Lemmas 1, 2 and 3, $W_{Sys.}$ can be formulated as

$$W_{Sys.} = W_{Ctrl.} + W_{Sche.} + W_{BCast}. \tag{6}$$
3.2 Transcoding Model
Suppose that the mobile devices are classified into several categories based on their capabilities, and the capabilities of each category are described by one device profile. Let Pi be the i-th device profile.
Without loss of generality, we order the device profiles according to their capabilities in ascending order. That is, the capability of Pi is better than that of Pj when i > j. We also let Di(j) be the j-th version of data object Di. Again, we order all versions of a data object according to their quality in ascending order, which means that the quality of Di(j) is better than that of Di(k) when j > k. For each data object, we assume that the data size of a version with higher quality is larger than that of another version with lower quality.
Figure 6: Example device profiles (versions 1 to 6 of a data object and device profiles P1, P2 and P3)
To facilitate the following discussion, the concept of viewable version set is defined below.
Definition 1: A viewable version set of a device profile Pi and a data object Dj (denoted as VVS(i, j)) is the set of versions of Dj which are able to be displayed by mobile devices with profile Pi.
Then, we have the following example.
Example 1: Consider the example shown in Figure 6. Mobile devices are classified into three categories: notebook, PDA and smart phone, and their capabilities are described in device profiles P3, P2 and P1, respectively. In addition, there are six versions of data object Dj. VVS(3, j), VVS(2, j) and VVS(1, j) are {3, 4, 5, 6}, {3, 4} and {1, 2}, respectively. We have VVS(2, j) ⊂ VVS(3, j) since devices with profile P3 (e.g., notebooks) are capable of displaying all versions of Dj viewable by devices with profile P2 (e.g., PDAs). On the other hand, we have VVS(3, j) ∩ VVS(1, j) = ∅ and VVS(2, j) ∩ VVS(1, j) = ∅ since devices with profile P1 (e.g., smart phones) employ special data formats (e.g., WML and WBMP) that are not supported by devices with profiles P2 and P3.
Let the function BEST(i, j) = k (respectively, WORST(i, j) = k) represent that the best (respectively, worst) viewable version of data object Dj for a mobile device with device profile Pi is version k.
In practice, we have BEST(i, j) ≥ BEST(l, j) and WORST(i, j) ≥ WORST(l, j) when i > l. We also have BEST(i, j) = max {VVS(i, j)} and WORST(i, j) = min {VVS(i, j)}.
Example 2: Consider the example shown in Figure 6. The best viewable versions of P3, P2 and P1 are Dj(6), Dj(4) and Dj(2), respectively. As a result, we have BEST(3, j) = 6, BEST(2, j) = 4 and BEST(1, j) = 2. In addition, we also have WORST(3, j) = 3, WORST(2, j) = 3 and WORST(1, j) = 1.
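Examples 1 and 2 can be reproduced with a few lines of code. The dictionary layout and the function names below are hypothetical; the version sets are the ones given in Example 1 for a single data object Dj.

```python
# Viewable version sets of one data object Dj, keyed by device profile index.
VVS = {
    3: {3, 4, 5, 6},  # P3, e.g., notebooks
    2: {3, 4},        # P2, e.g., PDAs
    1: {1, 2},        # P1, e.g., smart phones
}


def best(profile):
    """BEST(i, j): the highest-quality version viewable with profile Pi."""
    return max(VVS[profile])


def worst(profile):
    """WORST(i, j): the lowest-quality version viewable with profile Pi."""
    return min(VVS[profile])


best_versions = {profile: best(profile) for profile in VVS}
worst_versions = {profile: worst(profile) for profile in VVS}
```

A version decision policy then picks, for a request from a device with profile Pi, some version between `worst(i)` and `best(i)` that belongs to the viewable version set.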
Figure 7: The flowchart of scheme ODB-QoS-Index (Start; average waiting time estimation; version decision policy configuration; service admission control scheme configuration; next iteration)
Figure 8: The relationship between load and average access time of a queue (horizontal axis: system load ρ, with marks at ρ1, ρ2 and 1; vertical axis: average access time W; regions labeled LIGHT, FAIR and HEAVY)
When a user registers for the service, the user’s mobile device will transmit the identifications of the user and the corresponding device profile to the server. Suppose that the device profile of the mobile device is Pi. Then, when the mobile user requests Dj, the server will return a suitable version of Dj, say the k-th version of Dj where k ∈ VVS(i, j), according to the result of the underlying version decision policy.
4 Design of Scheme ODB-QoS-Index
An overview of scheme ODB-QoS-Index is given in Section 4.1. The proposed version decision policy and admission control scheme are described in Section 4.2 and Section 4.3, respectively. Finally, the description of the proposed data indexing method is given in Section 4.4.
4.1 Overview
In this paper, we take the average waiting time of the system as the QoS metric. Before executing scheme ODB-QoS-Index, system administrators should specify a QoS requirement by setting two thresholds of average access time, W1 and W2 where W1< W2. The meanings of these two thresh- olds are as follows. The users are guaranteed to receive the best viewable versions of the requested data objects when the average waiting time is smaller than W1. On the other hand, scheme ODB-QoS-Index will try its best to prevent the average waiting time from being larger than W2.
Scheme ODB-QoS-Index is an online, iterative and adaptive algorithm which comprises a version decision policy, a service admission control scheme and a data indexing method. The flowchart of scheme ODB-QoS-Index is shown in Figure 7. Scheme ODB-QoS-Index is executed periodically, and the following three steps are executed in each iteration. First, in the average waiting time estimation step, scheme ODB-QoS-Index measures the average waiting time of each queue according to the analytical results derived in Section 3. Since only Queue 2 is physical, only the average waiting time of Queue 2 (i.e., $W_{Sche.}$) can be directly observed. In view of this, we propose an approximation algorithm to estimate the average waiting times of Queue 1 and Queue 3 (i.e., $W_{Ctrl.}$ and $W_{BCast}$). For better readability, the proposed approximation algorithm is described in Appendix A. Then, scheme ODB-QoS-Index measures the load of each queue based on the estimated average waiting time, and determines the current state of each queue according to the load of each queue. Finally, scheme ODB-QoS-Index configures the version decision policy and the service admission control scheme according to the state of each queue. In addition, a data indexing method is employed by the scheduler to insert index items into the broadcast program to reduce power consumption of mobile clients. The details of scheme ODB-QoS-Index are described in the following subsections.
4.2 Version Decision Policy
4.2.1 Overview
Figure 8 shows the relationship between the average waiting time and the load of a queue. It is intuitive that when the load is larger than or equal to one, the system is not stable since the average waiting time does not converge and grows without bound. In addition, when the load is smaller than one, the average waiting time increases as the load increases, and the rate of increase grows drastically as the load approaches one.
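The shape of the curve in Figure 8 can be illustrated numerically. The sketch below assumes a single M/M/1 queue, which is a textbook simplification chosen only for illustration and is not the queueing model analyzed in Section 3; the function name and the default service rate are likewise hypothetical.

```python
def mm1_time_in_system(load, service_rate=1.0):
    """Mean time in system of an M/M/1 queue: W = 1 / (mu * (1 - rho)).

    Assumption: M/M/1 is used purely as an illustration of how the delay
    blows up near load 1; the paper's queues need not be M/M/1.
    """
    if not 0 <= load < 1:
        raise ValueError("the queue is unstable when the load is >= 1")
    return 1.0 / (service_rate * (1.0 - load))

# The delay grows slowly at light load and sharply as the load nears one.
for rho in (0.5, 0.8, 0.9, 0.99):
    print(f"load={rho:.2f}  W={mm1_time_in_system(rho):.1f}")
```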
With the above observations, the rationale of our scheduling algorithm is to keep the system loads of the scheduler (i.e., Queue 2 in Figure 5) and the broadcast channel (i.e., Queue 3 in Figure 5) smaller than the predetermined thresholds at the cost of degrading the quality of requested data objects. As a consequence, when the load of the scheduler or the load of the broadcast channel is high, for each data request, the system will return a version of quality worse than the best viewable version. This strategy has the following two effects:
1. Decrease the average waiting time of the broadcast channel ($\mu^{-1}_{BCast}$). Since the data size of a data object with lower quality is usually smaller than that of the same data object with higher quality, transmitting data objects with lower quality is able to reduce the load of the broadcast channel ($\rho_{BCast}$).
2. Increase the occurrence probability of request merge. Consider the device profiles shown in Figure 6, and two data requests of Dj for device profiles P2 and P3, respectively. These two data requests will not be merged together when the load of the scheduler or the broadcast channel is light since the system will return the best viewable versions of Dj for P2 and P3, respectively. When the load is heavy, the system decides to return the third version of Dj. Hence, these two data requests can be merged together, and the arrival rates of the input processes of the cache and the broadcast channel decrease. As a result, this strategy is able to reduce the load of the cache ($\rho_{Sche.}$) and the broadcast channel ($\rho_{BCast}$).
The proposed version decision policy consists of three phases: state determination phase, candidate version selection phase and version decision phase. First, in state determination phase, the server determines the states of the scheduler and the broadcast channel according to the loads of the scheduler and the broadcast channel. Then, in candidate version selection phase, several versions, called candidate versions, are selected according to the states of the scheduler and the broadcast channel. Finally, the server decides the resultant version from the candidate versions according to the content of the request queue and the objects stored in the cache.
4.2.2 State Determination Phase
Two thresholds, $\rho^{1}_{Sche.}$ and $\rho^{2}_{Sche.}$ (respectively, $\rho^{1}_{BCast}$ and $\rho^{2}_{BCast}$), are specified to divide the load of the scheduler (respectively, the broadcast channel) into three states: LIGHT, FAIR and HEAVY. Figure 9 shows the state transition diagram of the scheduler. The state transition scenarios are as follows. When the previous state is LIGHT, the current state will transit to FAIR if $\rho_{Sche.} > (1+\alpha) \times \rho^{1}_{Sche.}$. Otherwise, the current state will still be LIGHT. When the previous state is FAIR, the current state will transit to LIGHT when $\rho_{Sche.} < (1-\alpha) \times \rho^{1}_{Sche.}$ and transit to HEAVY when $\rho_{Sche.} > (1+\alpha) \times \rho^{2}_{Sche.}$. Otherwise, the current state will still be FAIR. When the previous state is HEAVY, the current state will transit to FAIR if $\rho_{Sche.} < (1-\alpha) \times \rho^{2}_{Sche.}$. Otherwise, the current state will still be HEAVY. The factor $\alpha$, where $0 < \alpha < 1$, is used to avoid state oscillation. We assume that $(1+\alpha) \times \rho^{2}_{Sche.} < 1$ without loss of generality. To facilitate fine-grained control, system administrators can divide FAIR state into several sub-states. Suppose that there are n sub-states of FAIR state. The interval $(\rho^{1}_{Sche.}, \rho^{2}_{Sche.})$ is then divided into n partitions by n − 1 thresholds, $\rho_{Sche.}(1), \rho_{Sche.}(2), \cdots, \rho_{Sche.}(n-1)$, where

$$\rho_{Sche.}(k) = \rho^{1}_{Sche.} + k \times \frac{\rho^{2}_{Sche.} - \rho^{1}_{Sche.}}{n}.$$

The transition between these sub-states is similar to that between LIGHT, FAIR and HEAVY states. The state transition diagram and transition scenarios of the broadcast channel are as shown in Figure 9 by substituting $\rho^{1}_{BCast}$ and $\rho^{2}_{BCast}$ for $\rho^{1}_{Sche.}$ and $\rho^{2}_{Sche.}$, respectively. The determination of the values of $\rho^{1}_{Sche.}$, $\rho^{2}_{Sche.}$, $\rho^{1}_{BCast}$ and $\rho^{2}_{BCast}$ is described in Appendix B.

Figure 9: State transition diagram (states LIGHT, FAIR and HEAVY; the FAIR state may be divided into sub-states FAIR1, FAIR2, ..., FAIRn)
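The three-state transition rule with the anti-oscillation factor α can be sketched as a small function. This is a minimal illustration of the transitions described above, not the authors' implementation; the function names and the example threshold values are hypothetical, and the sub-state transitions are not modeled beyond computing their thresholds.

```python
def next_state(prev, load, rho1, rho2, alpha):
    """One LIGHT/FAIR/HEAVY transition with hysteresis factor alpha."""
    if prev == "LIGHT":
        return "FAIR" if load > (1 + alpha) * rho1 else "LIGHT"
    if prev == "FAIR":
        if load < (1 - alpha) * rho1:
            return "LIGHT"
        if load > (1 + alpha) * rho2:
            return "HEAVY"
        return "FAIR"
    if prev == "HEAVY":
        return "FAIR" if load < (1 - alpha) * rho2 else "HEAVY"
    raise ValueError(f"unknown state: {prev}")

def substate_thresholds(rho1, rho2, n):
    """Thresholds rho(k) = rho1 + k * (rho2 - rho1) / n for k = 1..n-1."""
    return [rho1 + k * (rho2 - rho1) / n for k in range(1, n)]

# Hypothetical thresholds: rho1 = 0.4, rho2 = 0.8, alpha = 0.1.
state = "LIGHT"
for load in (0.42, 0.45, 0.42, 0.90, 0.75, 0.70):
    state = next_state(state, load, 0.4, 0.8, 0.1)
    print(load, state)
```

Because of the (1 ± α) margins, a load of 0.42 leaves a LIGHT queue in LIGHT but also leaves a FAIR queue in FAIR, which is the intended damping of oscillation around a threshold.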
We also define the aggregate state of the scheduler and the broadcast channel as follows. The aggregate state is LIGHT when the loads of the scheduler and the broadcast channel are both LIGHT. The aggregate state is HEAVY when at least one of the loads of the scheduler and the broadcast channel is HEAVY. Otherwise, the aggregate state is FAIR. In FAIR state, the current sub-state is determined as the heaviest of the current sub-states (i.e., the heaviest load) of the scheduler and the broadcast channel. For each new-coming data request, the scheduler will decide a suitable version, fill the version information into the data request according to the aggregate state, and insert it into the data request queue. The scheduler will also inform the mobile client of the decided version by replying with an acknowledgement message.
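Under the definition above, the aggregate state amounts to taking the heavier of the two component states. The following sketch illustrates this; the string encoding of states ("FAIR1", "FAIR2", and so on) and the helper names are assumptions made for the example.

```python
def rank(state):
    """Order states from lightest to heaviest.

    Assumed encoding: "LIGHT", "HEAVY", and "FAIR<k>" for the k-th
    sub-state of FAIR; a bare "FAIR" is treated as the first sub-state.
    """
    if state == "LIGHT":
        return 0
    if state == "HEAVY":
        return float("inf")
    return int(state[len("FAIR"):] or 1)

def aggregate_state(sched_state, bcast_state):
    """Aggregate state: the heavier of the scheduler and channel states."""
    return max(sched_state, bcast_state, key=rank)

print(aggregate_state("LIGHT", "FAIR2"))
print(aggregate_state("FAIR1", "HEAVY"))
```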
4.2.3 Candidate Version Selection Phase
Let degradation and maxDegradation indicate the suggested and maximal degrees of degradation, respectively. The value of maxDegradation is determined by
$$\mathit{maxDegradation} = \max_{\forall P_k, D_j} \{ BEST(k, j) - WORST(k, j) \}.$$
In candidate version selection phase, the server will determine a proper value of degradation according to the state of the server, and versions BEST (k, j), BEST (k, j) − 1, · · · , BEST (k, j) − degradation are called candidate versions. The procedure in candidate version selection phase is described below.
• Case I: Aggregate state is LIGHT.
The scheduler operates in the traditional on-demand broadcast mode when the aggregate state is LIGHT. Hence, the server guarantees that each client will receive the best viewable versions of the requested data objects. That is, the system will return the BEST (i, j)-th version of Dj when a user requests Dj by a mobile device belonging to device profile Pi. Therefore, the value of degradation is set to zero.
• Case II: Aggregate state is FAIR.
In FAIR state, the quality of the received data objects may be degraded. Suppose that FAIR state consists of n sub-states. Then, the value of degradation is set to $\lceil k \times \mathit{maxDegradation} / (n+1) \rceil$ when the server is in the k-th sub-state of FAIR state.
• Case III: Aggregate state is HEAVY.
When the aggregate state is HEAVY, the server will suggest returning the WORST (i, j)-th version of Dj when a user requests Dj by a mobile device belonging to device profile Pi. Therefore, the value of degradation is set to maxDegradation.
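The three cases can be collected into one function that maps an aggregate state to a value of degradation. This is a sketch under stated assumptions: the state encoding ("FAIR1", "FAIR2", ...) and the dictionary containers for BEST and WORST are hypothetical, and the sample values are those of Example 2.

```python
import math

def max_degradation(best, worst):
    """Maximum of BEST(k, j) - WORST(k, j) over all (profile, object) pairs.

    best and worst are dicts keyed by (k, j); hypothetical containers.
    """
    return max(best[key] - worst[key] for key in best)

def degradation(state, max_deg, n=1):
    """Suggested degradation for an aggregate state such as "FAIR2"."""
    if state == "LIGHT":                   # Case I
        return 0
    if state == "HEAVY":                   # Case III
        return max_deg
    k = int(state[len("FAIR"):] or 1)      # Case II: sub-state index 1..n
    return math.ceil(k * max_deg / (n + 1))

# BEST and WORST values of Example 2 for a single data object j = 0.
BEST = {(1, 0): 2, (2, 0): 4, (3, 0): 6}
WORST = {(1, 0): 1, (2, 0): 3, (3, 0): 3}
max_deg = max_degradation(BEST, WORST)
for s in ("LIGHT", "FAIR1", "FAIR2", "FAIR3", "HEAVY"):
    print(s, degradation(s, max_deg, n=3))
```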
4.2.4 Version Decision Phase
In this phase, the server should pick a proper one from the candidate versions (i.e., BEST (i, j), BEST (i, j) − 1, · · · , BEST (i, j) − degradation). Suppose that the incoming request is for Dj and is issued by a mobile device belonging to device profile Pi. The steps of the decision are as follows.
• Step I: In this step, the server checks the data requests in the request queue. If the request queue contains a data request for Dj, say Req, with version v, BEST (i, j) − degradation ≤ v ≤ BEST (i, j), version v is selected since this incoming request can be merged into Req without increasing the load of the server. The server will perform Step II if there is no such data request in the request queue.
• Step II: In this step, the server checks the objects stored in the cache. If there is an object Dj(v), BEST (i, j) − degradation ≤ v ≤ BEST (i, j), stored in the cache, version v is selected so that the server need neither retrieve Dj from its data server nor perform transcoding. The server will perform Step III if there is no such object in the cache.
• Step III: Select the version v which is covered by the most profiles among versions BEST (i, j), BEST (i, j) − 1, · · · , BEST (i, j) − degradation. Although the server load cannot be reduced by this decision, the probability that successive requests can perform request merge will increase.
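The three steps above can be sketched as a single selection function. Several details are assumptions of this sketch rather than statements of the paper: candidates are clipped at WORST (i, j) so that no unviewable version is chosen, ties are broken in favor of the higher-quality version, and the argument names and container types are hypothetical.

```python
def decide_version(best_v, worst_v, degradation, queued, cached, coverage):
    """Pick a version of D_j for one incoming request.

    best_v, worst_v -- BEST(i, j) and WORST(i, j) of the requesting profile
    queued          -- versions of D_j already pending in the request queue
    cached          -- versions of D_j currently stored in the cache
    coverage        -- dict: version -> number of profiles that can view it
    """
    # Assumption: never go below the worst viewable version.
    lowest = max(worst_v, best_v - degradation)
    candidates = range(best_v, lowest - 1, -1)   # best quality first

    # Step I: merge into a pending request for the same version.
    for v in candidates:
        if v in queued:
            return v
    # Step II: reuse a cached object, avoiding retrieval and transcoding.
    for v in candidates:
        if v in cached:
            return v
    # Step III: the candidate viewable by the most profiles.
    return max(candidates, key=lambda v: coverage.get(v, 0))

# Hypothetical coverage counts for profile P3 of Example 2.
cov = {6: 1, 5: 1, 4: 2, 3: 2}
print(decide_version(6, 3, 3, {3}, {4, 5}, cov))     # pending request wins
print(decide_version(6, 3, 3, set(), {4, 5}, cov))   # cached object wins
print(decide_version(6, 3, 3, set(), set(), cov))    # widest coverage wins
```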
4.3 Service Admission Control Scheme
The proposed service admission control scheme consists of two phases: state determination phase and admission control phase. To perform service admission control, the server first determines the state of the control channel in state determination phase, and then determines whether to grant a service registration or a service handoff in admission control phase. The procedures of these two phases are described in the following subsections.
4.3.1 State Determination Phase
The proposed service admission control scheme is employed in each service manager to determine whether to grant a service registration or a service handoff by considering the number of users in service, the network status, and so on. The rate at which service registrations are blocked is called the service blocking rate (abbreviated as SBR), while the rate at which service handoffs are forced to terminate is called the service dropping rate (abbreviated as SDR). The rationale of our service admission control scheme is to keep the system load of the control channel (i.e., Queue 1 in Figure 5) smaller than the predetermined thresholds at the cost of increasing SBR and SDR. To achieve this, two thresholds, $\rho^{1}_{Ctrl.}$ and $\rho^{2}_{Ctrl.}$, where $\rho^{1}_{Ctrl.} < \rho^{2}_{Ctrl.} < 1$, are specified to divide the load of the control channel into three states: LIGHT, FAIR and HEAVY. The state transition diagram and transition scenarios of the service manager are as shown in Figure 9 by substituting $\rho^{1}_{Ctrl.}$ and $\rho^{2}_{Ctrl.}$ for $\rho^{1}_{Sche.}$ and $\rho^{2}_{Sche.}$, respectively. Similarly, the determination of $\rho^{1}_{Ctrl.}$ and $\rho^{2}_{Ctrl.}$ is described in Appendix B.
4.3.2 Admission Control Phase
Although the proposed version decision policy can reduce the loads of the scheduler and the broadcast channel, its effect is limited since it depends on several factors such as the locality of data requests, the cache size and so on. As a consequence, in addition to the load of the control channel, the service admission control scheme should also take the loads of the scheduler and the broadcast channel into consideration. The procedure in admission control phase is as follows.
When the load of the control channel is HEAVY, the server will block all service registrations and drop all service handoffs in order to relieve the server load. When the load of the control channel is FAIR or LIGHT, the server will determine the values of two probabilities, ProbBlock and ProbDrop. Then, a service registration will be blocked with probability ProbBlock, while a service handoff will be dropped with probability ProbDrop. It is the system administrators’ responsibility to specify how to determine the values of ProbBlock and ProbDrop. Let curStateCtrl. be the current state of the control channel, and let curStateAgg. be the aggregate state of the scheduler and the broadcast channel. Note that SBR should be sacrificed first since mobile users can tolerate a service registration being blocked better than a service handoff being forced to terminate (i.e., dropped). Therefore, in each combination of curStateCtrl. and curStateAgg., ProbBlock should be larger than or equal to ProbDrop. An example setting for determining ProbBlock and ProbDrop in an environment with three sub-states in FAIR state is given in Table 2.
Table 2: An example setting for determining ProbBlock and ProbDrop (each entry is ProbBlock/ProbDrop)

                        curStateAgg.
curStateCtrl.    LIGHT    FAIR1     FAIR2     FAIR3        HEAVY
LIGHT            0/0      0/0       0.33/0    0.66/0.15    1/0.3
FAIR             0/0      0.25/0    0.5/0     0.75/0.3     1/0.6

Consider the case that the server decides to reject a service registration or a service handoff since the server’s load cannot afford it. If the owner of the service registration or the service handoff, say user i, has the same interest as other users using the service, granting this service registration or service handoff will not increase the server load since all of user i’s requests are expected to be able to be merged into other users’ requests. Hence, to decrease SBR and SDR, the server should grant user i’s service registration or service handoff. From the above example, we observe that we can aggressively grant a service registration or a service handoff as long as the owner and other users are of common interest.
To measure the similarity of interest of user i and other users using the service, we define the similarity factor as the probability that a user’s request will be merged into another request. When receiving a data request, the server will check whether the data request is merged into another request and update the user’s similarity factor stored in the user’s profile. The system administrators have to specify a threshold β, 0 ≤ β ≤ 1, so that a service registration or a service handoff will be granted (even if the server cannot afford it) as long as the value of the owner’s similarity factor is larger than or equal to β.
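The admission control phase, including the similarity-factor override, can be sketched as follows. The probabilities are the example values of Table 2. Two points are assumptions of this sketch: the override is applied before the HEAVY check on the control channel (the text leaves the precedence open), and the function and argument names are hypothetical.

```python
import random

# (ProbBlock, ProbDrop) from Table 2, keyed by
# (control-channel state, aggregate state).
TABLE2 = {
    ("LIGHT", "LIGHT"): (0.0, 0.0),   ("FAIR", "LIGHT"): (0.0, 0.0),
    ("LIGHT", "FAIR1"): (0.0, 0.0),   ("FAIR", "FAIR1"): (0.25, 0.0),
    ("LIGHT", "FAIR2"): (0.33, 0.0),  ("FAIR", "FAIR2"): (0.5, 0.0),
    ("LIGHT", "FAIR3"): (0.66, 0.15), ("FAIR", "FAIR3"): (0.75, 0.3),
    ("LIGHT", "HEAVY"): (1.0, 0.3),   ("FAIR", "HEAVY"): (1.0, 0.6),
}

def admit(kind, ctrl_state, agg_state, similarity, beta, rng=random.random):
    """Return True to grant a "registration" or "handoff" request."""
    # Assumed precedence: users whose requests are likely to be merged
    # are granted regardless of the current load.
    if similarity >= beta:
        return True
    # A HEAVY control channel blocks registrations and drops handoffs.
    if ctrl_state == "HEAVY":
        return False
    prob_block, prob_drop = TABLE2[(ctrl_state, agg_state)]
    reject_prob = prob_block if kind == "registration" else prob_drop
    return rng() >= reject_prob

print(admit("handoff", "LIGHT", "FAIR2", similarity=0.1, beta=0.8))
```

In every cell of the table ProbBlock is at least ProbDrop, so a registration is never more likely to be granted than a handoff under the same load.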
4.4 Data Indexing
As shown in [18], setting the degree of broadcast programs to a smaller value will make mobile devices meet index segments more quickly, thus reducing energy consumption. However, this is true only when turning WNIs on and off does not consume energy. As pointed out in [24], in reality turning the WNIs on and off consumes some time and energy, and the transition times of a WNI from active mode to doze mode and from doze mode to active mode are both on the order of tens of milliseconds.
Consider the two organizations of index and data items shown in Figure 10. Note that the time periods marked as ‘A’ and ‘D’ indicate the time periods in which the mobile device is in active and doze mode, respectively, while the time periods marked as ‘F’ and ‘N’ indicate the time periods in which the mobile device is turning off and turning on the wireless network interfaces (abbreviated as WNIs), respectively. Suppose that a mobile device tunes to the broadcast channel at time tStart and finishes the retrieval of the desired data item at time tEnd. As observed in Figure 10, when the value of degree of broadcast programs decreases, mobile devices will switch back and forth between active and doze modes (i.e., turn on and turn off WNIs) more frequently, and therefore, may consume more energy. As a result, the value of degree of broadcast programs should be set to a proper value to minimize energy consumption of mobile devices.

Figure 10: Example organizations of index and data items. (a) Example broadcast program with degree four: I1 I2 I3 I4 D1 D2 D3 D4. (b) Example broadcast program with degree one: I1 D1 I2 D2 I3 D3 I4 D4. In both panels, the mobile device retrieves D3 between tStart and tEnd.
In view of this, we adopt an adaptive data indexing method [16] which is able to dynamically adjust the degree of broadcast programs according to system workload. The employed data indexing method consists of two phases, statistics collection phase and degree adjustment phase, and switches back and forth between these two phases periodically. In statistics collection phase, the system collects the arrival time, finish time and other statistical information of each served data request. Then, in the successive degree adjustment phase, the server determines a proper value of degree of broadcast programs according to the collected information. In the interest of space, we omit the description of the determination of the value of degree of broadcast programs. Interested readers can refer to [16] for details.
After determining the current value of degree of broadcast programs, the server then generates the broadcast program accordingly. Since the data items are of different sizes, we use the parameter budget, which is defined as the maximal length of the data segments of all buckets, to control the length of each bucket. Initially, the bucket is empty and the scheduler fetches as many data items as possible from the cache under the constraint that the summation of the sizes of the fetched data items is smaller than or equal to budget. In addition, the scheduler marks the fetched data items as LOCKED. Then, the scheduler inserts the corresponding index items in front of these data items. Finally, the scheduler broadcasts the index and data items in the bucket sequentially. An index item or a data item is removed from the bucket once it has been broadcast. In addition, the state of a data item which has been broadcast is marked as UNLOCKED. The above procedure repeats until the bucket becomes empty. To employ data indexing, the cache replacement policy should also be modified to consider only data items in UNLOCKED state as the candidates to be replaced.
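The bucket construction and broadcast loop described above can be sketched as follows. The greedy fetch in cache order is an assumption (the text does not say in which order items are considered), and the item representation, function names and send callback are hypothetical.

```python
def fill_bucket(cache_items, budget):
    """Fetch data items whose total size fits within budget and lock them.

    cache_items -- list of dicts with keys "id", "size" and "state"
    Returns the bucket: index items first, then the data items.
    Assumption: items are considered greedily in cache order.
    """
    fetched, used = [], 0
    for item in cache_items:
        if used + item["size"] <= budget:
            item["state"] = "LOCKED"      # excluded from cache replacement
            fetched.append(item)
            used += item["size"]
    index_items = [("index", item["id"]) for item in fetched]
    data_items = [("data", item["id"]) for item in fetched]
    return index_items + data_items

def broadcast(bucket, cache_by_id, send):
    """Send items in order; unlock each data item once it has been sent."""
    while bucket:
        kind, item_id = bucket.pop(0)
        send(kind, item_id)
        if kind == "data":
            cache_by_id[item_id]["state"] = "UNLOCKED"

cache = [
    {"id": "a", "size": 4, "state": "UNLOCKED"},
    {"id": "b", "size": 5, "state": "UNLOCKED"},
    {"id": "c", "size": 3, "state": "UNLOCKED"},
]
bucket = fill_bucket(cache, budget=8)
sent = []
broadcast(bucket, {item["id"]: item for item in cache},
          lambda kind, item_id: sent.append((kind, item_id)))
print(sent)
```

A cache replacement policy built on top of this sketch would only consider items whose state is UNLOCKED, so that an item scheduled in the current bucket cannot be evicted before it is broadcast.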
4.5 Remarks
Currently, the proposed version decision policy and service admission control scheme are designed with the goal of reducing the overall average waiting time and average tuning time. Therefore, if two users submit two data requests (each user submits one request) for the same data object at the same time, their priorities and version numbers will be the same.
It is possible to implement differentiated QoS control in the proposed architecture. For example, we can add a classifier in front of the scheduler to classify the received data requests according to some administrator-specified rules. Hence, the version decision policy is able to assign their version numbers according to their classes. In addition, when processing a service registration or a service handoff, the server first classifies the service according to the user’s profile, and then takes action according to the user’s class. Consider the case that the server receives two service registrations. Suppose that one is submitted by a VIP user, and the other is submitted by a normal user. The latter will be rejected if the server can accept only one service registration.
5 Performance Evaluation
To evaluate the performance of scheme ODB-QoS-Index, we build an event-driven simulator with SIM [5]. In order to measure the reduction of power consumption of scheme ODB-QoS-Index, we also implement scheme ODB-QoS which only employs the proposed version decision policy and service admission control scheme. Both scheme ODB-QoS-Index and scheme ODB-QoS are executed period-