
AIDOA: An Adaptive and Energy-Conserving Indexing Method for On-Demand Data Broadcasting Systems

Jiun-Long Huang

Department of Computer Science National Chiao Tung University

Hsinchu, Taiwan, ROC E-mail: jlhuang@cs.nctu.edu.tw

Abstract

Since only a modest improvement in battery lifetime is expected in the next few years, energy conservation has become a key factor in the design of mobile devices. In view of this, we propose in this paper an energy-conserving on-demand data broadcasting system that employs a data indexing technique.

Different from the prior work, the power consumption of turning the wireless network interfaces on and off is considered. In addition, we employ a server cache to reduce the effect of the time needed to retrieve data items from the corresponding data servers. Specifically, we first analyze the access time and tuning time of data requests and propose algorithm AIDOA to adjust the degree of buckets according to the system workload. Several experiments are then conducted to evaluate the performance of algorithm AIDOA. Experimental results show that algorithm AIDOA greatly reduces power consumption at the cost of a slight increase in average access time, and dynamically adjusts the index and data organization to adapt to changes in the system workload.

Keywords: data indexing, on-demand data broadcasting, energy conservation, mobile information system


1 Introduction

Owing to the constraints resulting from power-limited mobile devices and low-bandwidth wireless networks, designing a power-conserving mobile information system with high scalability and high bandwidth utilization has become an important research issue, and hence attracts a significant amount of research attention. In recent years, data broadcasting has been proposed to address this challenge and has been recognized as a promising data dissemination technique in mobile computing environments [1][4][5][10][11]. Most research works on data broadcasting focus on generating a proper broadcast program or designing scheduling algorithms to minimize the average access time, which is defined as the average time elapsed from the moment a client issues a query to the moment the desired data item is read.

As shown in [17][19], only a modest improvement (about 20% ∼ 30%) in battery lifetime is expected in the next few years. Hence, energy conservation has become a key factor in the design of mobile devices.

Consider a Nokia 5510, which supports AAC and MP3 playback. Compared to the power consumed by music playback, the wireless network interface (abbreviated as WNI) consumes much more energy (as much as 70% of the total power in a Nokia 5510) [22]. Hence, reducing the power consumption of WNIs is an effective means to reduce the overall power consumption. Most devices can operate in two modes: active mode and doze mode. Many studies show that the power consumed in active mode is much higher than that consumed in doze mode. For example, a typical wireless PC card, ORiNOCO, consumes 60 mW in doze mode and 805 ∼ 1400 mW in active mode [19]. As a consequence, in order to reduce power consumption, mobile devices should stay in doze mode as long as possible.

To evaluate the effect of data indexing algorithms on energy conservation, tuning time, defined as the time that a mobile device operates in active mode in order to retrieve a data item, is introduced in [12]. Since employing data indexing unavoidably introduces some overhead in access time, data indexing algorithms should reduce tuning time as much as possible while producing only an acceptable increase in access time. Since the size of an index item is usually much smaller than that of a data item, the increase in access time is usually small. As a result, many research works study the design of data indexing algorithms in push-based data broadcasting environments [20][21]. However, most studies on on-demand data broadcasting focus on the design of scheduling algorithms [2][5] to reduce average access time, and only a few of them consider the employment of data indexing in on-demand data broadcasting


Figure 1: Index structure (Bucketi consists of an index segment ISi holding the index items Ii(1), Ii(2), ..., Ii(d) and a data segment DSi holding the data items Di(1), Di(2), ..., Di(d))

environments [13] to reduce average tuning time.

In [13], Lee et al. proposed an indexing algorithm for on-demand data broadcast systems. As shown in Figure 1, the proposed broadcast program is made up of a series of buckets, and each bucket consists of one index segment and one data segment. A data segment contains a series of data items, while an index segment consists of the index items of the data items in the corresponding data segment. For a bucket, the number of data items in the corresponding data segment is called the degree of the bucket. The information in an index item, say Ii(1), consists of the identifier of the corresponding data item Di(1), the data size of Di(1) and the time that Di(1) in bucket i will be broadcast on the broadcast channel. In addition, from the information in the current index segment, a mobile device is able to determine the broadcast time of the index segment of the next bucket.
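This bucket layout can be sketched as a small data structure (a hypothetical illustration; the class and field names are ours, not from [13]):

```python
from dataclasses import dataclass

@dataclass
class IndexItem:
    data_id: int           # identifier of the corresponding data item
    data_size: int         # size of the data item
    broadcast_time: float  # when the data item appears on the channel

@dataclass
class Bucket:
    index_segment: list    # index items, one per data item in the bucket
    data_segment: list     # the data items themselves

    @property
    def degree(self) -> int:
        # the degree of a bucket is the number of data items it carries
        return len(self.data_segment)

bucket = Bucket(
    index_segment=[IndexItem(7, 1024, 3.5), IndexItem(9, 1024, 4.0)],
    data_segment=[b"item-7", b"item-9"],
)
print(bucket.degree)  # → 2
```

Because every index item carries the broadcast time of its data item, a client that has read the index segment knows exactly when to wake up.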

Although inserting index items into the broadcast program significantly reduces the average tuning time at the cost of a slight increase in average access time [13], the data indexing method proposed in [13] has the following drawbacks:

• Does not consider the power consumption of turning the WNIs on and off.

As pointed out in [18], turning the WNIs on and off consumes some time and energy, and the transition times of a WNI from active mode to doze mode and from doze mode to active mode are both on the order of tens of milliseconds. Consider the two organizations of index and data items shown in Figure 2.¹ Suppose that a mobile device tunes to the broadcast channel at time tStart and finishes the retrieval of the desired data item at time tEnd. Without considering the power consumption of turning the WNI on and off, the power consumptions of organization one and organization two are equal. However, when the power consumption of turning the WNIs on and off is considered, organization two outperforms organization one.

¹ The descriptions of the symbols 'A', 'D', 'F' and 'N' are given in Table 1 in Section 3.2.


Figure 2: Example organizations of index and data items. (a) Organization one interleaves index and data items (I1 D1 I2 D2 I3 D3 I4 D4), so the device repeatedly turns its WNI off and on while waiting for D3 between tStart and tEnd. (b) Organization two broadcasts all index items before all data items (I1 I2 I3 I4 D1 D2 D3 D4), so the device dozes in a single contiguous interval before retrieving D3.

Therefore, we argue that the design of an energy-conserving data indexing method should take the power consumption of turning the WNIs on and off into account to obtain a precise power consumption estimation. To the best of our knowledge, there is no prior work on data indexing in on-demand broadcasting that considers the power consumption of turning the WNIs on and off, thereby distinguishing our paper from others.

• Does not consider the data fetch time.

Most studies on indexing in on-demand data broadcasting are under the premise that all data items are immediately available to the data broadcasting system [13]. However, as pointed out in [6], the data fetch time cannot be neglected since it is infeasible to store all data items in the local cache of the system. Hence, traditional data broadcasting systems [5] may not perform well. As a consequence, we argue that an indexing algorithm for on-demand data broadcasting should also consider the data fetch time in order to attain higher efficiency.

• Does not adapt to changes in the system workload.

In mobile computing environments, schemes with a static degree may not be able to adapt to changes in the system workload. This phenomenon shows the necessity of designing an adaptive algorithm that dynamically adjusts the degree of buckets to match the system workload. To the best of our knowledge, all prior works on data indexing in on-demand broadcasting employ a static degree, and none of them is able to adapt to changes in the system workload.


In view of this, we propose in this paper an energy-conserving on-demand data broadcasting system that employs the data indexing technique. Different from the prior work on data indexing in on-demand data broadcasting, the power consumption of turning the WNIs on and off is considered. Specifically, we first analyze the access time and tuning time of data requests and propose algorithm AIDOA to adjust the degree of buckets according to the system workload. In essence, algorithm AIDOA consists of two phases, the statistics collection phase and the adjustment phase, and switches back and forth between these two phases periodically. The system collects statistical information on all served data requests in the statistics collection phase, and the collected information is used to adjust the degree of buckets in the adjustment phase according to the derived analytical results. In addition, we employ a server cache to eliminate the performance degradation caused by the data fetch time. We also propose a program generation algorithm and a cache replacement policy to cooperate with algorithm AIDOA. Several experiments are then conducted to evaluate the performance of algorithm AIDOA. Experimental results show that, due to the dynamic adjustment of the degree of buckets, the scheme using algorithm AIDOA outperforms schemes with a static degree in most cases.

The rest of this paper is organized as follows. Section 2 describes the proposed system architecture and the power consumption model used in this paper. Section 3 shows the analytical model of the proposed system architecture. Based on the analytical model, we propose algorithm AIDOA in Section 4. In addition, the companion program generation algorithm and cache replacement policy are proposed in Section 5. Experimental results are shown in Section 6 to evaluate the performance of algorithm AIDOA, and finally, Section 7 concludes this paper.

2 Preliminaries

2.1 System Architecture

We adopt the index structure proposed in [13], which is shown in Figure 1. As shown in Figure 3, the proposed system architecture consists of the following components.

• Scheduler: The scheduler is in charge of receiving and processing the data requests submitted by mobile devices. After receiving a data request, say Reqi, the scheduler will search the ready queue,


Figure 3: System architecture (a scheduler with its request queue, ready queue and pending list; a fetcher with a cache that retrieves data items from data servers via the Internet; and a program generator that broadcasts the current bucket of index and data items on the broadcast channel, while data requests arrive on the request channel)

the pending list and the request queue sequentially to check whether there exists a data request, say Reqj, with the same required data item as Reqi. When Reqj is in the pending list, the scheduler merges Reqi into Reqj. When Reqj is in the ready queue (respectively, the request queue), the scheduler merges Reqi into Reqj and updates the priorities of all data items in the ready queue (respectively, the request queue) according to the employed scheduling algorithm, such as FIFO, LWF, RxW and so on. Otherwise, when Reqj does not exist, the scheduler inserts Reqi into the request queue and updates the priorities of all data items in the request queue according to the employed scheduling algorithm.
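The scheduler's lookup-and-merge rule can be sketched as follows (a deliberate simplification: each queue is a dict mapping a data-item id to the number of merged requests, and priority updates are omitted):

```python
def handle_request(item_id, ready_queue, pending_list, request_queue):
    """Search the ready queue, pending list and request queue in turn for a
    request on the same data item; merge into it if found, else enqueue."""
    for queue in (ready_queue, pending_list, request_queue):
        if item_id in queue:
            queue[item_id] += 1   # merge: one more client wants this item
            return "merged"
    request_queue[item_id] = 1    # no existing request: insert a new one
    return "inserted"

ready, pending, requests = {}, {5: 1}, {}
print(handle_request(5, ready, pending, requests))  # → merged
print(handle_request(8, ready, pending, requests))  # → inserted
```

Merging is what lets one broadcast of a popular item serve many clients at once, which is the main scalability advantage of on-demand broadcasting.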

• Fetcher: The fetcher repeatedly retrieves the data request with the highest priority from the request queue and fetches the required data item from the corresponding data server via the Internet. A cache is employed to reduce the performance degradation caused by the data item fetch time. To fetch a data item, the fetcher first checks whether the required data item is stored in the local cache. If so, the fetcher marks the cached data item as LOCKED and inserts the data request into the ready queue. Then, the fetcher retrieves the data request with the highest priority from the request queue and repeats the above procedure.


Otherwise, when the desired data item is not cached, the fetcher submits a data request message to the data server of the required data item and inserts the data request into the pending list. The fetcher then repeats the above procedure until the number of pending data requests reaches a predetermined threshold or the request queue is empty.

When a data server responds with a data item, the fetcher will retrieve the corresponding data request from the pending list and insert the data request into the ready queue. In addition, the fetcher will insert the received data item into the cache. Several cached data items may be replaced by the employed replacement policy when the free space of the cache is not enough to store the received data item.

• Program generator: The program generator employs a program generation algorithm to compose the buckets of broadcast programs. After a bucket is generated, the index and data items in the bucket are broadcast sequentially. The program generator starts to compose the next bucket after all index and data items in the current bucket have been broadcast.

2.2 Power Consumption Model

Denote the time for a mobile device to switch the WNI from doze mode to active mode as TOn and the time to switch the WNI from active mode to doze mode as TOff. To evaluate the power consumption of turning the WNIs on and off, we assume that the power a mobile device consumes during a time interval TOn (respectively, TOff) is equal to that of a mobile device staying in active mode for time α1 × TOn (respectively, α2 × TOff). Similar to [22], the values of α1 and α2 can be obtained by profiling.

Denote the traditional (i.e., without considering the turning-on and turning-off time of the WNIs) average tuning time of a data request as TTuning. To evaluate the overall power consumption, we define the effective tuning time of a data request as TTuningEff = TTuning + n1 × α1 × TOn + n2 × α2 × TOff, where n1 and n2 are the numbers of times the WNI is turned on and turned off, respectively, and TTuning is the traditional tuning


Figure 4: Categories of buckets (between tStart and tEnd the mobile device encounters the probe bucket, zero or more search buckets, and the retrieval bucket)

time. To ease the presentation, we use the term tuning time to refer to effective tuning time, and assume α1 = α2 = 1 in the rest of this paper.
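The effective tuning time defined above is straightforward to compute; a minimal sketch (times in milliseconds; the function name is ours):

```python
def effective_tuning_time(t_tuning, n_on, n_off, t_on, t_off, alpha1=1, alpha2=1):
    """T_Tuning^Eff = T_Tuning + n1 * alpha1 * T_On + n2 * alpha2 * T_Off."""
    return t_tuning + n_on * alpha1 * t_on + n_off * alpha2 * t_off

# A request with 2000 ms of traditional tuning time that turns the WNI on
# and off three times each, with 20 ms transitions and alpha1 = alpha2 = 1:
print(effective_tuning_time(2000, 3, 3, 20, 20))  # → 2120
```

The extra 120 ms illustrates why the transition costs cannot be ignored: a scheme that dozes more often is not automatically cheaper.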

3 Analytical Model

3.1 Client Access Protocol

After submitting a data request, a mobile client retrieves the desired data item according to the employed client access protocol. We adopt the client access protocol described in [20], which consists of the following phases.

• Initial probe phase: After submitting a data request, the mobile device tunes to the broadcast channel and listens on the broadcast channel to wait for the appearance of an index segment.

• Index search phase: The mobile device enters the index search phase after retrieving an index segment. In the index search phase, the mobile device determines whether the desired data item will be broadcast in the corresponding data segment. If not, the mobile device switches to doze mode and then switches back to active mode when the next index segment is broadcast. Otherwise, the mobile device enters the data retrieval phase.

• Data retrieval phase: If the desired data item will be broadcast in the current data segment, the mobile device will retrieve the time that the desired data item will be broadcast from the current index segment and switch to doze mode. Then, when the desired data item is broadcast, the mobile device will switch back to active mode and retrieve the desired data item.
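The three phases can be illustrated with a toy walk over a broadcast cycle (our own simplification: each bucket is just the list of data-item ids named by its index segment, and the client tunes in at the start of the first bucket shown):

```python
def buckets_touched(buckets, wanted):
    """Classify the buckets a client touches while waiting for `wanted`:
    the probe bucket, zero or more search buckets, and the retrieval
    bucket.  When the item is already in the first bucket, the probe and
    retrieval buckets coincide (a Type I request in Section 3.2)."""
    roles = []
    for i, bucket in enumerate(buckets):
        if wanted in bucket:      # the index segment announces the item
            roles.append("retrieval")
            break
        roles.append("probe" if i == 0 else "search")
    return roles

# The client wants item 9, which is announced in the third bucket:
print(buckets_touched([[1, 2], [3, 4], [9, 10]], 9))
# → ['probe', 'search', 'retrieval']
```

In every search bucket the device is awake only for the index segment, which is where the tuning-time savings of indexing come from.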

Consider the example shown in Figure 4, in which a mobile device submits a data request. Let tStart be the time that the mobile device starts to listen on the broadcast channel after submitting the data request, and


tEnd be the time that the mobile device receives the desired data item. According to the employed client access protocol, the buckets within the time interval from tStart to tEnd can be divided into the following three categories:

• Probe bucket: The bucket in which tStart lies is called the probe bucket. In Figure 4, Bucket(i) is the probe bucket. There is only one probe bucket for each data request.

• Search bucket: A bucket whose index segment is retrieved by the mobile device and whose data segment is skipped by the mobile device is called a search bucket. In Figure 4, Bucket(i + 1), Bucket(i + 2), · · ·, Bucket(j − 1) are all search buckets. For a data request, there may be zero, one or multiple search bucket(s).

• Retrieval bucket: The bucket in which tEnd lies is called the retrieval bucket. That is, the retrieval bucket is the bucket in which the mobile device retrieves the desired data item. In Figure 4, Bucket(j) is the retrieval bucket. For each data request, there is only one retrieval bucket. In addition, the probe bucket and the retrieval bucket of a data request may be the same or different.

3.2 Derivations of Access Time and Tuning Time

To facilitate the following derivations, we have the following assumptions:

• All data items are of equal size SD.

• The time to broadcast a data item (i.e., SD/B) is larger than TOn + TOff.

Note that both assumptions are not limitations of algorithm AIDOA and are made only to ease the derivations in Section 3 and Section 4. Hence, they will be relaxed in Section 5 and Section 6.

In Bucket(i), denote the moments that the mobile device starts to turn on and turn off the WNI as tWakeUp(i) and tSleep(i), respectively. In addition, we denote the starting time and the ending time of Bucket(i) as Bucket(i).Start and Bucket(i).End, respectively. For a data request, we also partition the time interval from tStart to tEnd into several segments, and each segment is marked as 'A', 'D', 'F' or 'N'.

The descriptions of these four symbols are given in Table 1.

According to the relationship of the probe and retrieval buckets, a data request belongs to one of the following two types.


Symbol | Description
A | The mobile device is in active mode
D | The mobile device is in doze mode
F | The mobile device is turning off its WNI
N | The mobile device is turning on its WNI

Table 1: The symbols of time frames

Figure 5: A probe bucket in a Type I data request (in Bucket(i), the device is active from tStart until tSleep(i), turns the WNI off, dozes, and turns it back on at tWakeUp(i) to retrieve the desired data item at tEnd)

3.2.1 Type I: The probe and retrieval buckets are the same

As shown in Figure 5, in a Type I data request, tStart and tEnd are within the same bucket. In addition, according to the employed client access protocol, tStart must be located in the index segment; otherwise, tStart and tEnd would not be in the same bucket, which conflicts with the definition of Type I data requests. In order to minimize power consumption, tSleep(i) is determined as the moment that the mobile device has finished retrieving the corresponding index item of the desired data item, and tWakeUp(i) is determined as the moment that the mobile device has to start to turn on the WNI in order to retrieve the desired data item.

We observe from Figure 5 that one Type I data request will increase the aggregate access time of all data requests by tEnd − tStart. On the other hand, the contribution of a Type I data request to the aggregate tuning time of all data requests is determined by the length of the time interval (tSleep(i), tWakeUp(i)). When tWakeUp(i) − tSleep(i) > TOff, the data request will increase the aggregate tuning time by

tSleep(i) − tStart + tEnd − tWakeUp(i) + TOn + TOff
= tSleep(i) − tStart + SD/B + TOn + TOff.

Otherwise, when tWakeUp(i) − tSleep(i) ≤ TOff (i.e., the mobile device must start to turn on the WNI before the WNI has been turned off), the time interval (tSleep(i), tWakeUp(i)) is too short to turn off and then turn on


Figure 6: Probe buckets in a Type II.I and a Type II.II data request (in a Type II.I request tStart lies in the index segment of Bucket(i), so the device dozes between tSleep(i) and tWakeUp(i); in a Type II.II request tStart lies in the data segment, so the device stays active until the index segment of Bucket(i + 1))

the WNI. Hence, the data request will increase the aggregate tuning time by tEnd − tStart.
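The two cases above can be folded into one helper (a sketch under the equal-size assumption of Section 3.2; all arguments are times except s_d and b, the data-item size and channel bandwidth):

```python
def type1_tuning_contribution(t_start, t_end, t_sleep, t_wakeup,
                              t_on, t_off, s_d, b):
    """Tuning-time contribution of a Type I data request."""
    if t_wakeup - t_sleep > t_off:
        # long enough doze window: active until t_sleep, doze, then pay
        # the transition costs and receive the data item (S_D / B)
        return (t_sleep - t_start) + s_d / b + t_on + t_off
    # window too short to finish turning the WNI off: stay active
    return t_end - t_start

# 1 s of index listening, a 4 s doze window, 20 ms transitions,
# and a 1 MB item on a 1 MB/s channel:
print(type1_tuning_contribution(0.0, 6.0, 1.0, 5.0, 0.02, 0.02, 1.0, 1.0))
```

Shrinking the doze window below TOff flips the request into the second case, where dozing buys nothing.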

3.2.2 Type II: The probe and retrieval buckets are different

The time interval (tStart, tEnd) of a Type II data request consists of one probe bucket; zero, one or multiple search bucket(s); and one retrieval bucket. Next, we derive the contributions of the probe bucket, the search buckets and the retrieval bucket of a Type II data request, separately, to the aggregate access time and aggregate tuning time of all data requests.

Probe bucket Consider the example shown in Figure 6. According to the location of tStart, Type II data requests can be divided into the following two subtypes.

Type II.I: tStart is in the index segment.

Consider a Type II.I data request. Since the desired data item is not in the probe bucket (i.e., Bucket(i)), the probe bucket of a Type II.I data request will increase the aggregate access time of all data requests by Bucket(i + 1).Start − tStart.

On the other hand, to maximize power saving, the mobile device should start to turn off the WNI after retrieving the last index item in ISi, and must turn on the WNI by Bucket(i + 1).Start to retrieve the first index item in ISi+1. Hence, tWakeUp(i) is equal to Bucket(i + 1).Start − TOn. As a consequence, a Type II.I data request will increase the aggregate tuning time of all data requests by tSleep(i) − tStart + TOff + TOn.

Type II.II: tStart is in the data segment.

When tStart is in the data segment, according to the employed client access protocol, the mobile device has to listen on the broadcast channel to wait for the appearance of the index segment of the next bucket (i.e., ISi+1). Hence, in Bucket(i), the mobile device is in active mode from tStart to Bucket(i + 1).Start, and


Figure 7: A search bucket in a Type II data request (in Bucket(k), the device is active for the index segment Ik(1) ... Ik(d), turns the WNI off at tSleep(k), and turns it back on at tWakeUp(k) so that it is active again at Bucket(k + 1).Start)

Figure 8: A retrieval bucket in a Type II data request (in Bucket(j), the device retrieves index items until the desired one, dozes from tSleep(j) to tWakeUp(j), and retrieves the desired data item at tEnd)

the contributions of the probe bucket of a Type II.II data request on aggregate access time and aggregate tuning time are both Bucket(i + 1).Start − tStart.

Search bucket Consider the example shown in Figure 7. In a search bucket, the mobile device operates in active mode to retrieve the index segment and starts to turn off the WNI after retrieving all index items in the index segment. Then, the mobile device has to start to turn on the WNI to ensure that the mobile device just enters active mode on Bucket(k + 1).Start. Hence, in a search bucket Bucket(k), the contributions on aggregate access time and aggregate tuning time of all data requests are

Bucket(k + 1).Start − Bucket(k).Start = d × (SI + SD)/B,

and

tSleep(k) − Bucket(k).Start + TOn + TOff = d × SI/B + TOn + TOff,

respectively.

Retrieval bucket Consider the example shown in Figure 8. In the retrieval bucket, the mobile device retrieves the index items in the index segment sequentially until the index item of the desired data item has been retrieved. Then, the mobile device starts to turn off the WNI and waits for the appearance of the desired


data item. In order to retrieve the desired data item, the mobile device has to start to turn on the WNI so that it enters active mode at the moment the desired data item is being broadcast. Hence, the retrieval bucket of a Type II data request will increase the aggregate access time by tEnd − Bucket(j).Start. In addition, the retrieval bucket of a Type II data request will increase the aggregate tuning time by

tSleep(j) − Bucket(j).Start + tEnd − tWakeUp(j) + TOn + TOff
= tSleep(j) − Bucket(j).Start + SD/B + TOn + TOff,

when tWakeUp(j) − tSleep(j) > TOff. Otherwise, the data request increases the aggregate tuning time by tEnd − Bucket(j).Start.

With the above discussions, for a Type II data request, its contributions to the aggregate access time and tuning time are equal to the sums of the access time and tuning time contributions, respectively, of its probe bucket, search buckets and retrieval bucket.
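Summing the three per-bucket contributions gives the Type II totals; a sketch under the same equal-size assumption (probe and retrieval contributions are passed in as already computed, and n_search is the number of search buckets):

```python
def type2_contributions(probe_at, probe_tt, n_search, d, s_i, s_d, b,
                        t_on, t_off, retr_at, retr_tt):
    """Aggregate access- and tuning-time contributions of one Type II
    request: probe bucket + n_search search buckets + retrieval bucket."""
    search_at = n_search * d * (s_i + s_d) / b           # whole buckets
    search_tt = n_search * (d * s_i / b + t_on + t_off)  # index segments only
    return probe_at + search_at + retr_at, probe_tt + search_tt + retr_tt

# Two search buckets of degree 4, 1 KB index and 100 KB data items on a
# 100 KB/s channel, 20 ms transitions:
total_at, total_tt = type2_contributions(1.0, 0.5, 2, 4, 1.0, 100.0, 100.0,
                                         0.02, 0.02, 2.0, 1.2)
print(total_at, total_tt)
```

Note the asymmetry: each search bucket costs a full bucket of access time but only an index segment plus two transitions of tuning time, which is the trade-off AIDOA balances when choosing d.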

4 AIDOA: Adaptive Index and Data Organizing Algorithm

With the analysis in Section 3, we propose in this section algorithm AIDOA (standing for Adaptive Index and Data Organizing Algorithm) to dynamically adjust the degree of buckets according to the system workload. Basically, algorithm AIDOA consists of two phases, the statistics collection phase and the degree adjustment phase, and switches between them periodically. In the statistics collection phase, the server keeps track of information on all data requests, and the recorded information is used to guide the adaptation procedure in the successive execution of the degree adjustment phase.

4.1 Statistics Collection Phase

In each execution of the statistics collection phase, the server collects statistical information on all data requests served in the current execution of the statistics collection phase. A data request is served when the desired data item has been broadcast.


Two data structures, StatI and StatII, are defined to store the collected information of Type I and Type II (including Type II.I and Type II.II) data requests, respectively. The details of StatI and StatII are as follows.

Details of StatI

• ReqNo: The number of Type I data requests served in the current statistics collection phase

• AggAT: The aggregate access time of Type I data requests served in the current statistics collection phase

• AggTT: The aggregate tuning time of Type I data requests served in the current statistics collection phase

Details of StatII

• ReqNo: The number of Type II data requests served in the current statistics collection phase

• AggATP/AggTTP: The aggregate access/tuning time of the probe buckets of Type II data requests served in the current statistics collection phase

• AggATS/AggTTS: The aggregate access/tuning time of the search buckets of Type II data requests served in the current statistics collection phase

• AggATR/AggTTR: The aggregate access/tuning time of the retrieval buckets of Type II data requests served in the current statistics collection phase

Each field of StatI and StatII, except ReqNo, has an average version named by replacing the prefix Agg with Avg. For example, the field AvgAT of StatI indicates the average access time of all Type I data requests served in the current statistics collection phase. We also define the structure Request to represent data requests which are merged together. Elements in the request queue, the pending list and the ready queue are all instances of structure Request. An instance of structure Request is said to be in the server when it is in the request queue, the pending list or the ready queue. The details of structure Request are as follows.

Details of structure Request

• ReqNo: The number of data requests which are merged together and are represented by the instance of Request


• AvgTIS: Average Time In Search buckets of the data requests represented by the instance of Request

After receiving a data request, the server first determines the type of the data request. If the data request belongs to Type I, the server calculates the contributions of the data request to the aggregate access time and tuning time based on the analysis in Section 3.2.1, and updates StatI accordingly. Since a Type I data request can be served by the current bucket, it will neither be merged into a structure Request nor be inserted into the request queue, the ready queue or the pending list.

On the other hand, when the data request belongs to Type II, the server first checks whether it can be merged into an instance of structure Request in the server. If so, the server updates the fields (i.e., ReqNo and AvgTIS) of the instance of structure Request accordingly. Otherwise, the server creates a new instance of structure Request and inserts the instance into the request queue. Finally, the server calculates the contribution of the probe bucket of the data request to the aggregate access time and tuning time according to the derivations in Section 3.2.2, and updates StatII accordingly.

When an instance of Request, say r, is retrieved from the ready queue², the server first calculates the average number of search buckets that each data request in r has by

AvgSBNo ← (Bucket(j).Start − r.AvgTIS) / (d × (SD + SI)).

The contributions of these search buckets to the aggregate access time and aggregate tuning time can be obtained from the derivations in Section 3.2.2, and StatII.AggATS and StatII.AggTTS are updated accordingly. The server then calculates the time that the desired data item of r can be retrieved (i.e., tEnd). Finally, with tEnd, the server calculates the aggregate contributions of the retrieval buckets of all data requests in r to the aggregate access time and tuning time according to the derivations in Section 3.2.2, and updates StatII.AggATR and StatII.AggTTR accordingly. The algorithmic form of the procedure to update StatII when an instance of structure Request is served is as follows.

Procedure RequestServed(Request r)

1: StatII.ReqNo ← StatII.ReqNo + r.ReqNo
2: AvgSBNo ← (Bucket(j).Start − r.AvgTIS) / (d × (SD + SI))
3: StatII.AggATS ← StatII.AggATS + (d × (SI + SD)/B) × AvgSBNo × r.ReqNo
4: StatII.AggTTS ← StatII.AggTTS + (d × SI/B + TOn + TOff) × AvgSBNo × r.ReqNo

² Readers can refer to Section 5 to see how the system retrieves instances of Request from the ready queue.


5: Calculate tEnd of r
6: StatII.AggATR ← StatII.AggATR + (tEnd − Bucket(j).Start) × r.ReqNo
7: Let TTR be the tuning time of r in the retrieval bucket
8: StatII.AggTTR ← StatII.AggTTR + TTR × r.ReqNo

4.2 Degree Adjustment Phase

In each execution of the degree adjustment phase, the server adjusts the degree (i.e., the value of d) of buckets according to the statistical information collected in the preceding execution of the statistics collection phase. Let TAccess(d) and TTuning(d) be the average access time and average tuning time, respectively, when the degree of the broadcast programs is d. For each field, the value of the average version is equal to the value of the aggregate version divided by the number of data requests. For example, the value of StatI.AvgTT is equal to StatI.AggTT / StatI.ReqNo. Then, according to the analysis in Section 3, we have

TAccess(d) = WI × StatI.AvgAT + WII × (StatII.AvgATP + StatII.AvgATS + StatII.AvgATR), and

TTuning(d) = WI × StatI.AvgTT + WII × (StatII.AvgTTP + StatII.AvgTTS + StatII.AvgTTR),

where WI and WII are the weights of Type I and Type II data requests, respectively. The values of WI and WII are defined as the ratios of the numbers of Type I and Type II data requests. Hence, we have

WI = StatI.ReqNo / (StatI.ReqNo + StatII.ReqNo), and

WII = StatII.ReqNo / (StatI.ReqNo + StatII.ReqNo).
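These weighted averages can be computed directly from the collected statistics; a minimal sketch in which plain dicts stand in for StatI and StatII:

```python
def weighted_times(stat1, stat2):
    """Compute T_Access(d) and T_Tuning(d), weighting the Type I and
    Type II averages by their request-count ratios W_I and W_II."""
    n1, n2 = stat1["ReqNo"], stat2["ReqNo"]
    w1, w2 = n1 / (n1 + n2), n2 / (n1 + n2)
    t_access = w1 * stat1["AvgAT"] + w2 * (stat2["AvgATP"]
               + stat2["AvgATS"] + stat2["AvgATR"])
    t_tuning = w1 * stat1["AvgTT"] + w2 * (stat2["AvgTTP"]
               + stat2["AvgTTS"] + stat2["AvgTTR"])
    return t_access, t_tuning

stat1 = {"ReqNo": 25, "AvgAT": 4.0, "AvgTT": 1.0}
stat2 = {"ReqNo": 75, "AvgATP": 2.0, "AvgATS": 3.0, "AvgATR": 1.0,
         "AvgTTP": 0.5, "AvgTTS": 0.5, "AvgTTR": 0.5}
print(weighted_times(stat1, stat2))  # → (5.5, 1.375)
```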

In addition, TOverAll(d) is employed as the metric of system performance, and is defined as

TOverAll(d) = β × TAccess(d) + (1 − β) × TTuning(d).

In the above equation, β is an administrator-specified parameter reflecting the relative importance of the average access time (i.e., TAccess(d)) and the average tuning time (i.e., TTuning(d)); hence, there is no single optimal setting of β. The objective of the degree adjustment phase is to determine the new value of d that minimizes TOverAll(d). However, since globally minimizing TOverAll(d) is difficult, algorithm AIDOA is designed to

(17)

find the new value of d, say dNext, where TOverAll(dNext) is local minimum. That is, we will find a value of dNext so that TOverAll(dNext) is smaller than TOverAll(dNext+ 1) and TOverAll(dNext− 1). Since the exact val- ues of TAccess(dNext) and TTuning(dNext) when dNext 6= dCurr.cannot be obtained from the collected statistic information, we adopt the following approximation method to estimate TAccess(dNext) and TTuning(dNext).

Let StatIdNext and StatIIdNext be the approximations of the values of structure StatI and StatII when the degree of buckets is dNext. Then, we have the following lemmas:

Lemma 1 StatI^dNext.AvgAT and StatI^dNext.AvgTT can be approximated by

StatI^dNext.AvgAT = StatI.AvgAT + (dNext − dCurr) × SI/B

and

StatI^dNext.AvgTT = StatI.AvgTT,

respectively.

Lemma 2 StatII^dNext.AvgATP and StatII^dNext.AvgTTP can be approximated by

StatII^dNext.AvgATP = SI/(SI + SD) × StatII.I^dNext.AvgATP + SD/(SI + SD) × StatII.II^dNext.AvgATP,

and

StatII^dNext.AvgTTP = SI/(SI + SD) × StatII.I^dNext.AvgTTP + SD/(SI + SD) × StatII.II^dNext.AvgTTP,

respectively, where

StatII.I^dNext.AvgATP = StatII.AvgATP + (dNext − dCurr) × (SI/B + SD/B),

StatII.I^dNext.AvgTTP = StatII.AvgTTP + (dNext − dCurr) × SI/B,

StatII.II^dNext.AvgATP = StatII.AvgATP + (dNext − dCurr) × SD/B, and

StatII.II^dNext.AvgTTP = StatII.AvgTTP + (dNext − dCurr) × SD/B.

(18)

As mentioned in Lemma 2, changing the degree of the buckets from dCurr to dNext increases the numbers of index and data items in each probe bucket of Type II data requests by dNext − dCurr. Suppose that these extra index and data items come from the search buckets. Then, we have

Lemma 3 StatII^dNext.AvgATS and StatII^dNext.AvgTTS can be approximated by

StatII^dNext.AvgATS = AvgSBNoNext × dNext × (SI + SD)/B,

and

StatII^dNext.AvgTTS = AvgSBNoNext × (dNext × SI/B + TOff + TOn),

respectively, where

AvgSBNoNext = StatII.AvgATS × B / (dNext × (SI + SD)) − (dNext − dCurr)/dNext.

Lemma 4 StatII^dNext.AvgATR and StatII^dNext.AvgTTR can be approximated by

StatII^dNext.AvgATR = StatII.AvgATR + (dNext − dCurr) × SI/B,

and

StatII^dNext.AvgTTR = StatII.AvgTTR,

respectively.
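To make the extrapolation concrete, the following is a minimal sketch of the Lemma 1 (Type I) case: collected averages at the current degree are shifted by the change in probe-bucket contents. The parameter names (s_i for the index-item size, b for the broadcast bandwidth) are illustrative assumptions, not symbols taken verbatim from the system implementation.

```python
# Sketch: extrapolate the Type I averages of Lemma 1 to a candidate
# degree d_next. s_i is the index-item size and b the broadcast
# bandwidth (illustrative names/units).

def approx_type1(avg_at, avg_tt, d_curr, d_next, s_i, b):
    # Each extra index item in the probe bucket delays data access
    # by s_i / b, so the access time shifts linearly with the degree.
    new_avg_at = avg_at + (d_next - d_curr) * s_i / b
    # Per Lemma 1, the tuning time of a Type I request is unaffected.
    new_avg_tt = avg_tt
    return new_avg_at, new_avg_tt
```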

The approximations of TAccess(dNext) and TTuning(dNext) can then be calculated from these approximated statistics. From the above lemmas, we have the following observations:

1. Increasing the value of the degree increases the average tuning time in probe buckets, since the number of data items in a bucket increases. On the other hand, it also reduces the aggregate tuning time of the search buckets, since the average number of search buckets decreases. The increase of average tuning time in probe buckets and the decrease of aggregate tuning time of search buckets are thus, respectively, the cost and the benefit of increasing the degree.

2. To minimize average tuning time, decreasing the value of the degree is encouraged when the average access time is short. This is because decreasing the degree reduces the average tuning time in probe buckets at the price of a slight increase in the aggregate tuning time of the search buckets, which results from the increase in the average number of search buckets.

We then devise procedure DegreeAdjustment to find a value of dNext at which TOverAll(dNext) is a local minimum. In procedure DegreeAdjustment, the server first checks whether increasing or decreasing the degree reduces the value of TOverAll. It then repeatedly increases or decreases the degree by one until TOverAll(dNext) reaches a local minimum. Finally, the system sets the value of the degree (i.e., dCurr) to the return value of procedure DegreeAdjustment. The algorithmic form of procedure DegreeAdjustment is as follows.

Procedure DegreeAdjustment
Note: The new value of d (i.e., dNext) is returned
1: if (TOverAll(dCurr + 1) < TOverAll(dCurr)) then
2:   δ ← 1
3: else if (TOverAll(dCurr − 1) < TOverAll(dCurr)) then
4:   δ ← −1
5: else
6:   return dCurr
7: dNext ← dCurr
8: while (TOverAll(dNext + δ) < TOverAll(dNext)) do
9:   dNext ← dNext + δ
10: return dNext
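The hill-climbing procedure above can be sketched as follows. The bounds d_min/d_max are illustrative guards that the pseudocode leaves implicit, and t_overall stands for any callable that estimates TOverAll(d) from the lemma-based approximations.

```python
# Sketch of procedure DegreeAdjustment: pick a direction (+1 or -1)
# that lowers the overall cost, then keep stepping while the cost
# decreases, stopping at a local minimum.

def degree_adjustment(t_overall, d_curr, d_min=1, d_max=64):
    if t_overall(d_curr + 1) < t_overall(d_curr):
        delta = 1
    elif t_overall(d_curr - 1) < t_overall(d_curr):
        delta = -1
    else:
        return d_curr                      # already a local minimum
    d_next = d_curr
    while (d_min <= d_next + delta <= d_max
           and t_overall(d_next + delta) < t_overall(d_next)):
        d_next += delta
    return d_next
```

For a convex cost such as (d − 5)², the procedure converges to d = 5 from either side.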

4.3 Complexity Analysis

To derive the worst-case time complexity of algorithm AIDOA, we consider the case that no request merge occurs. Suppose that the number of received requests in one execution of the statistics collection phase is n. Then, the time complexity of one execution of the statistics collection phase is O(n), since the time complexity of one execution of procedure RequestServed is O(1). Suppose that the maximal value of the degree is dMax. The time complexity of procedure DegreeAdjustment is O(dMax). Since algorithm AIDOA executes procedure DegreeAdjustment once in each execution of the degree adjustment phase, the time complexity of one execution of the degree adjustment phase is O(dMax). To implement algorithm AIDOA, we only have to spend storage space to store structures StatI and StatII. Since the sizes of StatI and StatII are fixed and independent of n, the space complexity of algorithm AIDOA is O(1).

5 Design of Program Generation Algorithm and Cache Replacement Policy

After determining the new value of the degree, the program generator generates the successive buckets accordingly. Since data items may be cached in the server cache, the adopted program generation algorithm should cooperate with the employed cache replacement policy. Each cached data item is initially marked as LOCKED, and only the cached data items in the UNLOCKED state are candidates for replacement. To facilitate the design of the cache replacement policy, the system maintains a min-heap Cand which stores all data items in the UNLOCKED state according to their priorities. The definition of the priority of a data item is given later in this section. Note that in this and the following section, we relax the assumption that all data items are of the same size, and denote the size of Di as size(Di) and the average data size as SD.

The server maintains a list bucket which contains the index items and data items of the current bucket. Initially, bucket is empty. The server then retrieves dCurr data items from the head of the ready queue, inserts them into bucket and marks them as LOCKED. In addition, the corresponding index items of the data items in bucket are also inserted into bucket. The server then broadcasts the index items and data items in bucket sequentially. Once an item has been broadcast, it is removed from bucket; if the item is a data item, it is marked as UNLOCKED. Once bucket becomes empty, the server retrieves the next dCurr data items from the head of the ready queue and repeats the above procedure. The algorithmic form of the proposed program generation algorithm is as follows.

Algorithm ProgramGeneration
1: while (true) do
2:   bucket ← BucketGeneration()
3:   while (bucket is not empty) do
4:     item ← the head of bucket
5:     Remove the head of bucket
6:     Broadcast item
7:     if (item is a data item) then
8:       Mark item as UNLOCKED
9:       Calculate the priority of item and insert item into Cand

Procedure BucketGeneration
1: bucket ← empty


2: for (i = 1 to dCurr) do
3:   if (ready queue is empty) then
4:     break
5:   Fetch a data item (denoted as item) from the head of the ready queue
6:   Mark item as LOCKED
7:   Append item to bucket
8: Insert the corresponding index items of the data items in bucket at the head of bucket
9: return bucket
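A minimal executable sketch of the bucket loop above: up to d data items are taken from the ready queue, locked, prefixed with one index item each, and unlocked as soon as they have been broadcast. The representation (strings for data items, ("idx", item) tuples for index items) is an illustrative assumption, not the system's actual wire format.

```python
# Sketch of BucketGeneration and the broadcast loop of
# ProgramGeneration, with LOCKED/UNLOCKED bookkeeping.

from collections import deque

def generate_bucket(ready_queue, degree):
    # Take up to `degree` data items from the head of the ready queue.
    data = [ready_queue.popleft()
            for _ in range(min(degree, len(ready_queue)))]
    locked = set(data)                   # locked items are not evictable
    index = [("idx", item) for item in data]  # one index item per data item
    return index + data, locked

def broadcast_bucket(bucket, locked, send, unlocked):
    for item in bucket:
        send(item)                       # broadcast index/data item
        if item in locked:               # a data item finished broadcasting
            locked.discard(item)
            unlocked.append(item)        # now a candidate for replacement
```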

We now consider the design of the server cache. Similar to other cache replacement policies, we define an evict function to determine the cache priorities of all data items. The profit of caching a data item is defined as the overall data fetch time saved when the data item is cached, and the cost of caching a data item is defined as the size of the data item. The cache replacement policy is designed to maximize the aggregate profit of all cached data items under the limitation on their aggregate cost (i.e., size). Hence, the cache priority of a data item Di is defined as

priority(Di) = fetch(Di) × rate(Di) / size(Di),

where fetch(Di) is the time for the server to fetch Di from the data server of Di, and rate(Di) is the request rate of Di. When retrieving Di from the corresponding data server, the server calculates the value of fetch(Di) and stores it for later use. The server also stores the time of the previous cache hit of Di, denoted as tPrevHit(Di). On each cache hit of Di, rate(Di) is set to

rate(Di) = 1 / (tCurHit − tPrevHit(Di)),

where tCurHit is the time of the current cache hit of Di. After the request rate of Di has been updated, tPrevHit(Di) is set to tCurHit.
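The rate estimation and the priority formula can be sketched as below. The dictionary fields (fetch, rate, prev_hit, size) are illustrative names for the per-item bookkeeping the text describes.

```python
# Sketch: per-item cache bookkeeping. rate(Di) is re-estimated from the
# gap between consecutive cache hits; priority weighs the fetch-time
# saving per unit of cache space.

def on_cache_hit(entry, now):
    entry["rate"] = 1.0 / (now - entry["prev_hit"])  # request-rate estimate
    entry["prev_hit"] = now

def priority(entry):
    # fetch-time saved per unit time, per byte of cache occupied
    return entry["fetch"] * entry["rate"] / entry["size"]
```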

The proposed cache replacement policy is as follows. When a data item, say Di, is retrieved from the data server, it is inserted into the cache. When inserting Di, the server first checks whether the cache has enough free space for Di. If so, the system stores Di in the cache, calculates priority(Di) and marks Di as LOCKED. Otherwise, the system repeatedly removes the data item with the smallest priority among all data items in Cand until the free space of the cache becomes sufficient, and then stores Di in the cache, calculates priority(Di) and marks Di as LOCKED.

The algorithmic form of the proposed cache replacement policy is as follows.


Parameter                     Value
Data object number            4000
Data object sizes             Lognormal dist. (mean 7 KB)
Data access probabilities     Zipf dist. with parameter 0.75
Cache capacity                0.01 × Σ object size
Object fetch delay            Exponential dist. with μ = 2.3 seconds
Client number                 250
Service holding time          Exponential dist. with μ = 10 minutes
Service re-establishing time  Exponential dist. with μ = one hour

Table 2: Default system parameters

Algorithm CacheReplacement(Di)
1: while (FreeSpace < size(Di)) do
2:   Let Dj be the data item with the smallest priority among all data items in Cand
3:   Remove Dj from Cand and from the cache
4:   FreeSpace ← FreeSpace + size(Dj)
5: Insert Di into the cache
6: Calculate priority(Di)
7: Mark Di as LOCKED

Suppose that the data items in Cand are organized as a min-heap, and let nReplace be the number of data items to be replaced. The time complexity of one execution of algorithm CacheReplacement is then O(nReplace × log |Cand|).
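A minimal sketch of the eviction loop using Python's heapq-based min-heap is shown below. It covers only the while-loop of algorithm CacheReplacement; lazy deletion and priority updates on cache hits are omitted for brevity, and the data structures (a dict mapping item name to size, a heap of (priority, name) pairs) are illustrative assumptions.

```python
# Sketch: evict lowest-priority UNLOCKED items (the Cand heap) until
# `needed` bytes of free space are available.

import heapq

def make_room(cand_heap, cache, free_space, needed):
    """Pop victims in ascending priority order and reclaim their sizes."""
    while free_space < needed and cand_heap:
        _, victim = heapq.heappop(cand_heap)  # smallest priority first
        free_space += cache.pop(victim)       # reclaim the victim's bytes
    return free_space
```

Each eviction costs O(log |Cand|) for the heap pop, matching the O(nReplace × log |Cand|) bound stated above.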

6 Performance Evaluation

6.1 Simulation Model

We take LWF (standing for Longest Wait First) as the underlying scheduling algorithm to prioritize the data requests in the request queue and the ready queue. The server provides one request channel and one broadcast channel with network bandwidths of 38.4 Kbps and 384 Kbps, respectively. Analogously to [8], we assume that there are 4000 data objects and that the sizes of data objects follow a lognormal distribution with a mean of 7 KBytes. The size of a data request message and of an index item is set to 128 bytes. The times to turn on and turn off the WNIs are both set to 30 ms. The access probabilities of data objects follow a Zipf distribution, which is widely adopted as a model for real Web traces [3][7]. The parameter of the Zipf distribution is set to 0.75 with reference to the analyses of real Web traces [7][15]. Since small objects are much more frequently accessed than large ones [9], we assume that there is a negative correlation between the size of an object and its access probability. The default capacity of the cache is set to 0.01 × Σ object size, and the fetch delays of data objects follow an exponential distribution with mean 2.3 seconds [8]. Similar to [16], the number of users in the network is set to 250. The service holding time and service re-establishing time of each user follow exponential distributions with means of 10 minutes and one hour, respectively. The service re-establishing time is defined as the time interval between the moment a user terminates the service and the moment the user establishes the service again. We also assume that the inter-arrival time of the data requests of each user follows an exponential distribution with mean 10 seconds [14]. The value of β is set to 0.5 to simulate an environment in which average access time and average tuning time are of equal importance.
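For reproducibility, the Zipf-distributed access probabilities used in the simulation model can be generated as below. This follows the common convention P(i) ∝ (1/i)^θ with θ = 0.75; whether the paper uses exactly this normalization is an assumption.

```python
# Sketch: Zipf access probabilities for n objects with parameter theta,
# assuming P(i) proportional to (1/i)^theta for rank i = 1..n.

def zipf_probs(n, theta=0.75):
    weights = [1.0 / (i ** theta) for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```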

In order to evaluate the performance of the proposed degree adjustment method in algorithm AIDOA, the algorithm proposed in [13] (referred to as algorithm Static) is modified to cooperate with the cache replacement policy and the program generation algorithm proposed in Section 5. Hence, algorithm AIDOA and algorithm Static differ only in the ability to adjust the degree of the buckets. Based on algorithm Static, we devise two schemes, Static-2 and Static-8, which set the degree of buckets to 2 and 8, respectively; these degrees remain fixed throughout the simulation. In addition, scheme AIDOA employs algorithm AIDOA and initializes the degree of buckets to two; hence, scheme AIDOA dynamically adjusts the degree of buckets according to the system workload. Note that all three schemes employ the server cache to eliminate the performance degradation caused by the data fetch time.

6.2 Effect of Average Data Size

In this experiment, we investigate the effect of average data size on average access time and average tuning time. The average data size is varied from 2 KBytes to 11 KBytes, and the experimental results are shown in Figure 9a and Figure 9b, respectively. Since larger data items increase the load of the broadcast channel, it is intuitive that increasing the average data size increases the average access time. Moreover, when the average data size is large enough, the load of the broadcast channel is high, and a slight further increase in average data size causes a significant increase in average access time. Since index items are much smaller than data items, the effect of the degree of the broadcast programs on average access time is quite small.

[Figure 9: The effect of average data size. (a) Average access time; (b) average tuning time. Curves: Static-2, Static-8, AIDOA; x-axis: average data size from 2 to 11 KB; y-axis: time in seconds.]

Although the degrees of the broadcast programs only slightly affect the average access time of all schemes, they have significant effects on average tuning time. As shown in Figure 9b, scheme Static-8 performs well only when the average data size is large, and scheme Static-2 performs well only when the average data size is small. As observed in Section 4.2, increasing the degree increases the average tuning time in the probe bucket; it also decreases the number of search buckets and hence reduces the average tuning time in the search buckets. When the average data size is large, the average access time is also long, and employing a large degree reduces the average tuning time in the search buckets by reducing their number (i.e., reducing the number of times the WNIs are turned on and off), at the price of a longer average tuning time in the probe bucket. Due to this trade-off, the degree cannot be set too large. Conversely, decreasing the degree reduces the average tuning time in the probe bucket but increases the number of search buckets; therefore, schemes with small degrees outperform schemes with large degrees only when the average data size is small, and the degree cannot be set too small either. Unlike schemes Static-2 and Static-8, scheme AIDOA is able to dynamically adjust the degree to a proper value according to the system workload, and hence outperforms both schemes in most cases.
