TTL prediction schemes and the effects of inter-update time distribution on wireless data access


© 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.


YUGUANG FANG

Department of Electrical and Computer Engineering, University of Florida, 435 Engineering Building, P.O. Box 116130, Gainesville, FL 32611, USA

ZYGMUNT J. HAAS and BEN LIANG∗

Wireless Networks Laboratory, School of Electrical and Computer Engineering, Cornell University, 323 Frank Rhodes Hall, Ithaca, NY 14853, USA

YI-BING LIN

Department of Computer Science and Information Engineering, National Chiao Tung University, Hsinchu, Taiwan

Abstract. Modern mobile networks, such as GPRS and UMTS, support wireless data applications. One successful example is the ever-popular i-Mode in Japan. Wireless data services (the wireless Internet) are becoming more important as more and more users of handheld devices enjoy the convenience of ubiquitous computing. To make wireless data access more effective, time-to-live (TTL) management for data entries becomes important because of its role in effective cache design. In this paper, we study three TTL prediction schemes and investigate the effects of the inter-update time distribution on wireless data access. Performance analysis is carried out via simulations as well as analytical modeling. We expect our results to be useful for future wireless data access systems, in which transmission power for mobile devices is more limited.

Keywords: time-to-live (TTL), weak consistency, wireless data, caching

1. Introduction

Modern mobile networks, such as GPRS and UMTS [8], support wireless data applications. Examples such as the popular i-Mode in Japan have received a great deal of attention due to their success in providing some Internet services. The standard Wireless Application Protocol (WAP) [6,8] is tailored for web access and represents the first step towards the wireless Internet. In the wireless Internet environment, a mobile customer may use a wireless handheld device to access data services from the application server through the mobile network. In fact, mobile users have become the fastest growing community of web users in the last few years. Already, many cellular phones are equipped with web browsing capabilities, and it is predicted that wireless Internet devices will outnumber desktop computers by 2003. As another example, a user may access the Web using a Palm Pilot through a wireless data service such as Omnisky [11]. Omnisky is supported by Cellular Digital Packet Data (CDPD) [4,8], with rates varying from 5 Kbps to 13 Kbps. To provide convenient services to mobile customers, web site personalization techniques have been developed [9] to automatically adapt and personalize web sites for mobile customers.

One of the challenging design tasks in such an environment is how to make data ubiquitously available while minimizing the transmissions from mobile devices (to save battery power). An application running on the wireless handheld device may repeatedly access a data entry received from the application server. If the data entry is not sensitive to time, then the customer may access the data stored in the cache of the wireless handheld device instead of querying the application server, and the expensive wireless transmission overhead is reduced. Effective caching strategies should be used for such applications. If the data entry is sensitive to time, then the current data entry should be provided by the application server. In this case, it is better to push such a data entry to the mobile device before it is queried, because the transmission power of mobile devices tends to be higher, and thus more expensive, than the receiving power. Therefore, it is reasonable to handle time-sensitive and time-insensitive wireless applications differently. One way to do so is to use timers (time-stamps or time-to-live) for data entries.

∗ Corresponding author. Ben Liang is now with the Department of Electrical and Computer Engineering, University of Toronto, Ontario, Canada. E-mail: liang@ece.cornell.edu

Some time-sensitive wireless applications can tolerate a certain degree of inaccuracy (e.g., most web page requests and location-dependent information in wireless applications). For this type of application, we can set an expiration period $t$ to predict when the data entry will be updated. During the period $t$, the data entry in the cache of the handheld device is used. When $t$ expires, the next data access results in a query to the mobile network. In this case, the application is weakly consistent, in that the wireless handheld device may occasionally access stale data. A mechanism is required to predict when a data entry expires. In Apache [1] and Squid [15], a time-to-live (TTL) interval $t$ is defined for data entries stored in the wireless handheld device. The TTL for a data entry is determined based on whether the data entry is modified due to either a mobile query or a server update, which leads to a simple TTL prediction algorithm.

Another important application of data TTL prediction is web page hosting. In web page hosting, a page may be replicated on many servers, so as to spread the access load and reduce congestion. Replication can also lead to more reliable systems. The replica pages need to be updated according to the time-sensitivity, update pattern, and access pattern of the original page. This problem is similar to the update prediction problem that we study in this paper. In this problem, the “server” is the main location of the web page and the “clients” are the replicated locations. The goal is to minimize the traffic of frequently updated pages and the penalty of providing out-dated data.

Note that the TTL prediction mechanism is typically exercised together with a cache replacement policy such as LRU (least recently used) or LFU (least frequently used) [3] in a proxy cache for WWW accesses. Since the storage of a handheld device is limited, the wireless application may determine that no cache replacement algorithm is exercised for frequently accessed data (otherwise, they are likely to be replaced by infrequently accessed data). That is, when a wireless handheld device runs a particular application, some data used by this application are considered “frequently accessed” and will always be kept in the handheld device until they expire. This is especially true for some location-dependent services provisioned by the mobile operators. The customer may also mark a data entry as “frequently accessed”, and the handheld device will not exercise cache replacement for this data entry until the frequently accessed indication is disabled. In Squid [15], the TTL option can be specified by users, so that users can control caching for certain applications.

In this paper, we study three TTL prediction schemes for wireless data access. We also propose analytical and simulation models for analyzing the performance of the TTL prediction mechanisms. Our models are flexible enough to accommodate any fudge factor value used to generate the TTL interval. Since GPRS traffic reported by mobile operators indicates that the traditional web access patterns do not apply to GPRS-based wireless data access, our models consider general distributions for data update and access. These distributions can be used to approximate data obtained from GPRS field operations or trials. Based on our models, we show how the inter-update time distribution affects the accuracy of TTL interval prediction.

2. TTL prediction schemes

In this section, we describe three schemes to determine the proper TTL interval when the handheld device queries the server. The schemes, in the order presented, require an increasing amount of record-keeping of the history of the server inter-update intervals.

TTL scheme #1. This scheme is based on the implementation of the Apache and the Squid systems. When the handheld device queries the server, the server data entry either has been modified or remains the same as the cached one. In the former case, it is assumed that the server returns the updated data entry with a timestamp indicating when the data entry was last modified. In the latter case, the server returns a positive acknowledgment of the cache validity. Thus, the handheld device always knows the time of the last server update. Let $T_b$ be the difference between the time of the query and the time of the last server update. Then, in this scheme, the TTL interval is given by

$$T_{TTL1} = c_f T_b, \tag{1}$$

where $c_f$ is a system-defined fudge factor.

TTL scheme #2. In this scheme, it is assumed that the server remembers the length of the previous inter-update interval, denoted by $T_p$. When the handheld device queries the server, the server sends back the value of $T_p$ as part of the reply. Then, the TTL interval for the current query is given by

$$T_{TTL2} = c_f T_p. \tag{2}$$

TTL scheme #3. As proposed in [14], a running average of the inter-update intervals can be obtained by a handheld device. If the server maintains this average, it can be sent to the handheld device during its query to the server. The exact method of obtaining this average, including the design of the windowing duration and the averaging weights, is outside the scope of this paper. Let $T_e$ denote the average inter-update interval. Then, in this scheme, the TTL interval for the current query is given by

$$T_{TTL3} = c_f T_e. \tag{3}$$

All of these schemes employ an intuitive form: a fudge factor multiplying a time duration indicative of how often the data entry is updated at the server. For example, in scheme TTL1, if the data entry is updated infrequently, the backward residual time of server updating, $T_b$, is likely to be large, while if it is updated frequently, $T_b$ is likely to be small. Therefore, we can use $T_b$ to estimate, albeit coarsely, the time of the next server update. The rationale behind using $T_p$ and $T_e$ in TTL2 and TTL3 is similar. Furthermore, assuming that the server updating process is stationary, $T_e$ is clearly the best real-valued estimate of the inter-update time.
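The three rules (1)–(3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the plain arithmetic mean used for $T_e$ is one possible choice, since the averaging method is left open in the text.

```python
def ttl_scheme_1(c_f, query_time, last_update_time):
    """Equation (1): c_f times T_b, the age of the entry at query time."""
    t_b = query_time - last_update_time
    return c_f * t_b


def ttl_scheme_2(c_f, previous_interval):
    """Equation (2): c_f times T_p, the previous inter-update interval."""
    return c_f * previous_interval


def ttl_scheme_3(c_f, update_times):
    """Equation (3): c_f times T_e, an average inter-update interval.

    Assumption: T_e is the unweighted mean of all observed intervals;
    the paper does not fix the window or the weights.
    """
    intervals = [b - a for a, b in zip(update_times, update_times[1:])]
    t_e = sum(intervals) / len(intervals)
    return c_f * t_e
```

For instance, with $c_f = 0.5$, a query at time 10 for an entry last updated at time 6 gives a TTL of 2 time units under scheme #1.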

One main advantage of using TTL1 is that, on the server side, it requires no extra equipment or processing overhead beyond what is already implemented in many of the currently deployed systems. For example, the time of the last update is built into the Hypertext Transfer Protocol (HTTP). However, as shown in section 4, this TTL prediction may not be as accurate as the other two schemes.

The schemes TTL2 and TTL3, on the other hand, require the server to remember its past updates and share that information with the handheld device. In particular, a server supporting TTL3 may need to maintain a record of its long-term updating history. However, since $T_e$ gives a more consistent estimate of the server's next updating time, we expect TTL3 to outperform TTL1 and TTL2 in terms of data access cost.


In what follows, we will study the performance of these schemes and the effect of the server inter-update interval distribution on the cost of wireless data access.

3. Assumptions and output measures

In this section, we describe the assumptions used in this paper and the performance measures used to evaluate the TTL interval prediction schemes. Consider the time interval between two consecutive queries from the wireless handheld device to the server. This interval is referred to as a cycle. In figure 1, $[\tau_2, \tau_0)$ is a cycle. The access at the beginning of a cycle (e.g., $\tau_2$ in figure 1) results in a query to the server. During $(\tau_2, \tau_0)$, the handheld device returns the cached copy to all local accesses by applications to the data entry. We assume that accesses to a data entry form a point process with a general distribution. Furthermore, the inter-update intervals are assumed to be random variables with a general distribution. Based on these assumptions, we consider the following primary performance measures:

• The expected number, $E[K_1]$, of non-stale accesses in a cycle. For a non-stale access, when the access occurs, the data entry in the cache is the same as that in the server. Note that the non-stale accesses include the one that results in the query to the server at the beginning of a cycle.

• The expected number, $E[K]$, of accesses in a cycle. This number includes the stale and the non-stale accesses in the cache, plus the access resulting in a query from the handheld device to the server (for the cycle $[\tau_2, \tau_0)$ in figure 1, this query occurs at $\tau_2$). Thus, $K \ge 1$ always holds.

• The probability $\beta$ that, when the handheld device queries the server, the data entry is valid (i.e., the data entry has not been modified since the last query).

It is clear that the handheld device communicates with the server once every $E[K]$ accesses on average. Based on $E[K_1]$ and $E[K]$, we can investigate the accuracy of TTL interval prediction through the staleness ratio $p_s$, which is the probability that the handheld device returns a stale data entry for an access. That is,

$$p_s = \frac{E[K] - E[K_1]}{E[K]}. \tag{4}$$

Figure 1. The timing diagram.

Thus, we can define the cost due to data staleness as

$$C_{stale} = \gamma p_s, \tag{5}$$

where $\gamma$ represents the penalty of returning a stale data entry to the application.

Another performance measure considered in this paper is the server query cost or wireless transmission cost Cquery. Suppose that the cost of transmitting a data entry is one unit. We further denote δ as the cost for the handheld device to query the server without the data entry being transmitted. It is clear that 0 < δ < 1.

When the handheld device queries the server and the data entry is valid, the server returns a positive acknowledgment at cost $\delta$. If the data has been modified, the server returns the updated data entry to the handheld device and the transmission cost is one unit. Note that, on average, one query to the server occurs for every $E[K]$ accesses. That is, the transmission costs for the other $E[K] - 1$ accesses in the cycle are 0. Thus, if we normalize the cost (e.g., wireless transmission delay) of a query with data transmission to one unit, then the server query cost per access can be expressed as

$$C_{query} = \frac{\delta\beta + (1 - \beta)}{E[K]} = \frac{1 - (1 - \delta)\beta}{E[K]}. \tag{6}$$

Then, the total cost per access is

$$C_{access} = C_{stale} + C_{query} = \gamma\,\frac{E[K] - E[K_1]}{E[K]} + \frac{1 - (1 - \delta)\beta}{E[K]}. \tag{7}$$
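As a small illustration of how equations (4)–(7) fit together, the following Python sketch evaluates the cost per access from the three performance measures. The function and argument names are hypothetical; `mean_k` and `mean_k1` stand for $E[K]$ and $E[K_1]$.

```python
def staleness_ratio(mean_k, mean_k1):
    """Equation (4): fraction of accesses that return stale data."""
    return (mean_k - mean_k1) / mean_k


def query_cost(mean_k, beta, delta):
    """Equation (6): expected transmission cost per access."""
    return (1.0 - (1.0 - delta) * beta) / mean_k


def access_cost(mean_k, mean_k1, beta, delta, gamma):
    """Equation (7): staleness cost (5) plus query cost (6)."""
    return gamma * staleness_ratio(mean_k, mean_k1) + query_cost(mean_k, beta, delta)
```

For example, with $E[K] = 4$, $E[K_1] = 3$, $\beta = 0.5$, $\delta = 0.2$ and $\gamma = 2$, the staleness ratio is 0.25, the query cost is 0.15, and the total cost per access is 0.65.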

Obviously, shorter TTL intervals lead to a smaller stale data probability $p_s$, and hence a lower $C_{stale}$. However, shorter TTL intervals also create more queries to the server, which may lead to a higher $C_{query}$. Ideally, if one had exact information about the future server update times, the TTL should be set to expire at the next server update instant. In practice, however, one can only predict the next server update instant based on the known statistics and then set the TTL accordingly. In the previous section, we presented three TTL prediction schemes. Next, we study the effect of TTL selection on the cost of wireless data access.

4. Performance study

Analytical modeling of the three TTL schemes is possible under certain simplifying assumptions. For example, for the third TTL prediction scheme, we are able to completely characterize the performance measures. For the first and the second TTL prediction schemes, we are also able to provide some approximate analytical results. All analytical results and their derivations are presented in the appendices. In what follows, we evaluate the performance of the three TTL prediction schemes and the effects of the inter-update probability distribution via both simulations and the analytical results.


4.1. Simulation setup

Simulations are carried out in Matlab to evaluate the TTL prediction schemes for wireless data access. The following three example distributions of the server inter-update intervals are studied:

• exponential,
• Rayleigh, and
• deterministic.

These three distributions represent a gradient of increasing memory level, from memoryless, in the case of the exponential distribution, to full knowledge of the future, in the deterministic case.
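For readers who wish to reproduce such inputs, the three inter-update distributions can be sampled with a common mean as sketched below. This is an illustrative Python sketch rather than the Matlab code used in the study; the function names are hypothetical, and the Rayleigh scale is chosen from the standard relation that a Rayleigh variable with scale $\sigma$ has mean $\sigma\sqrt{\pi/2}$.

```python
import math
import random


def exponential_interval(rng, mean=1.0):
    """Memoryless inter-update interval with the given mean."""
    return rng.expovariate(1.0 / mean)


def rayleigh_interval(rng, mean=1.0):
    """Rayleigh inter-update interval with the given mean.

    Uses inverse-transform sampling: sigma * sqrt(-2 ln U), U uniform on (0, 1].
    """
    sigma = mean / math.sqrt(math.pi / 2.0)
    return sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def deterministic_interval(rng, mean=1.0):
    """Constant inter-update interval (rng is unused, kept for a uniform API)."""
    return mean
```

A generator created with `random.Random(seed)` can be passed as `rng` so that runs are repeatable.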

The arrival process of accesses to a data entry is assumed to have a general distribution. Previous studies suggest that the arrivals of Internet dial-up access connections can be described by a Poisson model [10]. However, published results based on wireline Web access traces indicate that user requests to a document on the Web do not follow a pure Poisson process [12]. Currently there is no access trace available for wireless data. Furthermore, wireless data access is affected by mobility and will, most probably, not follow the access patterns observed in wireline networks. In the following, we first consider the Poisson access stream for mean value analysis and then summarize our simulation results with general access patterns.

In each simulation, the mean of the inter-update intervals is set to one time unit, the data accesses are assumed to occur five times as frequently as the server updates, and 10000 server updates are simulated. For each inter-update distribution, the three TTL schemes are studied separately. For each scheme, the fudge factor, $c_f$, is allowed to vary from $10^{-3}$ to $10^{3}$. The optimal $c_f$ is obtained through observations, and the corresponding optimal cost per data access is recorded. In figures 2–4, we plot the optimal costs over $\delta$ and $\gamma$, where $\delta$ ranges between 0.1 and 1, and $\gamma$ ranges between 1 and 10. In addition, our experiments have shown that, when $\delta$ is below 0.1, since the query cost is very low when the cache is valid, the cost per data access can be trivially minimized by querying the server at almost every data access. Furthermore, when $\gamma$ is below 1, since the penalty against stale data access is very low, the cost per data access can be trivially minimized by seldom querying the server at all. Finally, when $\gamma$ is above 10, since the penalty against stale data access is very high, the best scheme is, again, to query the server at almost every data access. All of these extreme cases are independent of the TTL scheme and are therefore not of interest in this study.
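The setup above can be illustrated with a small Python sketch. It is not the Matlab code used for the reported results: it covers only TTL scheme #3 with exponential inter-update intervals and Poisson accesses, it assumes that $T_e$ equals the true mean $1/\mu$, and the names `simulate_ttl3` and `best_fudge_factor` are hypothetical.

```python
import random


def simulate_ttl3(c_f, lam=5.0, mu=1.0, n_updates=10_000, seed=0):
    """Simulate scheme #3; return (p_s, beta, mean accesses per cycle)."""
    rng = random.Random(seed)
    # Server update instants: cumulative sums of exponential intervals.
    updates, t = [], 0.0
    for _ in range(n_updates):
        t += rng.expovariate(mu)
        updates.append(t)
    horizon = updates[-1]
    ttl = c_f / mu            # equation (3) with T_e = 1/mu (assumption)

    accesses = stale = queries = valid = 0
    version = 0               # number of server updates that have occurred
    cached = -1               # version held in the cache (-1: cache empty)
    expiry = 0.0              # instant at which the current TTL runs out
    t = rng.expovariate(lam)  # first access of the Poisson stream
    while t < horizon:
        while version < n_updates and updates[version] <= t:
            version += 1
        accesses += 1
        if t >= expiry:            # TTL expired: this access queries the server
            queries += 1
            if cached == version:  # no update since the previous query
                valid += 1
            cached = version
            expiry = t + ttl
        elif cached != version:    # served from the cache, which is out of date
            stale += 1
        t += rng.expovariate(lam)
    return stale / accesses, valid / queries, accesses / queries


def best_fudge_factor(delta, gamma, grid=None):
    """Coarse search for the c_f minimising the cost per access, equation (7)."""
    if grid is None:
        grid = [10.0 ** (k / 4.0) for k in range(-12, 13)]  # 10^-3 .. 10^3
    best = None
    for c_f in grid:
        p_s, beta, mean_k = simulate_ttl3(c_f, n_updates=2_000)
        cost = gamma * p_s + (1.0 - (1.0 - delta) * beta) / mean_k
        if best is None or cost < best[1]:
            best = (c_f, cost)
    return best
```

Calling `best_fudge_factor(delta, gamma)` over a range of $\delta$ and $\gamma$ values gives curves of the kind plotted in figures 2–4, although the grid here is much coarser than a careful study would use.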

4.2. Comparison of TTL schemes

As shown in figures 2–4, for all server inter-update distributions, TTL3 outperforms TTL1 by a substantial amount if $\delta$ is large. This is reasonable, considering that TTL3 requires knowledge of the average of the inter-update intervals taken over a relatively long period.

Figure 2. Exponential updating, optimal cost per access vs. (a) $\gamma$; (b) $\delta$.

For servers with the exponential inter-update distribution, which has the memoryless property, TTL1 slightly outperforms TTL2, especially when $\delta$ is large and when $\gamma$ is small. However, for most other cases, TTL2 outperforms TTL1. In particular, for servers with the deterministic inter-update distribution, TTL2 is equivalent to TTL3. Thus, we can infer that the relative performance of TTL1, as used in Apache and Squid, suffers as the memory level of the inter-update interval distribution increases.

4.3. Cost sensitivity to δ and γ

Figures 2–4 also suggest that, for all inter-update distributions and all three TTL schemes, the data access cost is very sensitive to the non-update query cost $\delta$. Adjusting the labeling of the plot axes reveals that the cost increases almost linearly with $\delta$. Therefore, it is important for a system designer to ensure that $\delta$ is kept small for cost-effective data caching.

Figure 3. Rayleigh updating, optimal cost per access vs. (a) $\gamma$; (b) $\delta$.

On the other hand, the data access cost is less sensitive to the stale data penalty $\gamma$. This is due to the optimal adjustment of the fudge factor $c_f$, such that queries are performed more frequently when $\gamma$ is large. In systems where $\delta$ is small, one can afford to query the server more often, as the average cost per query is reduced in this case.

In fact, when $\delta$ is sufficiently small, all the TTL schemes perform similarly, regardless of the inter-update distribution. As shown in these figures, when $\delta < 0.3$, the access costs of the three schemes are within 15% of one another. This can simplify the numerical evaluation of some TTL schemes. For example, although TTL1 is the scheme most often used in practical applications due to its minimal requirements on the server, it is generally hard to analyze the performance of TTL1 precisely. However, the performance of TTL3 can be accurately analyzed numerically. Therefore, given a cached data access system that employs TTL1, one can first obtain an accurate cost estimate of TTL3 and then apply that result as a close approximation of the actual cost of TTL1.

Figure 4. Deterministic updating, optimal cost per access vs. (a) $\gamma$; (b) $\delta$.

Towards this end, appendix A provides an analytical framework for evaluating the cost of TTL3. In addition, appendix B gives two more methods that approximately compute the cost of TTL1 under a set of assumed conditions.

4.4. Non-Poisson data accesses

The above simulations were repeated assuming data access streams with Rayleigh and with deterministic inter-arrival intervals. In both cases, we observed the same patterns of the cost comparison among the different TTL prediction schemes and of the cost sensitivity to $\delta$ and $\gamma$. For brevity, the redundant simulation results are not presented here.

5. Conclusions

In this paper, we study three prediction schemes for setting the time-to-live (TTL) of data entries in wireless data access. Because measurements of the inter-update time in actual wireless data access are not available, we use a general distribution model for the inter-update time and carry out the performance analysis. The effects of the inter-update time distribution on the performance of wireless data access under the three TTL prediction schemes are investigated via simulations as well as through analytical modeling. The study shows that the third TTL prediction scheme outperforms the TTL prediction scheme currently used in Apache and Squid when the query cost from the mobile device is high, while most schemes perform similarly when the query cost is low. We expect our results to be useful for wireless data access systems in which mobile transmissions are more costly.

Appendix A. Numerical analysis for TTL scheme #3

In this section, we present an analytical framework for evaluating the performance of TTL scheme #3, where the TTL interval depends on the mean duration of the previous inter-update intervals.

Assume that the data accesses form a Poisson stream with rate $\lambda$. Let the random variable $Y$ represent the inter-access time; then $Y$ has an exponential density function $f(y)$, where

$$f(y) = \lambda e^{-\lambda y} \quad\text{and}\quad E[Y] = \frac{1}{\lambda}. \tag{A.1}$$

Let the random variable $Z$ represent the inter-update time, which has a general cumulative distribution function $F(z)$, density function $f(z)$, Laplace transform $f^*(s)$, and mean $E[Z] = 1/\mu$. Suppose that $Z$ is a non-lattice random variable and $E[Z^2] < \infty$. Since the queries are independent of the server updates, the residual life $X$ of $Z$ has the cumulative distribution function $R(x)$, density function $r(x)$, and Laplace transform $r^*(s)$, where from [13]

$$r(x) = \mu\,[1 - F(x)], \tag{A.2}$$

$$r^*(s) = \frac{\mu}{s}\,[1 - f^*(s)]. \tag{A.3}$$

Suppose that $T_{TTL}$ has probability density function $r_f(t_f)$, cumulative distribution function $R_f(t_f)$, and Laplace transform $r_f^*(s)$. Recall from section 2 that, in this scheme, the TTL interval is

$$T_{TTL} = \frac{c_f}{\mu}, \tag{A.4}$$

where $c_f$ is the fudge factor. Then,

$$r_f(t_f) = \delta\!\left(t_f - \frac{c_f}{\mu}\right) \quad\text{and}\quad r_f^*(s) = e^{-(c_f/\mu)s}, \tag{A.5}$$

where $\delta(\cdot)$ here denotes the Dirac delta function.

Consider the interval $t_1$ between the instant when the TTL interval expires and the arrival of the next data access. Let $f_1(t_1)$ and $f_1^*(s)$ denote the probability density function and the Laplace transform of the random variable $t_1$, respectively. From the memoryless property of the exponential distribution, $t_1$ has the same distribution as $Y$; thus we have

$$f_1(t_1) = \lambda e^{-\lambda t_1} \quad\text{and}\quad f_1^*(s) = \frac{\lambda}{s + \lambda}. \tag{A.6}$$

Let $\xi = T_{TTL} + t_1$, and let $f_\xi(\tau)$ and $f_\xi^*(s)$ denote its probability density function and its Laplace transform. Then from (A.5)

$$f_\xi(\tau) = f_1\!\left(\tau - \frac{c_f}{\mu}\right) \quad\text{and}\quad f_\xi^*(s) = f_1^*(s)\,e^{-(c_f/\mu)s}. \tag{A.7}$$

The probability $\beta$ is derived as follows:

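$$\begin{aligned}
\beta &= \Pr(T_{TTL} + t_1 \le X) = \Pr(\xi \le X) = \int_{x=0}^{\infty} \Pr(\xi \le x)\, r(x)\,dx \\
&= \int_{x=0}^{\infty} \left[\frac{1}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{f_\xi^*(s)}{s}\, e^{sx}\,ds\right] r(x)\,dx \\
&= \frac{1}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{f_\xi^*(s)}{s} \left[\int_{x=0}^{\infty} r(x)\, e^{sx}\,dx\right] ds \\
&= \frac{1}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{f_\xi^*(s)}{s}\, r^*(-s)\,ds \\
&= \frac{1}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{f_1^*(s)\,e^{-(c_f/\mu)s}}{s}\, r^*(-s)\,ds \\
&= \frac{\mu}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{f_1^*(s)\,e^{-(c_f/\mu)s}}{-s^2}\,[1 - f^*(-s)]\,ds \\
&= \mu \sum_{p \in \sigma_f} \operatorname*{Res}_{s=p}\, \frac{f_1^*(s)\,e^{-(c_f/\mu)s}}{s^2}\,[1 - f^*(-s)],
\end{aligned} \tag{A.8}$$

where $\sigma_f$ denotes the set of poles of $f^*(-s)$ and $\operatorname{Res}_{s=p}$ denotes the residue at the pole $s = p$.

If $Z$, and hence $X$, is exponentially distributed, we have $f^*(s) = \mu/(s + \mu)$. Then, we have

$$\beta = \mu \operatorname*{Res}_{s=\mu}\, \frac{f_1^*(s)\,e^{-(c_f/\mu)s}}{s^2}\left[1 - \frac{\mu}{-s + \mu}\right] = \mu \left[\frac{f_1^*(s)\,e^{-(c_f/\mu)s}}{s}\right]_{s=\mu} = \frac{\lambda e^{-c_f}}{\mu + \lambda}.$$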
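As a quick sanity check on the exponential case (a sketch that bypasses the residue calculation), the same value follows directly from the independence of $t_1$ and $X$ and from the exponential tail $\Pr(X \ge x) = e^{-\mu x}$ of the residual life:

$$\beta = \Pr\!\left(X \ge \frac{c_f}{\mu} + t_1\right) = E\!\left[e^{-\mu\,(c_f/\mu + t_1)}\right] = e^{-c_f}\, f_1^*(\mu) = \frac{\lambda e^{-c_f}}{\mu + \lambda}.$$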

Now we derive $E[K_1]$ as follows. Consider figure 1. After $\tau_0$, if the TTL interval expires earlier than the next update, then all data accesses occurring in $[\tau_0, \tau_0 + T_{TTL})$ are non-stale. On the other hand, if the TTL interval expires after the next update, then non-stale accesses occur in the period $[\tau_0, \tau_0 + x]$. Thus, we conclude that during a cycle, the non-stale accesses occur in a period of length $T_{min} = \min(T_{TTL}, X)$. Since the accesses are a Poisson stream, from [13],

$$E[K_1] = 1 + \frac{E[T_{min}]}{E[Y]}. \tag{A.9}$$

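The probability density function of $T_{min}$ is

$$\begin{aligned}
r_{min}(t_m) &= -\frac{d}{dt_m} \Pr\big(\min\{T_{TTL}, X\} > t_m\big) \\
&= -\frac{d}{dt_m} \big[\Pr(T_{TTL} > t_m)\,\Pr(X > t_m)\big] \\
&= \int_{x=t_m}^{\infty} r_f(t_m)\, r(x)\,dx + \int_{t_f=t_m}^{\infty} r_f(t_f)\, r(t_m)\,dt_f.
\end{aligned}$$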

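The Laplace transform of $T_{min}$ is

$$\begin{aligned}
r_{min}^*(s) &= \int_0^{\infty} r_f(t) \left[\int_t^{\infty} r(\tau)\,d\tau\right] e^{-st}\,dt + \int_0^{\infty} r(t) \left[\int_t^{\infty} r_f(\tau)\,d\tau\right] e^{-st}\,dt \\
&= \int_0^{\infty} r_f(t) \left[\frac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r^*(z)}{z}\, e^{zt}\,dz\right] e^{-st}\,dt \\
&\quad + \int_0^{\infty} r(t) \left[\frac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r_f^*(z)}{z}\, e^{zt}\,dz\right] e^{-st}\,dt \\
&= \frac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r^*(z)}{z} \left[\int_0^{\infty} r_f(t)\, e^{-(s-z)t}\,dt\right] dz \\
&\quad + \frac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r_f^*(z)}{z} \left[\int_0^{\infty} r(t)\, e^{-(s-z)t}\,dt\right] dz \\
&= \frac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r^*(z)}{z}\, r_f^*(s - z)\,dz + \frac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r_f^*(z)}{z}\, r^*(s - z)\,dz,
\end{aligned} \tag{A.10}$$

where $\sigma$ is a sufficiently small positive number.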

Applying the residue theorem, we obtain the expected value $E[T_{min}]$:

$$\begin{aligned}
E[T_{min}] &= -r_{min}^{*(1)}(0) \\
&= -\frac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r^*(z)}{z}\, r_f^{*(1)}(-z)\,dz - \frac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r_f^*(z)}{z}\, r^{*(1)}(-z)\,dz \\
&= \sum_{p \in \sigma_{rf}} \operatorname*{Res}_{s=p}\, \frac{1 - r^*(s)}{s}\, r_f^{*(1)}(-s) + \sum_{p \in \sigma_r} \operatorname*{Res}_{s=p}\, \frac{1 - r_f^*(s)}{s}\, r^{*(1)}(-s),
\end{aligned} \tag{A.11}$$

where $\sigma_r$ is the set of poles of $r^*(-s)$ and $\sigma_{rf}$ is the set of poles of $r_f^*(-s)$ in the strict right-half complex plane, and where we have also used the fact that the derivative $g^{(1)}(s)$ of a function $g(s)$ has the same set of poles as $g(s)$, apart from their multiplicities. We can express $E[T_{min}]$ in terms of the function $f^*(s)$ via equations (A.2)–(A.5); however, we omit such expressions here due to their complexity.

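If the inter-update time $Z$ is exponential, then $f^*(s) = r^*(s) = \mu/(s + \mu)$ and $r^{*(1)}(-s) = -\mu/(s - \mu)^2$. From equation (A.11), we obtain

$$\begin{aligned}
E[T_{min}] &= -\frac{c_f}{\mu} \sum_{p \in \sigma_{rf}} \operatorname*{Res}_{s=p}\, \frac{1}{s + \mu}\, e^{(c_f/\mu)s} + \sum_{p \in \sigma_r} \operatorname*{Res}_{s=p}\, \frac{1 - e^{-(c_f/\mu)s}}{s}\, r^{*(1)}(-s) \\
&= \frac{c_f}{\mu} \operatorname*{Res}_{s=-\mu}\, \frac{1}{s + \mu}\, e^{(c_f/\mu)s} - \operatorname*{Res}_{s=\mu}\, \frac{1 - e^{-(c_f/\mu)s}}{s}\, \frac{\mu}{(s - \mu)^2} \\
&= \frac{c_f}{\mu} \left[e^{(c_f/\mu)s}\right]_{s=-\mu} - \mu \left[\frac{d}{ds}\, \frac{1 - e^{-(c_f/\mu)s}}{s}\right]_{s=\mu} \\
&= \frac{c_f}{\mu}\, e^{-c_f} + \frac{1 - (1 + c_f)\,e^{-c_f}}{\mu} \\
&= \frac{1 - e^{-c_f}}{\mu},
\end{aligned} \tag{A.12}$$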

which is consistent with the direct computation.
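The direct computation referred to here can be sketched as follows, using the tail-integral formula for the mean of a non-negative random variable, the constant TTL interval $c_f/\mu$ from (A.4), and the exponential residual life with $\Pr(X > t) = e^{-\mu t}$:

$$E[T_{min}] = E\!\left[\min\!\left(\frac{c_f}{\mu}, X\right)\right] = \int_0^{c_f/\mu} \Pr(X > t)\,dt = \int_0^{c_f/\mu} e^{-\mu t}\,dt = \frac{1 - e^{-c_f}}{\mu}.$$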

The cost per wireless data access, in the case of TTL prediction scheme #3, is computed based on the preceding analytical framework and compared with the simulation results. Figure 5 illustrates this comparison in scenarios with the same parameters as those in figure 2. These plots validate the simulation model against the analytical approach.
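For the exponential case, the analytical cost can be evaluated in closed form, as in the following Python sketch. It combines the exponential-case value of $\beta$, equation (A.9) with (A.12), and the cost expression (7). One ingredient is an assumption not stated in this appendix: with a constant TTL interval $c_f/\mu$ and Poisson accesses of rate $\lambda$, the mean number of accesses per cycle is taken as $E[K] = 1 + \lambda c_f/\mu$ (the querying access plus the accesses arriving during the TTL interval). The function name is hypothetical.

```python
import math


def ttl3_exponential_cost(c_f, lam, mu, delta, gamma):
    """Cost per access for scheme #3, exponential updates, Poisson accesses."""
    beta = lam * math.exp(-c_f) / (mu + lam)            # exponential case of (A.8)
    mean_k1 = 1.0 + lam * (1.0 - math.exp(-c_f)) / mu   # (A.9) with (A.12)
    mean_k = 1.0 + lam * c_f / mu                       # assumption, see lead-in
    p_s = (mean_k - mean_k1) / mean_k                   # equation (4)
    return gamma * p_s + (1.0 - (1.0 - delta) * beta) / mean_k  # equation (7)
```

Evaluating this function over a grid of $c_f$ values and taking the minimum gives an analytical counterpart to the simulated optimal cost.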

Appendix B. Approximating the cost of TTL schemes #1 and #2

This section describes an analysis framework that provides the approximate cost of wireless data access using TTL schemes #1 and #2. Since the analysis of these two schemes is very similar, in what follows, we concentrate on the analysis of TTL scheme #1. For scheme #2, we only provide brief pointers where the analysis differs.

In this approximation, we have made the following two assumptions.

Assumption #1. The handheld queries to the server are sufficiently independent of the server updates, such that the queries can be considered random observers of the inter-update interval.

Assumption #2. At the time of every query, the TTL interval is independent of the forward residual life of the current server inter-update interval.

These assumptions are reasonable in systems with nearly exponential server inter-update intervals and relatively infrequent queries.

Consider the timing diagram in figure 1. Assume that the data accesses are a Poisson stream with rate $\lambda$. Let the random variable $Y$ represent the inter-access time. Then $Y$ has an exponential density function $f(y)$, where

$$f(y) = \lambda e^{-\lambda y} \quad\text{and}\quad E[Y] = \frac{1}{\lambda}. \tag{B.1}$$

Let the random variable $Z$ represent the inter-update time, which has a general cumulative distribution function $F(z)$, density function $f(z)$, Laplace transform $f^*(s)$, and mean $E[Z] = 1/\mu$. Suppose that $Z$ is a non-lattice random variable and $E[Z^2] < \infty$; then the residual life $X$ of $Z$ has the cumulative distribution function $R(x)$, density function $r(x)$, and Laplace transform $r^*(s)$, where from [13]

$$r(x) = \mu\,[1 - F(x)], \tag{B.2}$$

$$r^*(s) = \frac{\mu}{s}\,[1 - f^*(s)]. \tag{B.3}$$

Let the random variable $T$ be the interval between the previous update and the instant when the handheld device queries the server. In figure 1, $T = t = \tau_0 - \tau_1$. Since the queries to the server are random observers of the inter-update interval $Z$, $T$ is the reverse residual life of $Z$. From the reversibility property of residual life [13], $T$ has the same distribution as $X$ (the residual life of $Z$). In Apache [1], the TTL interval is computed as $T_{TTL} = c_f T$, where $c_f$ is the fudge factor. In figure 1, the TTL interval is $t_f = c_f t$. Suppose that $T_{TTL}$ has probability density function $r_f(t_f)$, cumulative distribution function $R_f(t_f)$, and Laplace transform $r_f^*(s)$. As previously discussed, $T$ has the density function $r(t)$ and Laplace transform $r^*(s)$, and$^1$

$$r_f(t_f) = \frac{1}{c_f}\, r\!\left(\frac{t_f}{c_f}\right) \quad\text{and}\quad r_f^*(s) = r^*(c_f s). \tag{B.4}$$

Consider the interval $t_1$ between the instant when the TTL interval expires and the arrival of the next data access. Let $f_1(t_1)$ and $f_1^*(s)$ denote the probability density function and the Laplace transform of the random variable $t_1$, respectively. From the memoryless property of the exponential distribution, $t_1$ has the same distribution as $Y$, and thus we have

$$f_1(t_1) = \lambda e^{-\lambda t_1} \quad\text{and}\quad f_1^*(s) = \frac{\lambda}{s + \lambda}. \tag{B.5}$$

Let $\xi = T_{TTL} + t_1$ (in figure 1, $T_{TTL} = t_f = c_f t$), and let $f_\xi(\tau)$ and $f_\xi^*(s)$ denote its probability density function and its Laplace transform. Then from (B.4) and (B.5)

$$f_\xi(\tau) = \int_{t_f=0}^{\tau} r_f(t_f)\, f_1(\tau - t_f)\,dt_f \quad\text{and}\quad f_\xi^*(s) = r_f^*(s)\, f_1^*(s). \tag{B.6}$$

The probability β is derived as follows:

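$$\begin{aligned}
\beta &= \Pr(T_{TTL} + t_1 \le X) = \Pr(\xi \le X) = \int_{x=0}^{\infty} \Pr(\xi \le x)\, r(x)\,dx \\
&= \int_{x=0}^{\infty} \left[\frac{1}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{f_\xi^*(s)}{s}\, e^{sx}\,ds\right] r(x)\,dx \\
&= \frac{1}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{f_\xi^*(s)}{s} \left[\int_{x=0}^{\infty} r(x)\, e^{sx}\,dx\right] ds \\
&= \frac{1}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{f_\xi^*(s)}{s}\, r^*(-s)\,ds \\
&= \frac{1}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{r_f^*(s)\, f_1^*(s)}{s}\, r^*(-s)\,ds \\
&= \frac{\mu^2/c_f}{2\pi i}\int_{s=c-i\infty}^{c+i\infty} \frac{[1 - f^*(c_f s)]\, f_1^*(s)}{-s^3}\,[1 - f^*(-s)]\,ds \\
&= \frac{\mu^2}{c_f} \sum_{p \in \sigma_f} \operatorname*{Res}_{s=p}\, \frac{[1 - f^*(c_f s)]\, f_1^*(s)}{s^3}\,[1 - f^*(-s)],
\end{aligned} \tag{B.7}$$

where $\sigma_f$ denotes the set of poles of $f^*(-s)$ and $\operatorname{Res}_{s=p}$ denotes the residue at the pole $s = p$. If $Z$, and hence $X$, is exponentially distributed, then we have $f^*(s) = \mu/(s + \mu)$, and hence

$$\beta = \frac{\mu^2}{c_f} \operatorname*{Res}_{s=\mu}\, \frac{[1 - f^*(c_f s)]\, f_1^*(s)}{s^3}\left[1 - \frac{\mu}{-s + \mu}\right] = \frac{\mu^2}{c_f} \left[\frac{[1 - f^*(c_f s)]\, f_1^*(s)}{s^2}\right]_{s=\mu} = \left(\frac{1}{1 + c_f}\right)\left(\frac{\lambda}{\mu + \lambda}\right).$$

$^1$ For TTL scheme #2, we have $r_f(t_f) = f(t_f)$ and $r_f^*(s) = f^*(s)$. The rest follows exactly as the analysis of TTL scheme #1, with the above replacing (B.4).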
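As a sanity check on the exponential-case value of $\beta$ for scheme #1 (a sketch that relies on Assumptions #1 and #2, so that $T$, $t_1$ and $X$ are treated as mutually independent), the result can also be obtained without residues. With $T$ and $X$ both exponential with rate $\mu$:

$$\beta = \Pr(c_f T + t_1 \le X) = E\!\left[e^{-\mu c_f T}\right] E\!\left[e^{-\mu t_1}\right] = r^*(c_f \mu)\, f_1^*(\mu) = \frac{\mu}{c_f \mu + \mu} \cdot \frac{\lambda}{\mu + \lambda} = \left(\frac{1}{1 + c_f}\right)\left(\frac{\lambda}{\mu + \lambda}\right).$$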

Now we derive $E[K_1]$ as follows. Consider figure 1. After $\tau_0$, if the TTL interval expires earlier than the next update, then all data accesses occurring in $[\tau_0, \tau_0 + c_f t)$ are non-stale. On the other hand, if the TTL interval expires after the next update, then non-stale accesses occur in the period $[\tau_0, \tau_0 + x]$. Thus, we conclude that during a cycle, the non-stale accesses occur in a period of length $T_{min} = \min(T_{TTL}, X)$. Since the accesses are a Poisson stream, from [13],

$$E[K_1] = 1 + \frac{E[T_{min}]}{E[Y]}. \tag{B.8}$$

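The probability density function of $T_{\min}$ is

$$\begin{aligned}
r_{\min}(t_m) &= -\frac{d}{dt_m} \Pr\bigl(\min\{T_{TTL}, X\} > t_m\bigr) = -\frac{d}{dt_m} \bigl[ \Pr(T_{TTL} > t_m) \Pr(X > t_m) \bigr] \\
&= \int_{t_f=t_m}^{\infty} r_f(t_f)\, r(t_m)\, dt_f + \int_{x=t_m}^{\infty} r_f(t_m)\, r(x)\, dx.
\end{aligned}$$

The Laplace transform of $T_{\min}$ is

$$\begin{aligned}
r_{\min}^*(s) &= \int_0^{\infty} r_f(t) \left[ \int_t^{\infty} r(\tau)\, d\tau \right] e^{-st}\, dt + \int_0^{\infty} r(t) \left[ \int_t^{\infty} r_f(\tau)\, d\tau \right] e^{-st}\, dt \\
&= \int_0^{\infty} r_f(t) \left[ \frac{1}{2\pi i} \int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r^*(z)}{z}\, e^{zt}\, dz \right] e^{-st}\, dt + \int_0^{\infty} r(t) \left[ \frac{1}{2\pi i} \int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r_f^*(z)}{z}\, e^{zt}\, dz \right] e^{-st}\, dt \\
&= \frac{1}{2\pi i} \int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r^*(z)}{z} \left[ \int_0^{\infty} r_f(t)\, e^{-(s-z)t}\, dt \right] dz + \frac{1}{2\pi i} \int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r_f^*(z)}{z} \left[ \int_0^{\infty} r(t)\, e^{-(s-z)t}\, dt \right] dz \\
&= \frac{1}{2\pi i} \int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r^*(z)}{z}\, r_f^*(s-z)\, dz + \frac{1}{2\pi i} \int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r_f^*(z)}{z}\, r^*(s-z)\, dz,
\end{aligned} \tag{B.9}$$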

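where $\sigma$ is a sufficiently small positive number. Applying the Residue theorem, we obtain the expected value $E[T_{\min}]$:

$$\begin{aligned}
E[T_{\min}] &= -r_{\min}^{*(1)}(0) \\
&= -\frac{1}{2\pi i} \int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r^*(z)}{z}\, r_f^{*(1)}(-z)\, dz - \frac{1}{2\pi i} \int_{\sigma-i\infty}^{\sigma+i\infty} \frac{1 - r_f^*(z)}{z}\, r^{*(1)}(-z)\, dz \\
&= \sum_{p \in \sigma_{r_f}} \operatorname*{Res}_{s=p}\, \frac{1 - r^*(s)}{s}\, r_f^{*(1)}(-s) + \sum_{p \in \sigma_r} \operatorname*{Res}_{s=p}\, \frac{1 - r_f^*(s)}{s}\, r^{*(1)}(-s) \\
&= \sum_{p \in \sigma_r} \operatorname*{Res}_{s=p/c_f}\, \frac{1 - r^*(s)}{s}\, r_f^{*(1)}(-s) + \sum_{p \in \sigma_r} \operatorname*{Res}_{s=p}\, \frac{1 - r_f^*(s)}{s}\, r^{*(1)}(-s),
\end{aligned} \tag{B.10}$$

where $\sigma_r$ is the set of poles of $r^*(-s)$ and $\sigma_{r_f}$ is the set of poles of $r_f^*(-s)$ in the strict right-half complex plane. Here we have also used the facts that the derivative $g^{(1)}(s)$ of a function $g(s)$ shares the same set of poles as $g(s)$, except for the multiplicities, and that the poles of $r_f^*(-s)$ have the form $p/c_f$ for $p \in \sigma_r$. We can express $E[T_{\min}]$ in terms of the function $f^*(s)$ via equations (B.2)–(B.4); however, we omit such expressions here due to their complexity. We also notice that $r^*(-s)$, $f^*(-s)$, and their derivatives share the same set of poles in the strict right-half complex plane, except for the multiplicities, and so we can express $E[T_{\min}]$ as follows:

$$E[T_{\min}] = \sum_{p \in \sigma_f} \operatorname*{Res}_{s=p/c_f}\, \frac{1 - r^*(s)}{s}\, r_f^{*(1)}(-s) + \sum_{p \in \sigma_f} \operatorname*{Res}_{s=p}\, \frac{1 - r_f^*(s)}{s}\, r^{*(1)}(-s). \tag{B.11}$$

If the inter-update time $Z$ is exponential, then $f^*(s) = r^*(s) = \mu/(s+\mu)$ and $r_f^*(s) = \mu/(c_f s + \mu) = \mu_f/(s + \mu_f)$, where $\mu_f = \mu/c_f$. From equation (B.11), we obtain

$$\begin{aligned}
E[T_{\min}] &= \operatorname*{Res}_{s=\mu_f} \left[ \frac{1 - \mu/(s+\mu)}{s} \cdot \frac{-\mu_f}{(-s+\mu_f)^2} \right] + \operatorname*{Res}_{s=\mu} \left[ \frac{1 - \mu_f/(s+\mu_f)}{s} \cdot \frac{-\mu}{(-s+\mu)^2} \right] \\
&= -\operatorname*{Res}_{s=\mu_f} \left[ \frac{1}{s+\mu} \cdot \frac{\mu_f}{(s-\mu_f)^2} \right] - \operatorname*{Res}_{s=\mu} \left[ \frac{1}{s+\mu_f} \cdot \frac{\mu}{(s-\mu)^2} \right] \\
&= -\mu_f \left. \frac{d}{ds} \left[ \frac{1}{s+\mu} \right] \right|_{s=\mu_f} - \mu \left. \frac{d}{ds} \left[ \frac{1}{s+\mu_f} \right] \right|_{s=\mu} \\
&= \frac{1}{\mu + \mu_f} = \frac{c_f}{(1+c_f)\mu}.
\end{aligned} \tag{B.12}$$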
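The closed form (B.12) can be cross-checked numerically. The sketch below relies on the standard identity E[min(T_TTL, X)] = ∫₀^∞ Pr(T_TTL > t) Pr(X > t) dt, which holds when T_TTL and X are independent (as the derivation above assumes), and evaluates it with a simple trapezoid rule. It then applies (B.8) with E[Y] = 1/λ. The helper names, the integration bound and the parameter values are illustrative assumptions.

```python
import math


def expected_tmin(surv_ttl, surv_x, upper, steps=200_000):
    """Trapezoid estimate of the integral of Pr(T_TTL > t) * Pr(X > t) on [0, upper]."""
    h = upper / steps
    total = 0.5 * (surv_ttl(0.0) * surv_x(0.0) + surv_ttl(upper) * surv_x(upper))
    for k in range(1, steps):
        t = k * h
        total += surv_ttl(t) * surv_x(t)
    return total * h


def expected_k1(e_tmin, lam):
    """Equation (B.8) with E[Y] = 1/lam for a Poisson access stream."""
    return 1.0 + lam * e_tmin


cf, mu, lam = 2.0, 1.0, 3.0
numeric = expected_tmin(
    lambda t: math.exp(-mu * t / cf),  # Pr(T_TTL > t): T_TTL is Exp(mu / cf)
    lambda t: math.exp(-mu * t),       # Pr(X > t): X is Exp(mu)
    upper=60.0,                        # tail beyond this bound is negligible here
)
closed = cf / ((1.0 + cf) * mu)        # right-hand side of (B.12)
print(numeric, closed, expected_k1(closed, lam))
```

Passing other survival functions to `expected_tmin` gives a quick numerical reference for non-exponential inter-update times, against which the residue or matrix expressions can be compared.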


Next, we present another approach to compute the probability $\beta$ and the expectation $E[T_{\min}]$ for scheme #1. It is well known [7] that the phase-type (PH) distributions are dense in the set of all distributions on $[0, \infty)$, i.e., any distribution of a nonnegative random variable can be approximated by PH distributions. The exponential distribution, the Erlang distributions, the hyper-exponential distribution, and the hyper-Erlang distribution are all special cases of PH distributions. The advantage of PH distributions is that most computations are reduced to matrix manipulations. A PH distribution is the distribution of the time until absorption into the absorbing state 0 in a finite-state Markov chain with states $\{0, 1, 2, \ldots, n\}$, initial probability vector $(\alpha_0, \alpha)$, and infinitesimal generator

$$Q = \begin{pmatrix} 0 & 0 \\ t & T \end{pmatrix},$$

where $\alpha$ is a row vector of size $n$ and $T$ is an $n \times n$ matrix. It can be shown [7] that this distribution is uniquely determined by the pair $(\alpha, T)$, so we say that a random variable $X$ is PH$(\alpha, T)$ if $X$ is PH-distributed with parameter $(\alpha, T)$. In fact, a random variable that is PH$(\alpha, T)$ has the following probability density function:

$$f(x) = -\alpha \exp(Tx)\, T\, \mathbf{1}_T, \quad x \ge 0,$$

where $\mathbf{1}_T$ is the column vector of all 1’s with the same dimension as the matrix $T$ (i.e., if $T$ is an $n \times n$ matrix, $\mathbf{1}_T$ is an $n$-dimensional vector). We need the following result ($\otimes$ indicates the Kronecker product and $\oplus$ denotes the Kronecker sum):

Lemma [2,7].

(1) Assume that $F(x)$ is the cumulative distribution function of a random variable that is PH$(\alpha, T)$ with expectation $1/\mu$. Then the distribution with density $\mu[1 - F(x)]$ is also PH-distributed, namely PH$(\pi, T)$, where $\pi = (\alpha T^{-1} \mathbf{1}_T)^{-1} \alpha T^{-1}$.

(2) Assume that the random variables $X$ and $Y$ are independent with PH$(\alpha, T)$ and PH$(\nu, S)$, respectively. Then the random variable $\min\{X, Y\}$ is also PH-distributed with PH$(\gamma, C)$, where

$$\gamma = \alpha \otimes \nu, \quad C = T \oplus S.$$

(3) Assume that the random variables $X$ and $Y$ are independent with PH$(\alpha, T)$ and PH$(\nu, S)$, respectively. Then the random variable $X + Y$ is also PH-distributed with PH$(\gamma, C)$, where

$$\gamma = [\alpha, \alpha_0 \nu], \quad C = \begin{pmatrix} T & t\nu \\ 0 & S \end{pmatrix},$$

with $\alpha_0 = 1 - \alpha \mathbf{1}_T$ and $t = -T \mathbf{1}_T$.

(4) Assume that the random variables $X$ and $Y$ are independent with PH$(\alpha, T)$ and PH$(\nu, S)$, respectively. Then

$$\Pr(X \le Y) = (\nu \otimes \alpha)\, \bigl[-(S \oplus T)\bigr]^{-1}\, \bigl[\mathbf{1}_S \otimes (-T \mathbf{1}_T)\bigr].$$
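Assertion (4) of the lemma reduces to a single linear solve. The following sketch, written in plain Python with matrices stored as lists of rows, implements the Kronecker product and the Kronecker sum and then evaluates Pr(X ≤ Y), reading −S ⊕ T as −(S ⊕ T). As an illustrative check that is not taken from the paper, X is a two-stage Erlang variable with stage rate 2 and Y is exponential with rate 1, for which Pr(X ≤ Y) = E[e^{−X}] = (2/3)². All function names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of matrices stored as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def eye(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def kron_sum(a, b):
    """Kronecker sum a (+) b = a (x) I_b + I_a (x) b of square matrices."""
    left, right = kron(a, eye(len(b))), kron(eye(len(a)), b)
    return [[l + r for l, r in zip(rl, rr)] for rl, rr in zip(left, right)]


def solve(a, rhs):
    """Solve a x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def prob_x_le_y(alpha, T, nu, S):
    """Lemma assertion (4): Pr(X <= Y) for X ~ PH(alpha, T), Y ~ PH(nu, S)."""
    exit_x = [-sum(row) for row in T]                   # -T 1_T
    rhs = [e for _ in S for e in exit_x]                # 1_S (x) (-T 1_T)
    neg = [[-v for v in row] for row in kron_sum(S, T)] # -(S (+) T)
    x = solve(neg, rhs)
    weights = [n * a for n in nu for a in alpha]        # nu (x) alpha
    return sum(w * v for w, v in zip(weights, x))


# X: Erlang with two stages of rate 2; Y: exponential with rate 1.
p = prob_x_le_y([1.0, 0.0], [[-2.0, 2.0], [0.0, -2.0]], [1.0], [[-1.0]])
print(p, (2.0 / 3.0) ** 2)
```

The dense Gauss-Jordan solver is chosen only to keep the example dependency-free; for larger representations a linear algebra library would be the natural choice.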

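Now we assume that the inter-update time is PH$(\alpha, T)$. Then, from lemma assertion (1), the residual life $X$ is PH$(\pi, T)$, where $\pi = (\alpha T^{-1} \mathbf{1}_T)^{-1} \alpha T^{-1}$, and thus we have

$$r(x) = -\pi \exp(Tx)\, T\, \mathbf{1}_T \quad\text{and}\quad r_f(x) = \frac{1}{c_f}\, r\!\left(\frac{x}{c_f}\right) = -\pi \exp\!\left(\frac{T}{c_f}\, x\right) \frac{T}{c_f}\, \mathbf{1}_T,$$

which is PH$(\pi, T/c_f)$.

We first compute $\beta$. Since $T_{TTL}$ is PH$(\pi, T/c_f)$ and $t_1$ is exponentially distributed, i.e., PH$(1, -\lambda)$, lemma assertion (3) shows that $T_{TTL} + t_1$ is PH-distributed with PH$(\gamma_\beta, C_\beta)$, where

$$\gamma_\beta = (\pi,\ 1 - \pi \mathbf{1}_T), \quad C_\beta = \begin{pmatrix} \dfrac{T}{c_f} & -\dfrac{T}{c_f} \mathbf{1}_T \\ 0 & -\lambda \end{pmatrix}.$$

Applying lemma assertion (4), we obtain

$$\beta = (\pi \otimes \gamma_\beta)\, \bigl[-(T \oplus C_\beta)\bigr]^{-1}\, \bigl[\mathbf{1}_T \otimes (-C_\beta \mathbf{1}_{C_\beta})\bigr] = (\pi \otimes \gamma_\beta)\, \bigl[-(T \oplus C_\beta)\bigr]^{-1} \left[ \mathbf{1}_T \otimes \begin{pmatrix} 0 \\ \lambda \end{pmatrix} \right]. \tag{B.13}$$

If the inter-update time $Z$ is exponentially distributed, we have

$$\alpha = 1, \quad T = -\mu, \quad \pi = 1, \quad \gamma_\beta = (1, 0), \quad C_\beta = \begin{pmatrix} -\dfrac{\mu}{c_f} & \dfrac{\mu}{c_f} \\ 0 & -\lambda \end{pmatrix}.$$

Applying (B.13), we obtain ($I_A$ denotes the identity matrix with the same dimension as a matrix $A$)

$$\begin{aligned}
\beta &= (\pi \otimes \gamma_\beta)\, (-T \otimes I_{C_\beta} - I_T \otimes C_\beta)^{-1} \left[ \mathbf{1}_T \otimes \begin{pmatrix} 0 \\ \lambda \end{pmatrix} \right] \\
&= \bigl[1 \otimes (1, 0)\bigr] \left[ -\begin{pmatrix} -\mu & 0 \\ 0 & -\mu \end{pmatrix} - \begin{pmatrix} -\mu/c_f & \mu/c_f \\ 0 & -\lambda \end{pmatrix} \right]^{-1} \left[ 1 \otimes \begin{pmatrix} 0 \\ \lambda \end{pmatrix} \right] \\
&= (1 \;\; 0) \begin{pmatrix} \dfrac{1+c_f}{c_f}\, \mu & -\dfrac{1}{c_f}\, \mu \\ 0 & \lambda + \mu \end{pmatrix}^{-1} \begin{pmatrix} 0 \\ \lambda \end{pmatrix} = \frac{1}{1+c_f} \cdot \frac{\lambda}{\lambda+\mu},
\end{aligned}$$

which is the same result as previously obtained.

Next, we compute $E[T_{\min}]$. Again, $T_{TTL}$ is PH$(\pi, T/c_f)$ and $X$ is PH$(\pi, T)$, where $\pi = (\alpha T^{-1} \mathbf{1}_T)^{-1} \alpha T^{-1}$. From lemma assertion (2), we conclude that $T_{\min}$ is also PH-distributed with PH$(\gamma_e, C_e)$, where

$$\gamma_e = \pi \otimes \pi, \quad C_e = T \oplus \frac{T}{c_f} = T \otimes I_T + I_T \otimes \frac{T}{c_f}. \tag{B.14}$$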


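Therefore, from the property of PH distributions, we obtain the expectation of $T_{\min}$ as follows:

$$E[T_{\min}] = \gamma_e \bigl(-C_e^{-1}\bigr) \mathbf{1}_{C_e} = -(\pi \otimes \pi) \left( T \oplus \frac{T}{c_f} \right)^{-1} \mathbf{1}_{C_e}. \tag{B.15}$$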
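The matrix expressions (B.13) and (B.15) translate directly into a few linear solves. The sketch below is a minimal plain-Python illustration under two assumptions: the first summand T_TTL enters lemma assertion (3) with initial vector π (it is PH(π, T/cf)), and −T ⊕ Cβ is read as −(T ⊕ Cβ). Matrices are lists of rows, and all helper names are hypothetical. The exponential case is used as a check against the closed forms 1/(1+cf)·λ/(λ+µ) and cf/((1+cf)µ).

```python
def kron(a, b):
    """Kronecker product of matrices stored as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def kron_sum(a, b):
    """Kronecker sum a (+) b = a (x) I_b + I_a (x) b."""
    left, right = kron(a, eye(len(b))), kron(eye(len(a)), b)
    return [[l + r for l, r in zip(rl, rr)] for rl, rr in zip(left, right)]


def solve(a, rhs):
    """Solve a x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def residual_pi(alpha, T):
    """pi = (alpha T^{-1} 1_T)^{-1} alpha T^{-1} from lemma assertion (1)."""
    z = solve([list(col) for col in zip(*T)], alpha)   # z = alpha T^{-1}
    total = sum(z)                                     # alpha T^{-1} 1_T
    return [v / total for v in z]


def beta_and_tmin(alpha, T, cf, lam):
    """Return (beta, E[T_min]) via (B.13) and (B.15) for PH(alpha, T) updates."""
    n = len(T)
    pi = residual_pi(alpha, T)
    Tf = [[v / cf for v in row] for row in T]          # T / cf
    # C_beta and gamma_beta for T_TTL + t1 (lemma assertion (3)).
    C = [row + [-sum(row)] for row in Tf] + [[0.0] * n + [-lam]]
    gamma = pi + [1.0 - sum(pi)]
    exit_c = [-sum(row) for row in C]                  # -C_beta 1 = (0, ..., 0, lam)
    rhs = [e for _ in range(n) for e in exit_c]        # 1_T (x) (-C_beta 1)
    neg = [[-v for v in row] for row in kron_sum(T, C)]
    x = solve(neg, rhs)
    beta = sum(p * g * x[i * (n + 1) + j]
               for i, p in enumerate(pi) for j, g in enumerate(gamma))
    y = solve(kron_sum(T, Tf), [1.0] * (n * n))        # C_e^{-1} 1
    e_tmin = -sum(p * q * y[i * n + j]
                  for i, p in enumerate(pi) for j, q in enumerate(pi))
    return beta, e_tmin


mu, cf, lam = 1.0, 0.5, 2.0
print(beta_and_tmin([1.0], [[-mu]], cf, lam))          # exponential inter-update time
```

The size of the linear systems grows as n(n+1) for β and n² for E[T_min], which is the dimension growth that the matrix approach trades for an explicit formula.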

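If the inter-update time $Z$ is Erlang-distributed with parameter $(m, \mu)$, then we know it has the representation PH$(\alpha, T)$, where

$$\alpha = (1, 0, \ldots, 0), \quad T = -m\mu \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & -1 \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix} = -m\mu\, T_0. \tag{B.16}$$

Then $r(t)$ will be PH$(\pi, T)$ with

$$\pi = \bigl(\alpha T^{-1} \mathbf{1}_T\bigr)^{-1} \alpha T^{-1} = \frac{1}{m}\, (1 \;\; 1 \;\; \cdots \;\; 1).$$

Thus, we have ($A^{\mathsf{T}}$ denotes the matrix transpose of $A$):

$$\begin{aligned}
E[T_{\min}] &= -\frac{1}{m^2}\, \mathbf{1}^{\mathsf{T}} \left( T \oplus \frac{T}{c_f} \right)^{-1} \mathbf{1} = \frac{1}{m^3 \mu}\, \mathbf{1}^{\mathsf{T}} \left( T_0 \oplus \frac{T_0}{c_f} \right)^{-1} \mathbf{1} \\
&= \frac{1}{m^3 \mu} \sum_{i,j} \left[ \left( T_0 \oplus \frac{T_0}{c_f} \right)^{-1} \right]_{ij},
\end{aligned} \tag{B.17}$$

where $\mathbf{1}$ is a column vector of all 1’s with the appropriate dimension for the matrix multiplication. When $m = 1$, i.e., $Z$ is exponentially distributed, we have $T_0 = 1$, hence

$$E[T_{\min}] = \frac{1}{1^3 \mu} \left( 1 + \frac{1}{c_f} \right)^{-1} = \frac{c_f}{1+c_f} \cdot \frac{1}{\mu},$$

which is the same result as we obtained earlier.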
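For the Erlang case, (B.17) can be evaluated without a general matrix inverse. The sketch below uses the fact that T0 ⊕ T0/cf is upper triangular, because the Kronecker sum of two upper triangular matrices is upper triangular, so the sum of all entries of its inverse is obtained by one back substitution. The function name and the test values are illustrative assumptions.

```python
def erlang_e_tmin(m, mu, cf):
    """E[T_min] from (B.17) for Erlang(m, mu) inter-update times."""
    # T0 from (B.16): 1 on the diagonal, -1 on the superdiagonal.
    T0 = [[1.0 if j == i else -1.0 if j == i + 1 else 0.0 for j in range(m)]
          for i in range(m)]
    n = m * m
    # A = T0 (+) T0/cf = T0 (x) I + I (x) T0/cf; state (i, k) maps to i*m + k.
    A = [[0.0] * n for _ in range(n)]
    for i in range(m):
        for k in range(m):
            for j in range(m):
                A[i * m + k][j * m + k] += T0[i][j]        # T0 (x) I
                A[i * m + k][i * m + j] += T0[k][j] / cf   # I (x) T0/cf
    # A is upper triangular: solve A y = 1 by back substitution.
    # sum(y) is then the sum of all entries of A^{-1}.
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = 1.0 - sum(A[r][c] * y[c] for c in range(r + 1, n))
        y[r] = acc / A[r][r]
    return sum(y) / (m ** 3 * mu)


print(erlang_e_tmin(1, 1.0, 0.5))   # m = 1 reduces to cf / ((1 + cf) * mu)
print(erlang_e_tmin(2, 1.0, 1.0))
```

For m = 2, µ = 1 and cf = 1 the residual survival function is e^{−2t}(1 + t), so direct integration of its square gives 13/32, which the matrix formula reproduces.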

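Now assume that $Z$ has a hyper-Erlang distribution with the following probability density function [5]:

$$f(t) = \sum_{i=1}^{M} p_i\, \frac{(m_i \mu_i)^{m_i}\, t^{m_i - 1}}{(m_i - 1)!}\, e^{-m_i \mu_i t}, \quad p_i \ge 0, \quad \sum_{i=1}^{M} p_i = 1, \quad \frac{1}{\mu} = \sum_{i=1}^{M} \frac{p_i}{\mu_i}, \quad M > 0. \tag{B.18}$$

From a result in [7], we know that the hyper-Erlang distribution $f(t)$ has the representation PH$(\alpha_{he}, T_{he})$, where

$$\alpha_{he} = (p_1 \alpha_1 \;\; p_2 \alpha_2 \;\; \cdots \;\; p_M \alpha_M), \quad \alpha_i = (1 \;\; 0 \;\; \cdots \;\; 0), \quad i = 1, 2, \ldots, M,$$

$$T_{he} = \begin{pmatrix} T_1 & & & \\ & T_2 & & \\ & & \ddots & \\ & & & T_M \end{pmatrix}, \quad T_i = -m_i \mu_i \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & -1 \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}_{m_i \times m_i}, \quad i = 1, 2, \ldots, M.$$

From (B.15) we obtain

$$E[T_{\min}] = -(\pi_{he} \otimes \pi_{he}) \left( T_{he} \oplus \frac{T_{he}}{c_f} \right)^{-1} \mathbf{1}, \tag{B.19}$$

where

$$\pi_{he} = \bigl( \alpha_{he} T_{he}^{-1} \mathbf{1}_{he} \bigr)^{-1} \alpha_{he} T_{he}^{-1} = \left( \sum_{i=1}^{M} p_i \alpha_i T_i^{-1} \mathbf{1}_{T_i} \right)^{-1} \times \bigl( p_1 \alpha_1 T_1^{-1} \;\; p_2 \alpha_2 T_2^{-1} \;\; \cdots \;\; p_M \alpha_M T_M^{-1} \bigr),$$

and $\mathbf{1}$ is a column vector of all 1’s with the appropriate dimension for the matrix multiplication (the dimension is $\bigl(\sum_{i=1}^{M} m_i\bigr)^2$).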
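The hyper-Erlang representation can be assembled mechanically from the parameters of (B.18). The following sketch builds α_he and the block-diagonal T_he, and then computes π_he by forward substitution, which is sufficient because T_he is upper triangular. As a by-product it returns the mean −α_he T_he⁻¹ 1, which should equal Σ p_i/µ_i. The function names and the sample parameters are hypothetical.

```python
def hyper_erlang_ph(p, m, mu):
    """Build (alpha_he, T_he) for the hyper-Erlang density (B.18)."""
    n = sum(m)
    alpha = [0.0] * n
    T = [[0.0] * n for _ in range(n)]
    start = 0
    for p_i, m_i, mu_i in zip(p, m, mu):
        alpha[start] = p_i                    # p_i * alpha_i with alpha_i = (1, 0, ..., 0)
        rate = m_i * mu_i
        for k in range(m_i):                  # block T_i = -m_i * mu_i * T0
            T[start + k][start + k] = -rate
            if k + 1 < m_i:
                T[start + k][start + k + 1] = rate
        start += m_i
    return alpha, T


def residual_vector(alpha, T):
    """Return (pi, mean) with pi = (alpha T^{-1} 1)^{-1} alpha T^{-1}.

    Assumes T is upper triangular, as it is for Erlang blocks.
    """
    n = len(T)
    z = [0.0] * n                             # z = alpha T^{-1}, i.e. z T = alpha
    for j in range(n):                        # forward substitution over columns
        acc = alpha[j] - sum(z[i] * T[i][j] for i in range(j))
        z[j] = acc / T[j][j]
    total = sum(z)                            # alpha T^{-1} 1 = -(mean)
    return [v / total for v in z], -total


alpha_he, T_he = hyper_erlang_ph([0.4, 0.6], [2, 1], [1.0, 2.0])
pi_he, mean = residual_vector(alpha_he, T_he)
print(pi_he, mean)                            # mean should equal 0.4/1.0 + 0.6/2.0
```

With a single branch (M = 1) the construction reduces to the Erlang representation (B.16), and π_he becomes the uniform vector (1/m)(1 1 … 1).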

As a final remark, we notice that the approach using the Residue theorem may avoid the dimension explosion inherent in the matrix-geometric approach; however, the former does not give an explicit formula, while the latter does.

Acknowledgements

The work of Yuguang Fang was supported in part by the National Science Foundation Faculty Early Career Development Award under grant ANI-0093241 and the Office of Naval Research Young Investigator Award under grant N000140210464. The work of Zygmunt Haas and Ben Liang was partially funded by ONR as part of the Multidisciplinary University Research Initiative (MURI) under the contract number N00014-00-1-0564, by AFOSR as part of the Multidisciplinary University Research Initiative (MURI) under the contract number F49620-02-1-0233, and by the NSF grants number ANI-9704404 and ANI-0081357. Yi-Bing Lin’s work was sponsored in part by MOE Program for Promoting Academic Excellence of Universities under the grant number 89-E-FA04-1-4, FarEastone, IIS/Academia Sinica, and the Lee and MTI Center for Networking Research, NCTU.

References

[1] Apache 1.3, HTTP Server Document, http://www.apache.org (2000).


[2] S. Asmussen, Matrix-analytic models and their analysis, Scandinavian Journal of Statistics 27(2) (2000) 193–226.

[3] P. Cao and S. Irani, Cost-aware WWW proxy caching algorithms, in: Proc. Usenix Sympos. Internet Technologies and Systems (1997).

[4] Y.-M. Chuang, T.-Y. Lee and Y.-B. Lin, Trading CDPD availability and voice blocking probability in cellular networks, IEEE Network 2(12) (1998) 48–54.

[5] Y. Fang and I. Chlamtac, Teletraffic analysis and mobility modeling for PCS networks, IEEE Transactions on Communications 47(7) (1999) 1062–1072.

[6] S. Helme, The new generation, Mobilecommunications Asia (January 2000) 12–16.

[7] G. Latouche and V. Ramaswami, Introduction to Matrix Analytic Methods in Stochastic Modeling (SIAM, Philadelphia, 1999).

[8] Y.-B. Lin and I. Chlamtac, Wireless and Mobile Network Architectures (Wiley, 2001).

[9] P.P. Maglio and R. Barrett, Intermediaries personalize information streams, Communications of ACM 43(8) (2000) 68–74.

[10] M. Naldi, Measurement-based modelling of Internet dial-up access connections, Computer Networks 31(22) (1999).

[11] Omnisky, http://www.omnisky.com (2000).

[12] J.E. Pitkow, Summary of WWW characterizations, Computer Networks and ISDN Systems 30(1–7) (1998).

[13] S.M. Ross, Stochastic Processes (Wiley, 1996).

[14] J. Shim, P. Scheuermann and R. Vingralek, Proxy cache algorithms: design, implementation, and performance, IEEE Transactions on Knowledge and Data Engineering 11(4) (1999).

[15] Squid 2.3, Internet Object Cache Document, http://squid.nlanr.net/Squid (2000).

Yuguang Fang received the B.S. and M.S. degrees in mathematics from Qufu Normal University, Qufu, Shandong, China, in 1984 and 1987, respectively, a Ph.D. degree from the Department of Systems, Control and Industrial Engineering at Case Western Reserve University, Cleveland, OH, in January 1994, and a Ph.D. degree from the Department of Electrical and Computer Engineering at Boston University, MA, in May 1997. From 1987 to 1988, he held research and teaching positions in both the Department of Mathematics and the Institute of Automation at Qufu Normal University. He held a post-doctoral position in the Department of Electrical and Computer Engineering at Boston University from June 1994 to August 1995. From June 1997 to July 1998, he was a Visiting Assistant Professor in the Department of Electrical Engineering at the University of Texas at Dallas. From July 1998 to May 2000, he was an Assistant Professor in the Department of Electrical and Computer Engineering at New Jersey Institute of Technology, Newark, NJ. From May 2000 to July 2003, he was an Assistant Professor in the Department of Electrical and Computer Engineering at University of Florida, Gainesville, FL, where he has been an Associate Professor since August 2003. His research interests span many areas including wireless networks, mobile computing, mobile communications, automatic control, and neural networks. He has published over eighty papers in refereed professional journals and conferences. He received the National Science Foundation Faculty Early Career Development Award in 2001 and the Office of Naval Research Young Investigator Award in 2002. He is listed in Marquis Who’s Who in Science and Engineering, Who’s Who in America and Who’s Who in the World.

Dr. Fang has been actively engaged in many professional activities. He is a senior member of the IEEE and a member of the ACM. He is an Editor for IEEE Transactions on Communications, an Editor for IEEE Transactions on Wireless Communications, an Editor for ACM Wireless Networks, an Area Editor for ACM Mobile Computing and Communications Review, an Associate Editor for Wiley International Journal on Wireless Communications and Mobile Computing, and Feature Editor for Scanning the Literature in IEEE Personal Communications. He was an Editor for IEEE Journal on Selected Areas in Communications: Wireless Communications Series. He has also been actively involved with many professional conferences such as ACM MobiCom’02, ACM MobiCom’01, IEEE INFOCOM’00, INFOCOM’98, IEEE WCNC’02, WCNC’00 (Technical Program Vice-Chair), WCNC’99, and International Conference on Computer Communications and Networking (IC3N’98) (Technical Program Vice-Chair).

E-mail: fang@ece.ufl.edu

Zygmunt J. Haas received his B.Sc. in electrical engineering in 1979 and M.Sc. in electrical engineering in 1985. In 1988, he earned his Ph.D. from Stanford University and subsequently joined AT&T Bell Laboratories in the Network Research Department. There he pursued research on wireless communications, mobility management, fast protocols, optical networks, and optical switching. From September 1994 till July 1995, Dr. Haas worked for the AT&T Wireless Center of Excellence, where he investigated various aspects of wireless and mobile networking, concentrating on TCP/IP networks. As of August 1995, he joined the faculty of the School of Electrical and Computer Engineering at Cornell University.

Dr. Haas is an author of numerous technical papers and holds fifteen patents in the fields of high-speed networking, wireless networks, and optical switching. He has organized several workshops, delivered numerous tutorials at major IEEE and ACM conferences, and serves as editor of several journals and magazines, including the IEEE Transactions on Networking, the IEEE Transactions on Wireless Communications, the IEEE Communications Magazine, and the ACM/Kluwer Wireless Networks journal. He has been a guest editor of IEEE JSAC issues on Gigabit Networks, Mobile Computing Networks, and Ad-Hoc Networks. Dr. Haas is a Senior Member of IEEE, a voting member of ACM, and the Chair of the IEEE Technical Committee on Personal Communications. His interests include: mobile and wireless communication and networks, personal communication service, and high-speed communication and protocols.

E-mail: haas@ece.cornell.edu WWW: http://wnl.ece.cornell.edu

Ben Liang received the B.Sc. and M.Sc. degrees in electrical engineering from Polytechnic University, New York, in 1997, and the Ph.D. degree in electrical engineering from Cornell University, New York, in 2001. In the 2001–2002 academic year, he was a lecturer and post-doctoral associate at Cornell University. He joined the Department of Electrical and Computer Engineering at the University of Toronto in 2002.

E-mail: liang@comm.utoronto.ca

Yi-Bing Lin received his B.S. degree in electrical engineering from National Cheng Kung University in 1983, and his Ph.D. degree in computer science from the University of Washington in 1990. From 1990 to 1995, he was with the Applied Research Area at Bell Communications Research (Bellcore), Morristown, NJ. In 1995, he was appointed as a professor in the Department of Computer Science and Information Engineering (CSIE), National Chiao Tung University (NCTU). In 1996, he was appointed as Deputy Director of the Microelectronics and Information Systems Research Center, NCTU. During 1997–1999, he was elected as Chairman of CSIE, NCTU. His current research interests include design and analysis of personal communications services network, mobile computing, distributed simulation, and performance modeling.

Dr. Lin is an associate editor of IEEE Network, an editor of IEEE Trans. on Wireless Communications, an associate editor of IEEE Trans. on Vehicular Technology, an associate editor of IEEE Communications Survey and Tutorials, an editor of IEEE Personal Communications Magazine, an editor of Computer Networks, an area editor of ACM Mobile Computing and Communication Review, a columnist of ACM Simulation Digest, an editor of International Journal of Communications Systems, an editor of ACM/Baltzer Wireless Networks, an editor of Computer Simulation Modeling and Analysis, an editor of Journal of Information Science and Engineering, Program Chair for the 8th Workshop on Distributed and Parallel Simulation, General Chair for the 9th Workshop on Distributed and Parallel Simulation, Program Chair for the 2nd International Mobile Computing Conference, Guest Editor for the ACM/Baltzer MONET special issue on Personal Communications, a Guest Editor for IEEE Transactions on Computers special issue on Mobile Computing, a Guest Editor for IEEE Transactions on Computers special issue on Wireless Internet, and a Guest Editor for IEEE Communications Magazine special issue on Active, Programmable, and Mobile Code Networking. Lin is the author of the book Wireless and Mobile Network Architecture (co-author with Imrich Chlamtac; published by Wiley). Lin is an Adjunct Research Fellow of Academia Sinica and Chair Professor of Providence University. He is an IEEE Fellow.