Modeling mobility database failure restoration using checkpoint schemes

Sok-Ian Sou, Student Member, IEEE, and Yi-Bing Lin, Fellow, IEEE

Abstract— This paper studies checkpointing and failure

restoration of the mobility database for the Universal Mobile Telecommunications System (UMTS). With the per-user checkpointing technique, individual Home Location Register (HLR) records are saved into non-volatile backup storage from time to time. When a failure occurs, the backup record is restored to the mobility database. We consider three per-user checkpoint schemes for the HLR. An analytic model is developed to investigate these schemes in terms of the probability that an HLR backup record is obsolete and the expected checkpoint interval. This model is validated against simulation experiments. Our study provides guidelines for selecting an appropriate checkpoint scheme and parameters for various traffic conditions.

Index Terms— Checkpoint, failure restoration, Home Location

Register (HLR), Universal Mobile Telecommunications System (UMTS).

I. INTRODUCTION

THIS paper studies checkpointing and failure restoration of mobility databases for the Universal Mobile Telecommunications System (UMTS) and General Packet Radio Service (GPRS) [2], [9], which support wireless Internet applications. In these networks, the Home Location Register (HLR) is a database used for mobile user information management. All permanent subscriber data are stored in this database. An HLR record consists of three types of information: Mobile Station (MS) information, service information, and location information. The MS and service information items are updated only occasionally. The location information in the HLR is updated whenever the MS moves to a new Serving GPRS Support Node (SGSN). To access the MS, the HLR is queried to identify the current SGSN location of the MS. An MS may move frequently, so the location information is often modified. If the HLR fails, the MS cannot be reached by the outside world. To guarantee service availability to the MSs, periodic checkpointing of the HLR is mandatory, as described in 3GPP TS 23.007 [1]. In this mechanism, the HLR location information is periodically saved into non-volatile backup storage. When an HLR failure occurs, the database recovery procedure is exercised. Several approaches have been proposed to checkpoint the HLR database [7]. This paper studies three per-user checkpoint schemes for mobility database failure restoration. We first describe a commonly used checkpoint scheme. Then we discuss two improved checkpoint schemes.

Manuscript received March 22, 2005; revised August 16, 2005 and November 26, 2005; accepted November 27, 2005. The associate editor coordinating the review of this paper and approving it for publication was W. Zhuang.

The authors are with the Department of Computer Science and Information Engineering, National Chiao Tung University, Hsinchu 30010, Taiwan, R.O.C. (e-mail:{sisou, liny}@csie.nctu.edu.tw).

Digital Object Identifier 10.1109/TWC.2007.05200.

An analytic model is developed to compare these three schemes in terms of the expected checkpoint interval and the probability that an HLR backup record is obsolete. This analytic model is validated against simulation experiments.

II. PER-USER CHECKPOINT SCHEMES

This section describes three per-user checkpoint schemes. Scheme I is a basic scheme widely used in UMTS checkpointing [3], [4], which is described as follows.

Scheme I. For every MS, this scheme defines a timeout period $t_p$. When this timer expires, a checkpoint is performed to save the HLR record of the MS into the non-volatile backup storage.

Both Schemes II and III take a delayed registration approach with the following intuition. Suppose that after a checkpoint is performed, no registration occurs before the $t_p$ timer expires. In this situation, there is no need to checkpoint the record because the backup copy is still valid. Details of Schemes II and III are described as follows.

Scheme II [7]. If no registration has occurred before the $t_p$ timer expires, the backup record is the same as that in the HLR. Therefore, the checkpoint is postponed until the next registration occurs. To implement Scheme II, a tri-state Finite State Machine (FSM) is maintained for every HLR record (see Fig. 1 (a)). State 0 represents that neither a registration event nor a $t_p$ timeout event has occurred after the previous checkpoint. State 1 represents that, after the previous checkpoint, the first registration event occurs before a $t_p$ timeout event does. Note that the backup record is obsolete in this state. State 2 represents that, after the previous checkpoint, a $t_p$ timeout event occurs before a registration event does. Initially, the FSM is in State 0, and the $t_p$ timer starts to decrement. If a registration event occurs before the $t_p$ timer expires, the FSM moves to State 1 and remains in State 1 until the $t_p$ timeout event occurs. At this point, the HLR record is checkpointed into the backup, the $t_p$ timer is restarted, and the FSM moves back to State 0. If the timeout event occurs at State 0, the FSM moves to State 2 and the $t_p$ timer is stopped. If a registration event occurs at State 2, a checkpoint is performed, the $t_p$ timer is restarted, and the FSM moves back to State 0.
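The Scheme II state machine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the class name `SchemeIIRecord`, its method names, and the `checkpoints` counter are hypothetical, and the $t_p$ timer is assumed to be driven by an external scheduler that calls `on_timeout` only while `timer_running` is true.

```python
class SchemeIIRecord:
    """Tri-state FSM kept per HLR record under Scheme II (illustrative sketch)."""

    def __init__(self):
        self.state = 0             # State 0: no event since the previous checkpoint
        self.timer_running = True  # the t_p timer starts to decrement
        self.checkpoints = 0       # hypothetical counter, for inspection only

    def _checkpoint(self):
        # Save the record to backup, restart the t_p timer, go back to State 0.
        self.checkpoints += 1
        self.timer_running = True
        self.state = 0

    def on_registration(self):
        if self.state == 0:
            self.state = 1         # backup becomes obsolete; wait for the timeout
        elif self.state == 2:
            self._checkpoint()     # the postponed checkpoint is performed now
        # In State 1, further registrations leave the FSM in State 1.

    def on_timeout(self):
        if self.state == 0:
            self.state = 2         # backup still valid: postpone, stop the timer
            self.timer_running = False
        elif self.state == 1:
            self._checkpoint()
        # In State 2 the timer is stopped, so no timeout is expected.

    def backup_obsolete(self):
        return self.state == 1
```

Only State 1 reports an obsolete backup, which is the property the later analysis of $\alpha_{II}$ relies on.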

Fig. 1. State diagrams for schemes II and III: (a) Scheme II; (b) Scheme III.

Scheme III. Scheme III is the same as Scheme I except that Scheme III checks whether a registration has occurred when the $t_p$ timer expires. If no registration has occurred, the backup record is still valid, and Scheme III simply restarts the $t_p$ timer without checkpointing the record. In this scheme, a two-state FSM is maintained for every HLR record (see Fig. 1 (b)). State 0 represents that no registration has occurred after the previous checkpoint. State 1 represents that a registration has occurred after the previous checkpoint. Note that the backup record is obsolete in this state. Initially, the FSM is in State 0, and the $t_p$ timer starts to decrement. If a registration event occurs, the FSM moves to State 1 and remains in State 1 until the $t_p$ timeout event occurs. Then the FSM moves back to State 0, the HLR record is checkpointed into the backup, and the $t_p$ timer is restarted. If the timeout event occurs at State 0, no registration event has occurred after the previous checkpoint. The $t_p$ timer is restarted and the FSM stays in State 0 without performing any checkpoint.
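For comparison, the two-state machine of Scheme III can be sketched as follows. As before, this is a hedged illustration rather than the authors' code: `SchemeIIIRecord`, its methods, and the `checkpoints` counter are hypothetical names, and the caller is assumed to restart the $t_p$ timer after every timeout.

```python
class SchemeIIIRecord:
    """Two-state FSM kept per HLR record under Scheme III (illustrative sketch)."""

    def __init__(self):
        self.state = 0        # State 0: no registration since the previous checkpoint
        self.checkpoints = 0  # hypothetical counter, for inspection only

    def on_registration(self):
        # Any registration makes the backup obsolete (State 1).
        self.state = 1

    def on_timeout(self):
        """Handle a t_p timeout; return True if a checkpoint was performed.

        The timer is restarted by the caller in both cases.
        """
        if self.state == 1:
            self.checkpoints += 1
            self.state = 0
            return True
        return False          # backup still valid: skip the checkpoint

    def backup_obsolete(self):
        return self.state == 1
```

Unlike Scheme II, the timer never stops here, so a checkpoint can only happen at a timeout instant.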

In the subsequent sections, we investigate the performance of Schemes I, II and III by considering two output measures: the probability $\alpha$ that the HLR backup record is obsolete and the expected checkpoint interval $E[t_c]$.

III. ANALYTIC MODEL FOR EXPONENTIAL $t_p$ TIMER

In a typical checkpoint approach, a fixed $t_p$ is selected. Since we consider per-user checkpointing, a fixed $t_p$ may result in a large number of simultaneous checkpoints that cause congestion. In such a case, one may select an exponential timeout $t_p$ with mean $1/\lambda$ and the density function

$$f_p(t_p) = \lambda e^{-\lambda t_p} \tag{1}$$

Exponential checkpoint intervals are widely used to resolve access backoff problems [10]. As we shall see in Section V, fixed $t_p$ yields better performance than exponential $t_p$ in some cases, and exponential $t_p$ outperforms fixed $t_p$ in other cases. In this section, we consider exponential $t_p$ for Schemes I, II and III. In the next section, we consider the analysis with fixed $t_p$ for Schemes I and III.

Let random variable $t_m$ denote the inter-registration time, which has an arbitrary density function $f_m(t_m)$, the mean $1/\mu_m$, and the Laplace transform $f_m^*(s) = \int_{t_m=0}^{\infty} f_m(t_m) e^{-s t_m}\,dt_m$. Two output measures are considered for checkpoint performance:

• $\alpha$: the probability that the HLR record in the backup is obsolete when a failure occurs. The smaller the $\alpha$ value, the better the checkpoint performance. We use $\alpha_i$ to represent the $\alpha$ value for Scheme $i$ ($i$ = I, II or III).

• $E[t_c]$: the expected checkpoint interval. The larger the $E[t_c]$ value, the lower the checkpoint overhead. We use $E_i[t_c]$ to represent the $E[t_c]$ value for Scheme $i$ ($i$ = I, II or III).

Fig. 2. Timing diagram for scheme II. (The legend marks registration events, timeout events, and checkpoints, the latter labeled "ckpt".)

The analytic models developed in this and the next sections are validated against simulation experiments. The details will be given in Section V.

A. Modeling of Schemes I and II

In [7], an analytic model was proposed for Schemes I and II. For Scheme I, αI and EI[tc] were derived as

$$\alpha_I = \frac{\mu_m}{\lambda}\left[1 - f_m^*(\lambda)\right] \quad \text{and} \quad E_I[t_c] = \frac{1}{\lambda} \tag{2}$$
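To make (2) concrete, the following minimal sketch evaluates $\alpha_I$ and $E_I[t_c]$ numerically. The choice of an $\eta$-Erlang inter-registration time (the family used later in the paper) is an assumption made here so that $f_m^*$ has a closed form; the function names `erlang_laplace` and `scheme1_measures` are hypothetical.

```python
def erlang_laplace(s, eta, mu_m):
    """Laplace transform of an eta-Erlang density with mean 1/mu_m (assumed t_m)."""
    rate = eta * mu_m
    return (rate / (rate + s)) ** eta


def scheme1_measures(lam, eta, mu_m):
    """Return (alpha_I, E_I[t_c]) from (2) for an exponential t_p with rate lam."""
    alpha = (mu_m / lam) * (1.0 - erlang_laplace(lam, eta, mu_m))
    expected_interval = 1.0 / lam
    return alpha, expected_interval


if __name__ == "__main__":
    # Exponential t_m (eta = 1) with lam = mu_m: alpha_I = mu_m / (mu_m + lam) = 0.5
    print(scheme1_measures(1.0, 1, 1.0))
```

For exponential $t_m$, the expression reduces to $\alpha_I = \mu_m/(\mu_m+\lambda)$, which is a convenient sanity check.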

In the analysis of Scheme II in [7], the arrival of a registration event is considered as a random observer of the inter-checkpoint arrival times. This random observer assumption is accurate if the inter-registration times are exponentially distributed. However, when the inter-registration times have an arbitrary distribution with large variance, the random observer assumption may cause inaccuracy in deriving $\alpha_{II}$ and $E_{II}[t_c]$.

In this paper, a new approach is proposed to accurately compute αII and EII[tc].

To describe the notation used in our derivation, consider the timing diagram in Fig. 2. At time $t_0$, the FSM is at State 0 (when a $t_p$ timeout occurs). At time $t_1$, the next registration occurs and the FSM moves from State 0 to State 1. At time $t_2$, the $t_p$ timer expires and the FSM moves from State 1 to State 0. The checkpoint interval $t_c = t_2 - t_0$ is equal to $t_p$. Let $t_p^* = t_2 - t_1$. From the memoryless property, $t_p^*$ has the same distribution as $t_p$. At time $t_3$, the $t_p$ timer expires again and the FSM moves from State 0 to State 2. At time $t_4$, a registration occurs (where $t_m = t_4 - t_1$ is the inter-registration interval). The FSM moves from State 2 to State 0, and $t_c = t_4 - t_2 = t_r$. In [7], $t_r$ was modeled as the residual time of $t_m$. Since the checkpoint time $t_2$ may not be a random observer of the $t_m$ periods, inaccuracy may be incurred. The exact density function $f_r(t_r)$ for $t_r$ is derived as follows.

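As shown in Fig. 2, $t_r = t_4 - t_2$ can be written as $t_r = (t_4 - t_1) - (t_2 - t_1) = t_m - t_p^*$, where $t_m > t_p^*$. That is, $t_r$ can be represented by the difference of two random variables $t_m$ and $t_p^*$.

Fig. 3. Modified state diagram for scheme II.

The density function $f_r(t_r)$ is computed as a conditional probability

$$
\begin{aligned}
f_r(t)\,dt &= \frac{\Pr[t < t_m - t_p^* < t + dt \text{ and } t_m > t_p^*]}{\Pr[t_m > t_p^*]} \\
&= \frac{\int_{t_p^*=0}^{\infty} f_m(t + t_p^*)\,dt\, f_p(t_p^*)\,dt_p^*}{\int_{t_p^*=0}^{\infty}\int_{t_m=t_p^*}^{\infty} f_m(t_m)\,dt_m\, f_p(t_p^*)\,dt_p^*} \\
&= \frac{\int_{t_p^*=0}^{\infty} f_m(t + t_p^*)\,e^{-\lambda t_p^*}\,dt_p^*\,dt}{\int_{t_p^*=0}^{\infty}\int_{t_m=t_p^*}^{\infty} f_m(t_m)\,dt_m\, e^{-\lambda t_p^*}\,dt_p^*}
\end{aligned}
\tag{3}
$$

When $t_m$ has the $\eta$-Erlang distribution with the shape parameter $\eta$ and the scale parameter $\eta\mu_m$ (i.e., the mean is $1/\mu_m$), it has the density function

$$f_m(t_m) = \frac{(\eta\mu_m t_m)^{\eta-1}}{(\eta-1)!}\,\eta\mu_m e^{-\eta\mu_m t_m} \tag{4}$$

From (4), (3) is re-written as

$$f_r(t_r) = \left[\sum_{j=0}^{\eta-1}\left(\frac{\eta\mu_m}{\eta\mu_m+\lambda}\right)^{j}\right]^{-1} \frac{(\eta\mu_m)^{\eta} e^{-\eta\mu_m t_r}}{(\eta-1)!} \times \sum_{k=0}^{\eta-1}\binom{\eta-1}{k}\, t_r^{\eta-1-k}\, k! \left(\frac{1}{\eta\mu_m+\lambda}\right)^{k} \tag{5}$$

When $t_m$ is exponentially distributed (i.e., $\eta = 1$), (5) is simplified as $f_r(t_r) = \mu_m e^{-\mu_m t_r}$. This result is consistent with the memoryless property for the exponential $t_m$ distribution.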
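The closed form in (5) is easy to mistype, so a small numerical check can be useful. The sketch below evaluates $f_r$ for an $\eta$-Erlang $t_m$ and an exponential $t_p$, then confirms the $\eta = 1$ special case. The function name `residual_density` and its argument order are hypothetical choices made for this illustration.

```python
import math


def residual_density(t, eta, mu_m, lam):
    """Evaluate f_r(t) from (5): eta-Erlang t_m (mean 1/mu_m), exponential t_p (rate lam)."""
    a = eta * mu_m
    # Normalising factor: the bracketed sum over j in (5).
    norm = sum((a / (a + lam)) ** j for j in range(eta))
    # Polynomial factor: the sum over k in (5).
    poly = sum(
        math.comb(eta - 1, k) * t ** (eta - 1 - k) * math.factorial(k) / (a + lam) ** k
        for k in range(eta)
    )
    return a ** eta * math.exp(-a * t) / math.factorial(eta - 1) * poly / norm


if __name__ == "__main__":
    # eta = 1 must reduce to the exponential density mu_m * exp(-mu_m * t).
    print(residual_density(0.7, 1, 2.0, 3.0), 2.0 * math.exp(-1.4))
```

A further check, used in the accompanying test, is that the density integrates to one for $\eta = 2$ when summed with a simple midpoint rule.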

In Scheme II, if the $t_p$ timer is restarted due to the $t_p$ timeout event (i.e., a transition from State 1 to State 0; see $t_2$ in Fig. 2), then the next checkpoint interval is $t_c = \max(t_p, t_r)$. On the other hand, if the $t_p$ timer is restarted due to a registration event (i.e., a transition from State 2 to State 0; see $t_6$ in Fig. 2), then the next checkpoint interval is $t_c = \max(t_m, t_p)$. To distinguish the above two cases, State 0 in Fig. 1 (a) is split into State 01 and State 02.

Fig. 3 draws the modified state diagram for Scheme II. If a checkpoint occurs due to a $t_p$ timeout event, then the FSM moves from State 1 to State 01. If a checkpoint occurs due to a registration event, then the FSM moves from State 2 to State 02. Detailed derivation for $E_{II}[t_c]$ was given in [7]. For the reader's benefit, we summarize the derivation here. Let $\pi_x$ be the stationary probability that the FSM is in State $x$. Let $p_1$ and $p_2$ represent the conditional probabilities that if the FSM is in State 0, it is in State 01 and State 02, respectively. That is,

$$p_1 = \frac{\pi_{01}}{\pi_{01} + \pi_{02}} \quad \text{and} \quad p_2 = \frac{\pi_{02}}{\pi_{01} + \pi_{02}} \tag{6}$$

In Fig. 3, the transition probabilities from State 02 to State 1 and State 2 are $p_a$ and $p_b$, respectively. The transition probabilities from State 01 to State 1 and State 2 are $p_c$ and $p_d$, respectively. In [7], the above probabilities were derived as

$$p_a = f_m^*(\lambda), \quad p_b = 1 - f_m^*(\lambda), \quad p_c = f_r^*(\lambda), \quad p_d = 1 - f_r^*(\lambda) \tag{7}$$

where $f_r^*(\lambda)$ can be derived from (3) using the definition of the Laplace transform. From (7), we can derive $\pi_{01}$ and $\pi_{02}$ as

$$\pi_{01} = \frac{f_m^*(\lambda)}{2[1 + f_m^*(\lambda) - f_r^*(\lambda)]}, \quad \pi_{02} = \frac{1 - f_r^*(\lambda)}{2[1 + f_m^*(\lambda) - f_r^*(\lambda)]} \tag{8}$$

and then from (6) and (8) we have

$$p_1 = \frac{f_m^*(\lambda)}{1 + f_m^*(\lambda) - f_r^*(\lambda)}, \quad p_2 = \frac{1 - f_r^*(\lambda)}{1 + f_m^*(\lambda) - f_r^*(\lambda)} \tag{9}$$

From (1), (3) and (9), the expected checkpoint interval $E_{II}[t_c]$ is derived as

$$
\begin{aligned}
E_{II}[t_c] &= p_1 E[\max(t_p, t_r)] + p_2 E[\max(t_m, t_p)] \\
&= p_1\left[\frac{f_r^*(\lambda)}{\lambda} + E[t_r]\right] + p_2\left[\frac{f_m^*(\lambda)}{\lambda} + \frac{1}{\mu_m}\right]
\end{aligned}
\tag{10}
$$

where $E[t_r] = \int_{t_r=0}^{\infty} t_r f_r(t_r)\,dt_r$.

The accurate $\alpha_{II}$ probability is derived in this paper as follows. We first note that the HLR backup record is obsolete only when the FSM is in State 1. Since a failure is considered as a random observer of the checkpoint intervals, probability $\alpha_{II}$ can be computed as the proportion of time that the FSM is in State 1 in the checkpoint interval. In Fig. 2, the FSM is in State 0 after a checkpoint. There are two possibilities for the first event occurring in the checkpoint interval.

Case I. After the previous checkpoint, the first event is a registration event (with probability $p_1 p_c + p_2 p_a$). In this case, the FSM moves to State 1 (see $t_1$ and $t_7$ in Fig. 2). The FSM stays in State 1 until the $t_p$ timer expires (see $t_2$ and $t_8$ in Fig. 2). From the memoryless property, the periods that the FSM stays in State 1 (i.e., $t_2 - t_1$ and $t_8 - t_7$ in Fig. 2) have the same distribution as $t_p$. Therefore, the expected time that the FSM stays in State 1 is $1/\lambda$.

Case II. After the previous checkpoint, the first event is a $t_p$ timeout event (with probability $p_1 p_d + p_2 p_b$). In this case, the FSM moves to State 2 (see $t_3$ and $t_5$ in Fig. 2). The FSM stays in State 2 until a registration event occurs (see $t_4$ and $t_6$ in Fig. 2). Since the FSM does not visit State 1, the expected time that the FSM stays in State 1 is 0.

Therefore, $\alpha_{II}$ can be expressed as

$$
\begin{aligned}
\alpha_{II} &= \frac{E[\text{time spent in State 1 in the checkpoint interval}]}{E[\text{checkpoint interval}]} \\
&= \frac{(p_1 p_c + p_2 p_a)(1/\lambda) + (p_1 p_d + p_2 p_b) \times 0}{E_{II}[t_c]}
\end{aligned}
\tag{11}
$$

$$\alpha_{II} = \frac{(p_1 p_c + p_2 p_a)(1/\lambda)}{E_{II}[t_c]} \tag{12}$$

In (11), the first term represents the situation when Case I holds and the second term represents the situation when Case II holds. From (7) and (9), (12) is re-written as

$$
\begin{aligned}
\alpha_{II} &= \left\{\frac{f_m^*(\lambda) f_r^*(\lambda)}{1 + f_m^*(\lambda) - f_r^*(\lambda)} + \frac{[1 - f_r^*(\lambda)] f_m^*(\lambda)}{1 + f_m^*(\lambda) - f_r^*(\lambda)}\right\} \times \frac{1}{\lambda E_{II}[t_c]} \\
&= \left[\frac{f_m^*(\lambda)}{1 + f_m^*(\lambda) - f_r^*(\lambda)}\right] \frac{1}{\lambda E_{II}[t_c]}
\end{aligned}
\tag{13}
$$
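Equations (9), (10) and (13) chain together naturally, which the following sketch makes explicit. It takes the two Laplace transform values and the two means as plain numbers, so it does not depend on any particular $t_m$ distribution. The function name `scheme2_measures` and its parameter names are hypothetical.

```python
def scheme2_measures(lam, fm_star, fr_star, mean_tr, mean_tm):
    """Return (p1, p2, E_II[t_c], alpha_II) for Scheme II with exponential t_p.

    lam      -- rate of the exponential t_p timer
    fm_star  -- f_m^*(lam), Laplace transform of the t_m density at lam
    fr_star  -- f_r^*(lam), Laplace transform of the t_r density at lam
    mean_tr  -- E[t_r]
    mean_tm  -- E[t_m] = 1/mu_m
    """
    denom = 1.0 + fm_star - fr_star
    p1 = fm_star / denom                      # (9)
    p2 = (1.0 - fr_star) / denom              # (9)
    e_tc = (p1 * (fr_star / lam + mean_tr)    # (10)
            + p2 * (fm_star / lam + mean_tm))
    alpha = p1 / (lam * e_tc)                 # (13)
    return p1, p2, e_tc, alpha


if __name__ == "__main__":
    # Exponential t_m with mu_m = lam = 1: f_m^* = f_r^* = 1/2 and E[t_r] = 1.
    print(scheme2_measures(1.0, 0.5, 0.5, 1.0, 1.0))
```

In the exponential case with $\mu_m = \lambda = 1$, both maxima in (10) are the maximum of two unit-rate exponentials, whose mean is 1.5, so the sketch should return $E_{II}[t_c] = 1.5$ and $\alpha_{II} = 1/3$.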

B. Modeling of Scheme III

This subsection derives the output measures of Scheme III. From the description of Scheme III in Section II, it is clear that the $\alpha$ performance is the same for Schemes I and III. From (2), we have

$$\alpha_{III} = \frac{\mu_m}{\lambda}\left[1 - f_m^*(\lambda)\right] \tag{14}$$

The output measure $E_{III}[t_c]$ is derived as follows. The probability that the backup record is obsolete when a $t_p$ timeout event occurs is equal to $\alpha_{III}$. When the $t_p$ timer expires, the checkpoint is not performed with probability $1 - \alpha_{III}$. Therefore, the number of $t_p$ timeouts occurring in a checkpoint interval has a geometric distribution. From (1) and (14), $E_{III}[t_c]$ can be expressed as

$$E_{III}[t_c] = \left[\sum_{n=1}^{\infty} n (1 - \alpha_{III})^{n-1} \alpha_{III}\right] E[t_p] = \frac{1}{\mu_m\left[1 - f_m^*(\lambda)\right]} \tag{15}$$

IV. ANALYTIC MODEL FOR SCHEMES I AND III WITH FIXED $t_p$ TIMER

In this section, we present the analytic model for a fixed $t_p$ timer with value $1/\lambda$ in Schemes I and III. It is clear that $E_I[t_c] = 1/\lambda$, which is the same as that for the exponential $t_p$ case. Consider the timing diagram in Fig. 4 (a). Let $\tau_p$ be the interval between the previous $t_p$ timeout and when the failure occurs. Since the failure can be considered as a random observer of the $t_p$ intervals, $\tau_p$ is uniformly distributed over $[0, 1/\lambda]$. Let $\tau_a$ be the interval between when the previous registration occurs and when the failure occurs; in other words, $\tau_a$ is the reverse residual time of $t_m$. According to the reverse residual time theorem [6], $\tau_a$ has the same distribution as the residual time of $t_m$. That is, $\tau_a$ has the density function

$$r_m(\tau_a) = \frac{1 - \int_{t=0}^{\tau_a} f_m(t)\,dt}{E[t_m]} = \mu_m\left[1 - \int_{t=0}^{\tau_a} f_m(t)\,dt\right] \tag{16}$$

In Fig. 4 (a), the HLR backup record is obsolete if $\tau_p > \tau_a$. Therefore, the probability that the HLR backup is obsolete when a failure occurs in Schemes I and III is derived as

$$\alpha_I = \alpha_{III} = \Pr[\tau_p > \tau_a] = \lambda \int_{\tau_p=0}^{1/\lambda} \int_{\tau_a=0}^{\tau_p} r_m(\tau_a)\,d\tau_a\,d\tau_p \tag{17}$$

Substituting (16) into (17), we have

$$\alpha_I = \alpha_{III} = \lambda\mu_m \int_{\tau_p=0}^{1/\lambda} \int_{\tau_a=0}^{\tau_p} \left[1 - \int_{t=0}^{\tau_a} f_m(t)\,dt\right] d\tau_a\,d\tau_p \tag{18}$$

Fig. 4. Timing diagram for schemes I and III (fixed $t_p$): (a) timing diagram for deriving $\alpha_I$ and $\alpha_{III}$; (b) timing diagram for deriving $\beta$.

To compute $E_{III}[t_c]$, we first derive the probability $\beta$ that a checkpoint is not performed when the next $t_p$ timer expires. Consider the timing diagram in Fig. 4 (b), where a random observer falls between two consecutive $t_p$ timeouts. Let $\tau_p^*$ be the interval between the previous $t_p$ timeout and when the random observer arrives. Then $\tau_p^*$ is uniformly distributed over $[0, 1/\lambda]$. Let $t_m$ be the inter-registration interval and $\tau_c$ be the interval between when the previous registration occurs and when the random observer arrives. It is clear that $\tau_c$ has the same distribution as $\tau_a$. If $\tau_c > \tau_p^*$, no registration occurs between the previous $t_p$ timeout and the arrival of the random observer. Note that $t_m > \tau_c$ in this case. On the other hand, if $t_m > \frac{1}{\lambda} + \tau_c - \tau_p^*$, no registration occurs in the period between the arrival of the random observer and the next $t_p$ timeout. Therefore, the probability $\beta$ that no registration occurs between two consecutive $t_p$ timeouts can be derived as

$$
\begin{aligned}
\beta &= \Pr\left[\tau_c > \tau_p^* \text{ and } t_m > \left(\frac{1}{\lambda} + \tau_c - \tau_p^*\right) \,\Big|\, t_m > \tau_c\right] \\
&= \lambda \int_{\tau_p^*=0}^{1/\lambda} \int_{\tau_c=\tau_p^*}^{\infty} r_m(\tau_c) \times \int_{t_m=1/\lambda+\tau_c-\tau_p^*}^{\infty} \frac{f_m(t_m)}{\left[1 - \int_{t=0}^{\tau_c} f_m(t)\,dt\right]}\,dt_m\,d\tau_c\,d\tau_p^*
\end{aligned}
\tag{19}
$$

Substitute (16) into (19) to yield

$$\beta = \lambda\mu_m \int_{\tau_p^*=0}^{1/\lambda} \int_{\tau_c=\tau_p^*}^{\infty} \int_{t_m=1/\lambda+\tau_c-\tau_p^*}^{\infty} f_m(t_m)\,dt_m\,d\tau_c\,d\tau_p^* \tag{20}$$
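As a side remark added for illustration (it is a simplification step not spelled out in the text), the triple integral in (20) collapses to a single integral. The integrand depends on $\tau_c$ and $\tau_p^*$ only through $u = 1/\lambda + \tau_c - \tau_p^*$. Writing $F_m$ for the distribution function of $t_m$, a symbol introduced here only for convenience, the change of variable gives:

```latex
\begin{aligned}
\beta &= \lambda\mu_m \int_{\tau_p^*=0}^{1/\lambda}
         \left\{ \int_{u=1/\lambda}^{\infty} \left[1 - F_m(u)\right] du \right\} d\tau_p^* \\
      &= \mu_m \int_{u=1/\lambda}^{\infty} \left[1 - F_m(u)\right] du ,
\qquad F_m(u) = \int_{t=0}^{u} f_m(t)\,dt .
\end{aligned}
```

For exponential $t_m$ this yields $\beta = e^{-\mu_m/\lambda}$, which is the probability that no Poisson registration falls in an interval of length $1/\lambda$, as one would expect.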

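When the $t_p$ timer expires, the checkpoint is not performed with probability $\beta$. Therefore, the number of $t_p$ timeouts occurring in a checkpoint interval has a geometric distribution. The expected checkpoint interval $E_{III}[t_c]$ can be derived as

$$E_{III}[t_c] = \left[\sum_{n=1}^{\infty} n \beta^{n-1} (1 - \beta)\right] E[t_p] = \frac{1}{(1 - \beta)\lambda} \tag{21}$$

where $\beta$ can be obtained from (20).

For illustration purposes, we assume that $t_m$ has a 2-Erlang distribution with mean $1/\mu_m$ and Laplace transform

$$f_m^*(s) = \left(\frac{2\mu_m}{2\mu_m + s}\right)^2 \tag{22}$$

A more general $t_m$ distribution will be considered by simulation experiments later. From (5), $f_r(t_r)$ is expressed as

$$f_r(t_r) = \left(\frac{2\mu_m + \lambda}{4\mu_m + \lambda}\right)\left(t_r + \frac{1}{2\mu_m + \lambda}\right) 4\mu_m^2 e^{-2\mu_m t_r} \tag{23}$$

From (23) and the definition of the Laplace transform, we have

$$f_r^*(\lambda) = \frac{8\mu_m^2}{(4\mu_m + \lambda)(2\mu_m + \lambda)} \tag{24}$$

For Scheme I, $E_I[t_c]$ is expressed in (2), which is not affected by the $t_m$ distribution. From (2), (14) and (22), the obsolete HLR backup probabilities for Schemes I and III are

$$\alpha_I = \alpha_{III} = \frac{\mu_m(4\mu_m + \lambda)}{(2\mu_m + \lambda)^2} \tag{25}$$

From (9), (10), (22) and (24), the expected checkpoint interval for Scheme II is

$$E_{II}[t_c] = \frac{p_1}{4\mu_m + \lambda}\left[\frac{8\mu_m^2}{\lambda(2\mu_m + \lambda)} + \frac{\lambda}{\mu_m} + 3\right] + (1 - p_1)\left[\frac{1}{\lambda}\left(\frac{2\mu_m}{2\mu_m + \lambda}\right)^2 + \frac{1}{\mu_m}\right] \tag{26}$$

where from (9), (22) and (24),

$$p_1 = \frac{4\mu_m^2(4\mu_m + \lambda)}{(2\mu_m + \lambda)^2(4\mu_m + \lambda) - 4\mu_m^2\lambda} \tag{27}$$

From (13), (22), (24) and (26), the obsolete HLR backup probability for Scheme II is

$$\alpha_{II} = \left[\frac{4\mu_m^2(4\mu_m + \lambda)}{(2\mu_m + \lambda)^2(4\mu_m + \lambda) - 4\mu_m^2\lambda}\right] \frac{1}{\lambda E_{II}[t_c]} \tag{28}$$

From (15) and (22), the checkpoint interval for Scheme III is

$$E_{III}[t_c] = \frac{(2\mu_m + \lambda)^2}{\mu_m\lambda(4\mu_m + \lambda)} \tag{29}$$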
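The closed forms (25)-(29) can be evaluated directly. The sketch below does so for the 2-Erlang case with an exponential $t_p$ timer; the function name `two_erlang_exponential_timer` and the dictionary keys are hypothetical, and the code is a transcription of the equations rather than the authors' program.

```python
def two_erlang_exponential_timer(lam, mu_m):
    """Evaluate (25)-(29): 2-Erlang t_m (mean 1/mu_m), exponential t_p (rate lam)."""
    a = 2.0 * mu_m + lam
    b = 4.0 * mu_m + lam
    alpha_1 = mu_m * b / a ** 2                                   # (25), also alpha_III
    p1 = 4.0 * mu_m ** 2 * b / (a ** 2 * b - 4.0 * mu_m ** 2 * lam)  # (27)
    e_2 = ((p1 / b) * (8.0 * mu_m ** 2 / (lam * a) + lam / mu_m + 3.0)
           + (1.0 - p1) * ((2.0 * mu_m / a) ** 2 / lam + 1.0 / mu_m))  # (26)
    alpha_2 = p1 / (lam * e_2)                                    # (28)
    e_3 = a ** 2 / (mu_m * lam * b)                               # (29)
    return {
        "alpha_I": alpha_1, "alpha_II": alpha_2, "alpha_III": alpha_1,
        "E_I": 1.0 / lam, "E_II": e_2, "E_III": e_3,
    }


if __name__ == "__main__":
    for name, value in two_erlang_exponential_timer(1.0, 1.0).items():
        print(f"{name}: {value:.4f}")
```

With $\lambda = \mu_m = 1$ the values are $\alpha_I = 5/9$, $\alpha_{II} = 20/57$, $E_{II}[t_c] = 57/41$ and $E_{III}[t_c] = 9/5$, which agree with the exponential-timer entries 55.56%, 35.09%, 1.3902 and 1.8000 listed for $\lambda = 1$ in Tables I and II.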

Equations (25)-(29) have been validated against the simulation experiments. The simulation model follows the discrete event approach in [8], and the details are omitted.
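Because the simulation details are omitted, the following is only an independent, minimal event-driven sketch of what such an experiment might look like for Scheme III with an exponential $t_p$ timer. It is not the simulation model of [8] or of the authors. The function name `simulate_scheme3`, its parameters, the finite `horizon`, and the fixed `seed` are all assumptions made for this illustration.

```python
import random


def simulate_scheme3(lam, mu_m, eta=2, horizon=100_000.0, seed=1):
    """Estimate (alpha_III, E_III[t_c]) for Scheme III by event-driven simulation.

    Registrations form a renewal process with eta-Erlang inter-arrival times
    (mean 1/mu_m); t_p timeouts are exponential with rate lam.
    """
    rng = random.Random(seed)

    def next_tm():
        # gammavariate(shape, scale): Erlang with mean eta * 1/(eta*mu_m) = 1/mu_m
        return rng.gammavariate(eta, 1.0 / (eta * mu_m))

    now = 0.0
    next_registration = next_tm()
    next_timeout = rng.expovariate(lam)
    state = 0               # 0: backup valid, 1: backup obsolete
    obsolete_time = 0.0     # total time spent in State 1
    checkpoints = 0

    while now < horizon:
        event_time = min(next_registration, next_timeout)
        if state == 1:
            obsolete_time += event_time - now
        now = event_time
        if next_registration <= next_timeout:
            state = 1                                   # registration event
            next_registration = now + next_tm()
        else:
            if state == 1:                              # timeout with obsolete backup
                checkpoints += 1
                state = 0
            next_timeout = now + rng.expovariate(lam)   # timer restarted either way

    return obsolete_time / now, now / checkpoints


if __name__ == "__main__":
    print(simulate_scheme3(1.0, 1.0))
```

The estimate of $\alpha$ is the fraction of time spent in State 1, and the estimate of $E[t_c]$ is the elapsed time divided by the number of checkpoints. For $\lambda = \mu_m = 1$ and $\eta = 2$, the estimates should approach the analytic values $5/9$ and $1.8$ given by (25) and (29) as the horizon grows.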

V. NUMERICAL EXAMPLES

This section uses numerical examples to investigate the performance of the three checkpoint schemes. Based on the analytic models proposed in Sections III and IV, we compute the output measures $\alpha$ and $E[t_c]$ of the three checkpoint schemes.

Effects of Schemes I, II and III with exponential $t_p$ timer. Fig. 5 plots the output measures $\alpha$ and $E[t_c]$ against $\lambda$.

Fig. 5. Effects of $\lambda$ on $\alpha$ and $E[t_c]$ (exponential $t_p$).

In this figure, the dashed curves are for analytic results. The ◦, • and × points represent simulation data. This figure shows that the analytic and simulation models are consistent (the errors are within 0.3%). Fig. 5 shows the trivial phenomenon that both $\alpha$ and $E[t_c]$ decrease when $\lambda$ increases. The non-trivial result is that Fig. 5 quantitatively indicates how to choose the $\lambda$ value to ensure that the $\alpha$ value is under a predefined threshold set by the mobile operator. For example, if the mobile operator requires that the $\alpha$ value be less than 10%, then we should choose $\lambda = 2.3\mu_m$ in Scheme II and $\lambda = 10\mu_m$ in both Schemes I and III (see the horizontal line in the $\alpha$ part of Fig. 5). We also observe that both Schemes II and III are better than Scheme I in terms of the $\alpha$ and the $E[t_c]$ performance. Scheme II is always better than Scheme III in terms of $\alpha$, and Scheme III is always better than Scheme II in terms of $E[t_c]$.

Effects of exponential and fixed $t_p$. We consider both exponential and fixed $t_p$ timers with the same mean $1/\lambda$. If $t_m$ has a 2-Erlang distribution, then from (18), the probability of an obsolete HLR backup record with a fixed $t_p$ timer is computed as

$$
\begin{aligned}
\alpha_I = \alpha_{III} &= \lambda\mu_m \int_{\tau_p=0}^{1/\lambda} \int_{\tau_a=0}^{\tau_p} \left(e^{-2\mu_m\tau_a} + 2\mu_m\tau_a e^{-2\mu_m\tau_a}\right) d\tau_a\,d\tau_p \\
&= 1 - \frac{3\lambda}{4\mu_m} + \left(\frac{1}{2} + \frac{3\lambda}{4\mu_m}\right) e^{-2\mu_m/\lambda}
\end{aligned}
\tag{30}
$$

From (20) and (21), the expected checkpoint interval with a fixed $t_p$ timer for Scheme III is computed as

$$E_{III}[t_c] = \left\{\lambda\left[1 - \left(1 + \frac{\mu_m}{\lambda}\right) e^{-2\mu_m/\lambda}\right]\right\}^{-1} \tag{31}$$

Both (30) and (31) are validated against the discrete event simulation model, where the errors between the simulation data and the analytic data are within 0.2%. Therefore, our models for fixed $t_p$ are correct.
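The fixed-timer results for the 2-Erlang case can be evaluated in the same way. In the sketch below, the expression used for $\beta$, namely $\beta = (1 + \mu_m/\lambda)e^{-2\mu_m/\lambda}$, is obtained here by evaluating (20) for the 2-Erlang density, an intermediate step that is not written out in the text. The function name `two_erlang_fixed_timer` is hypothetical.

```python
import math


def two_erlang_fixed_timer(lam, mu_m):
    """Return (alpha_I = alpha_III, E_III[t_c]) for a fixed t_p = 1/lam, 2-Erlang t_m."""
    decay = math.exp(-2.0 * mu_m / lam)
    ratio = 3.0 * lam / (4.0 * mu_m)
    alpha = 1.0 - ratio + (0.5 + ratio) * decay     # (30)
    beta = (1.0 + mu_m / lam) * decay               # (20) evaluated for 2-Erlang t_m
    e_3 = 1.0 / (lam * (1.0 - beta))                # (21)
    return alpha, e_3


if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.5):
        alpha, e_3 = two_erlang_fixed_timer(lam, 1.0)
        print(f"lambda={lam}: alpha={100 * alpha:.2f}%  E_III[t_c]={e_3:.4f}")
```

For $\lambda = \mu_m = 1$ this gives $\alpha \approx 0.4192$ and $E_{III}[t_c] \approx 1.3711$, close to the fixed-timer entries of about 41.9% and 1.3712 reported for $\lambda = 1$ in Tables I and II.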

We investigate the effects of exponential and fixed $t_p$ timers by two indicators. Let indicator $I_\alpha$ represent the percentage of $\alpha$ reduced by fixed $t_p$ over exponential $t_p$; that is,

$$I_\alpha = \frac{\alpha(\text{exponential}) - \alpha(\text{fixed})}{\alpha(\text{exponential})} \tag{32}$$

The larger the $I_\alpha$, the better the fixed $t_p$ as compared with the exponential $t_p$. Another indicator $I_c$ represents the percentage of checkpoint interval reduced by fixed $t_p$ over exponential $t_p$. That is, $I_c$ is defined as

$$I_c = \frac{E[t_c](\text{exponential}) - E[t_c](\text{fixed})}{E[t_c](\text{exponential})} \tag{33}$$

The larger the $I_c$, the better the exponential $t_p$ as compared with the fixed $t_p$.

TABLE I
COMPARISON OF EXPONENTIAL $t_p$ TIMER AND FIXED $t_p$ TIMER ON $\alpha$
(λ in units of μ_m; α values in %; Exp. = exponential t_p, Fixed = fixed t_p)

          Scheme I                        Scheme II                       Scheme III
λ         Exp.      Fixed     I_α         Exp.      Fixed     I_α         Exp.      Fixed     I_α
0.1       92.9700   92.5000   0.51%       92.3000   92.4966   -0.21%      92.9700   92.5000   0.51%
0.25      83.9500   81.2500   3.22%       80.4600   81.2412   -0.97%      83.9500   81.2500   3.22%
0.5       72.0000   64.1100   10.96%      61.9400   61.9338   0.01%       72.0000   64.0900   10.99%
0.75      62.8100   51.1200   18.61%      46.6900   43.2917   7.28%       62.8100   51.0900   18.66%
1         55.5600   41.8800   24.62%      35.0900   28.5689   18.58%      55.5600   41.9000   24.59%
2.5       32.1000   19.2100   40.16%      8.1200    3.2954    59.42%      32.1000   19.2300   40.09%
5         18.3700   9.9200    46.00%      1.7000    0.4640    72.71%      18.3700   9.8700    46.27%
7.5       12.7400   6.6300    47.96%      0.6100    0.1431    76.54%      12.7400   6.6400    47.88%
10        9.7200    5.0100    48.46%      0.2800    0.0609    78.25%      9.7200    4.9800    48.77%

TABLE II
COMPARISON OF EXPONENTIAL $t_p$ TIMER AND FIXED $t_p$ TIMER ON $E[t_c]$
(λ in units of μ_m; E[t_c] values in units of 1/μ_m; Exp. = exponential t_p, Fixed = fixed t_p)

          Scheme I                        Scheme II                       Scheme III
λ         Exp.      Fixed     I_c         Exp.      Fixed     I_c         Exp.      Fixed     I_c
0.1       10.0000   10.0000   0.00%       10.0493   10.0000   0.49%       10.7561   10.0000   7.03%
0.25      4.0000    4.0000    0.00%       4.1196    4.0009    2.88%       4.7647    4.0070    15.90%
0.5       2.0000    2.0000    0.00%       2.2249    2.0344    8.56%       2.7778    2.1165    23.81%
0.75      1.3333    1.3333    0.00%       1.6480    1.4501    12.01%      2.1228    1.5924    24.99%
1         1.0000    1.0000    0.00%       1.3902    1.2179    12.39%      1.8000    1.3712    23.82%
2.5       0.4000    0.4000    0.00%       1.0526    1.0111    3.95%       1.2462    1.0781    13.49%
5         0.2000    0.2000    0.00%       1.0076    0.9996    0.80%       1.0889    1.0234    6.02%
7.5       0.1333    0.1333    0.00%       1.0021    1.0004    0.17%       1.0464    1.0097    3.51%
10        0.1000    0.1000    0.00%       1.0008    0.9994    0.14%       1.0286    1.0067    2.13%
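Both indicators in (32) and (33) have the same form, a relative reduction with respect to the exponential-timer value. A minimal sketch follows; the function name `relative_reduction` is hypothetical, and the sample inputs are the $\lambda = \mu_m$ entries of Tables I and II.

```python
def relative_reduction(exponential_value, fixed_value):
    """Common form of I_alpha in (32) and I_c in (33), returned as a fraction."""
    return (exponential_value - fixed_value) / exponential_value


if __name__ == "__main__":
    # Scheme I, lambda = mu_m: alpha is 55.56% (exponential) vs 41.88% (fixed).
    print(f"I_alpha = {100 * relative_reduction(55.56, 41.88):.2f}%")
    # Scheme II, lambda = mu_m: E[t_c] is 1.3902 (exponential) vs 1.2179 (fixed).
    print(f"I_c     = {100 * relative_reduction(1.3902, 1.2179):.2f}%")
```

A negative result, as in the first two Scheme II rows of Table I, means that the exponential timer gives the smaller value.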

Tables I and II list the output measures $\alpha$ and $E[t_c]$ against $\lambda$ for both the exponential $t_p$ timer and the fixed $t_p$ timer. Table I shows that $I_\alpha > 0$ in most cases; that is, the fixed $t_p$ timer often outperforms the exponential $t_p$ timer in terms of $\alpha$. For example, when $\lambda = \mu_m$, the fixed $t_p$ timer outperforms the exponential $t_p$ timer by 24.62% in Scheme I, 18.58% in Scheme II and 24.59% in Scheme III. On the other hand, Table II shows that $I_c > 0$ in most cases; that is, the exponential $t_p$ timer often outperforms the fixed $t_p$ timer in terms of $E[t_c]$. For example, when $\lambda = \mu_m$, the exponential $t_p$ timer outperforms the fixed $t_p$ timer by 12.39% in Scheme II and 23.82% in Scheme III. Note that the expected checkpoint interval $E[t_c]$ for Scheme I is not affected by the $t_p$ distribution.

Effects of variance on $t_m$. We use the discrete event simulation model to investigate the performance of the checkpoint schemes with Gamma distributions. The 2-Erlang $t_m$ distribution is a special case of the Gamma distribution. It has been shown that the distribution of any positive random variable can be approximated by a mixture of Gamma distributions (see Lemma 3.9 in [6]). One may also measure time periods in a real mobile network, and the measured data can be approximated by a Gamma distribution. It suffices to use the Gamma distribution with different shape and scale parameters to represent different $t_m$ distributions. For the exponential $t_p$ timer, Fig. 6 plots the output measures $\alpha$ and $E[t_c]$ against the variance $V_m$ of the Gamma inter-registration times. When $V_m$ is very large (i.e., $V_m = 10^2/\mu_m^2$), both Schemes II and III have similar $\alpha$ performance. In this case, Scheme III outperforms Scheme II in terms of $E[t_c]$. On the other hand, when $V_m$ is very small (i.e., $V_m = 10^{-2}/\mu_m^2$), both Schemes II and III have similar $E[t_c]$ performance. In this case, Scheme II outperforms Scheme III in terms of $\alpha$. Therefore, it is appropriate to select Scheme II when the variance $V_m$ of $t_m$ is large. On the other hand,


Fig. 6. Effects ofVmonα and E[tc] (exponential tp).

VI. CONCLUSIONS

We studied checkpointing and failure restoration of the Home Location Register (HLR) for UMTS and GPRS. With a per-user checkpoint scheme, an HLR record is saved into a backup database from time to time. When a failure occurs, the backup record is restored to the HLR. We first described a commonly used checkpoint scheme (referred to as Scheme I), and then proposed two new checkpoint schemes, Schemes II and III. An analytic model was developed to compare the three schemes. The output measures considered are the probability α that an HLR backup record is obsolete and the expected checkpoint interval E[tc]. The analytic model was validated against simulation experiments. We make the following observations.

• Both Schemes II and III are better than Scheme I in terms of both α and E[tc]. Scheme II is always better than Scheme III in terms of α, and Scheme III is always better than Scheme II in terms of E[tc].

• The fixed tp timer often outperforms the exponential tp timer in terms of α. On the other hand, the exponential tp timer outperforms the fixed tp timer in terms of E[tc].

• When the variance Vm of the inter-registration times is very large, Schemes II and III have similar α performance, and Scheme III outperforms Scheme II in terms of E[tc]. On the other hand, when Vm is very small, Schemes II and III have similar E[tc] performance, and Scheme II outperforms Scheme III in terms of α. Therefore, it is appropriate to select Scheme III when Vm is large, and Scheme II when Vm is small.
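To make the measure α concrete, the following minimal sketch estimates the probability that a backup record is obsolete at the time of a failure. It does not implement Scheme I, II, or III. It assumes a plain fixed-period checkpoint, Poisson registrations, and failures that occur at random instants, so that the time since the last checkpoint is uniform over one period. The names simulate_alpha, alpha_closed_form, reg_rate, and period are hypothetical and do not come from the analytic model above.

```python
import math
import random


def simulate_alpha(reg_rate, period, failures=100_000, seed=1):
    """Monte Carlo estimate of alpha for a plain periodic checkpoint.

    Assumptions (illustrative only): registrations form a Poisson process
    with rate reg_rate, a checkpoint is taken every `period` time units,
    and each failure occurs at a random instant within a period.
    """
    rng = random.Random(seed)
    obsolete = 0
    for _ in range(failures):
        # Time elapsed since the last checkpoint when the failure occurs.
        age = rng.uniform(0.0, period)
        # Time from the last checkpoint to the next registration
        # (memoryless, so it can be drawn fresh at the checkpoint).
        next_registration = rng.expovariate(reg_rate)
        # The backup is obsolete if the record changed before the failure.
        if next_registration < age:
            obsolete += 1
    return obsolete / failures


def alpha_closed_form(reg_rate, period):
    """Closed form under the same assumptions.

    P(no registration within age a) = exp(-reg_rate * a); averaging over
    a uniform on [0, period] gives (1 - exp(-x)) / x with x = reg_rate * period.
    """
    x = reg_rate * period
    return 1.0 - (1.0 - math.exp(-x)) / x


if __name__ == "__main__":
    for period in (0.5, 1.0, 2.0):
        print(period, simulate_alpha(1.0, period), alpha_closed_form(1.0, period))
```

Under these assumptions a shorter period lowers α at the cost of more frequent checkpoints, which is the same trade-off between α and E[tc] that the observations above describe for Schemes I, II, and III.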

As a final remark, we note that failure restoration issues also exist for the mobility databases in the SGSN and the Visitor Location Register. However, the failure restoration solutions for them are very different from those for the HLR. The reader is referred to [5] for the details.

ACKNOWLEDGMENT

This work was sponsored in part by NSC Excellence project NSC 94-2752-E-009-005-PAE, NSC 94-2219-E-009-001, NSC 94-2213-E-009-104, NTP VoIP Project under grant number NSC 94-2219-E-009-002, IIS/Academia Sinica, and ITRI/NCTU Joint Research Center.

REFERENCES

[1] 3GPP, 3rd Generation Partnership Project, Technical Specification Group Core Network; Restoration procedures (Release 6), Technical Specification 3G TS 23.007 version 6.0.0 (2004-03).

[2] 3GPP, 3rd Generation Partnership Project; Technical Specification Group Services and Systems Aspects; General Packet Radio Service (GPRS); Service Description; Stage 2 (Release 6), Technical Specification 3G TS 23.060 version 6.4.0 (2004-03).

[3] Y. Fang, I. Chlamtac, and H. Fei, “Analytical results for optimal choice of location update interval for mobility database failure restoration in PCS networks,” IEEE Trans. Parallel Distrib. Syst., vol. 11, no. 6, pp. 615–624, 2000.

[4] Y. Fang, I. Chlamtac, and H. Fei, “Failure recovery of HLR mobility databases and parameter optimization for PCS networks,” J. Parallel Distrib. Comput., vol. 60, pp. 431–450, 2000.

[5] Z. Haas and Y.-B. Lin, “On optimizing the location update costs in the presence of database failures,” ACM/Baltzer Wireless Networks Journal, vol. 4, no. 5, pp. 419–426, 1998.

[6] F. P. Kelly, Reversibility and Stochastic Networks. John Wiley & Sons, 1979.

[7] Y.-B. Lin, “Per-user checkpointing for mobility database failure restoration,” IEEE Trans. Mobile Comput., vol. 4, no. 2, pp. 189–194, 2005.

[8] Y.-B. Lin and Y.-K. Chen, “Reducing authentication signaling traffic in third generation mobile network,” IEEE Trans. Wireless Commun., vol. 2, no. 3, pp. 493–504, 2003.

[9] Y.-B. Lin and I. Chlamtac, Wireless and Mobile Network Architectures. John Wiley & Sons, 2001.

[10] I. Mitrani, Modeling of Computer and Communication Systems. Cambridge University Press, 1987.

Sok-Ian Sou received the B.S.CSIE and M.S.CSIE degrees from National Chiao Tung University (NCTU), Taiwan, in 1997 and 2004, respectively. She is currently working toward the Ph.D. degree at NCTU. Her current research interests include personal communications services networks, Voice over IP technology, and performance modeling.

Yi-Bing Lin (M’96-SM’96-F’03) is Chair Professor and Vice President of Research and Development, National Chiao Tung University. His current research interests include wireless communications and mobile computing. Dr. Lin has published over 190 journal articles and more than 200 conference papers. He is the co-author of the book Wireless and Mobile Network Architectures (with Imrich Chlamtac; published by John Wiley & Sons). Dr. Lin is an IEEE Fellow, an ACM Fellow, an AAAS Fellow, and an IEE Fellow.