Connection failure detection mechanism of UMTS charging protocol

Hui-Nien Hung, Yi-Bing Lin, Fellow, IEEE, Nan-Fu Peng, and Sok-Ian Sou

Abstract— In Universal Mobile Telecommunications System (UMTS), the extension of the GPRS tunneling protocol called GTP’ is utilized to transfer the Charging Data Records (CDRs) from GPRS Support Nodes (GSNs) to Charging Gateways (CGs). To ensure that the mobile operator receives the charging information, availability of the GTP’ transmission is essential. One important issue for GTP’ availability is connection failure detection. It is desirable to select appropriate parameter values that avoid false failure detections (e.g., those caused by temporary network congestion) while detecting true failures quickly. We propose an analytic model to compute the false failure detection probability and the expected true failure detection time. Based on our study, the network operator can select the appropriate parameter values for various traffic conditions to reduce the probability of false failure detection and/or the true failure detection time.

Index Terms— GPRS Tunneling Protocol extension (GTP’), charging protocol, connection failure detection, Charging Data Record (CDR).

I. INTRODUCTION

Universal Mobile Telecommunications System (UMTS) [1], [7] supports high-speed Packet Switched (PS) data for accessing versatile multimedia services. The PS Core Network is an IP-based backbone network [8]. This core network consists of GPRS Support Nodes (GSNs) such as Serving GSNs (SGSNs) and Gateway GSNs (GGSNs). The Charging Gateway (CG) collects the billing and charging information from the GSNs. The GTP’ protocol [3] is utilized to transfer the Charging Data Records (CDRs) from GSNs to CGs. When a Mobile Station is receiving a UMTS PS service, the CDRs are generated based on the charging characteristics (data volume limit, duration limit and so on) of the subscription information for that service. A CG analyzes and possibly consolidates the CDRs from various GSNs, and passes the consolidated data to a billing system.

A CG maintains a GSN list. An entry in the list represents a GTP’ connection to a GSN. This entry consists of pointers to a CDR database and the sequence numbers of possibly duplicated packets. A GSN maintains a list of CGs in priority order (the priority typically ranges from 1 to 100). If a GSN unexpectedly loses its connection to the current CG, it may send the CDRs to the next CG in the priority list. An entry in the CG list describes parameters for GTP’ transmission. After sending a GTP’ request, a GSN may not receive a response from the CG due to network failure, network congestion or temporary node unavailability. In this case, 3GPP TS 29.060 [2] defines a mechanism for request retry, where the GSN retransmits the message until either a response is received within a timeout period or the retry threshold is reached. In the latter case, the GSN-CG communication link is considered disconnected.

This paper studies the availability issues for GTP’. Specifically, we propose an analytic model to investigate the GTP’ connection failure detection mechanism. Our study provides guidelines for mobile operators to select the parameters for GTP’ connection manipulation.

Manuscript received May 13, 2004; revised December 22, 2004; accepted March 27, 2005. The associate editor coordinating the review of this letter and approving it for publication was Z. Zhang. This work was sponsored in part by NSC Excellence project NSC93-2752-E-0090005-PAE, ITRI/NCTU Joint Research Center, and IIS/Academia Sinica. The work of H.-N. Hung was supported in part by the National Science Council of Taiwan under Grant NSC-92-2118-M-009-013.

H.-N. Hung and N.-F. Peng are with the Institute of Statistics, National Chiao Tung University, Hsinchu 30010, Taiwan, R.O.C. (e-mail: hhung@stat.nctu.edu.tw; nanfu@stat.nctu.edu.tw).

Y.-B. Lin and S.-I. Sou are with the Department of Computer Science and Information Engineering, National Chiao Tung University, Hsinchu 30010, Taiwan, R.O.C. (e-mail: liny@csie.nctu.edu.tw; sisou@csie.nctu.edu.tw).

Digital Object Identifier 10.1109/TWC.2006.05026.

II. GTP’ FAILURE DETECTION MECHANISM

This section describes the Path Failure Detection Algorithm (PFDA) that detects path failure between the GSN and the CG. In a GSN, an entry in the CG list represents a GTP’ connection to a CG. We describe the entry attributes related to PFDA as follows:

• The CG address attribute identifies the CG connected to the GSN.

• The Status attribute indicates if the connection is “active” or “inactive”.

• The Charging Packet Ack Wait Time ($T_r$) is the maximum elapsed time the GSN is allowed to wait for the acknowledgement of a charging packet; typical allowed values range from 1 millisecond to 65 seconds.

• The Maximum Number of Charging Packet Tries ($L$) is the number of attempts (including the first attempt and the retries) the GSN is allowed to make when sending a charging packet; $L$ typically ranges from 1 to 16. When $L = 1$, there is no retry.

• The Maximum Number of Unsuccessful Deliveries ($K$) is the maximum number of consecutive failed deliveries that are attempted before the GSN considers that a connection failure has occurred. Note that a delivery is considered failed (or timed out) if it has been attempted $L$ times without receiving any acknowledgement from the CG.

• The Unsuccessful Delivery Counter ($N_K$) attribute records the number of consecutive failed deliveries.

• The Unacknowledged Buffer stores a copy of each GTP’ message that has been sent to the CG but has not been acknowledged. A record in the unacknowledged buffer consists of an Expiry Timestamp $t_e$, the Charging Packet Try Counter ($N_L$) and an unacknowledged GTP’ message. The expiry timestamp $t_e$ is equal to $T_r$ plus the time when the GTP’ message was sent, which represents the expiry of the message. The counter $N_L$ counts the number of attempts (the first attempt and the retries) that have been performed for this charging packet transmission.

PFDA works as follows:

Step 1. After the connection setup procedure is complete, both $N_L$ and $N_K$ are set to 0, and the Status is set to “active”. At this point, the GSN can send GTP’ messages to the CG.

Step 2. When a GTP’ message is sent from the GSN to the CG at time $t$, a copy of the message is stored in the unacknowledged buffer, where the expiry timestamp is set to $t_e = t + T_r$.

Step 3. If the GSN has received the acknowledgement from the CG before $t_e$, both $N_L$ and $N_K$ are set to 0.

Step 4. If the GSN has not received the acknowledgement from the CG before $t_e$, $N_L$ is incremented by 1. If $N_L = L$, then the charging packet delivery is considered failed, and $N_K$ is incremented by 1.

Step 5. If $N_K = K$, then the GTP’ connection is considered failed. The Status is set to “inactive”.
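The counter updates in Steps 1 to 5 can be sketched as a small state machine. This is only an illustrative sketch, not the implementation of any GSN: the class name `PfdaConnection` and its methods are hypothetical, a single outstanding message is assumed (so one try counter suffices), and resetting the try counter after a failed delivery is an assumption made so that the next message starts from zero tries.

```python
from dataclasses import dataclass


@dataclass
class PfdaConnection:
    """Hypothetical CG-list entry holding the PFDA attributes."""
    L: int                    # Maximum Number of Charging Packet Tries
    K: int                    # Maximum Number of Unsuccessful Deliveries
    n_l: int = 0              # Charging Packet Try Counter (N_L)
    n_k: int = 0              # Unsuccessful Delivery Counter (N_K)
    status: str = "active"    # Step 1: counters are 0 and the Status is active

    def on_ack(self) -> None:
        # Step 3: an acknowledgement before t_e resets both counters.
        self.n_l = 0
        self.n_k = 0

    def on_timeout(self) -> bool:
        """Handle one expired attempt; return True if the message should be retried."""
        # Step 4: no acknowledgement before t_e, so one more try is counted.
        self.n_l += 1
        if self.n_l < self.L:
            return True
        # The delivery has failed after L tries.
        self.n_l = 0          # assumption: the next message starts with zero tries
        self.n_k += 1
        # Step 5: K consecutive failed deliveries mark the connection as failed.
        if self.n_k >= self.K:
            self.status = "inactive"
        return False
```

With `L=2` and `K=2`, four consecutive timeouts without any acknowledgement switch the Status to “inactive”, whereas a single acknowledgement in between resets both counters.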

When Step 5 of PFDA is encountered, it is assumed that the path between the GSN and the CG is no longer available, and the GSN is switched to another CG. However, besides link failure, unacknowledged packet transfers may also be caused by temporary network congestion. In this case, it is not desirable to perform CG switching (which is a very expensive operation). A simple way to avoid this kind of “false” failure detection is to set large values for parameters $T_r$, $L$ and $K$. On the other hand, large parameter values may result in delayed detection of “true” failures. Therefore, it is important to select appropriate parameter values so that true failures can be quickly detected while false failures can be avoided. Based on the GTP’ mechanism described in this section, we derive the probability of false failure detection in Section III, and compute the expected detection time of true failure in Section IV.

III. PROBABILITY OF FALSE FAILURE DETECTION

Let random variable $t_f$ be the lifetime between when the GTP’ connection is established and when a true failure occurs. During this period, undesirable false failures (caused by, e.g., temporary network congestion) may be detected, and the GSN is unnecessarily switched to another CG. Let $\alpha$ be the probability that the PFDA detects a false failure (and therefore the GSN is switched to another CG before a true failure occurs). Suppose that $t_f$ has the density function $f_f(t_f)$. Let the arrivals of charging packets be a Poisson stream with rate $\lambda_c$. Note that the charging packets delivered between a GSN and the CG are generated by all users in this GSN. Each CDR stream of an individual user may have an arbitrary distribution, but the net traffic of all users becomes a Poisson stream [11]. We observe that the charging packets form a Poisson stream when there are more than 20 users. Let the Echo message arrivals be a deterministic stream with the fixed interval $T_e$.

For any reasonable setting, an Echo message should not be issued before the previous one is acknowledged or timed out. Thus, in the CG configuration, we set

$$T_e \ge L T_r \tag{1}$$

Let random variable $N_c(t_f)$ be the number of charging packet arrivals (excluding retries) during the lifetime $t_f$ of the GTP’ connection. Then

$$\Pr[N_c(t_f) = n] = \frac{(\lambda_c t_f)^n}{n!}\, e^{-\lambda_c t_f} \tag{2}$$

Let random variable $N_e(t_f)$ denote the number of Echo message arrivals (excluding retries) during $t_f$. That is,

$$N_e(t_f) = \lfloor t_f/T_e \rfloor \tag{3}$$

Let $N(t_f)$ be the number of GTP’ messages (excluding retries) that the GSN attempts to deliver to the CG during $t_f$. That is, $N(t_f) = N_e(t_f) + N_c(t_f)$. From (3), $N(t_f) = \lfloor t_f/T_e \rfloor + N_c(t_f)$. Therefore, for a given $t_f$, (2) can be re-written as

$$\Pr[N(t_f) = \lfloor t_f/T_e \rfloor + n] = \frac{(\lambda_c t_f)^n}{n!}\, e^{-\lambda_c t_f} \tag{4}$$

Let random variable $t_r$ be the round-trip transmission delay (between the GSN and the CG) for a GTP’ message attempt. We assume that $t_r$ has the distribution function $F_r(t_r)$ and the density function $f_r(t_r)$. From Step 4 of PFDA, a transmission is timed out with probability $\Pr[t_r \ge T_r]$, and a delivery is timed out (after it has been tried $L$ times) with probability $p$, where

$$p = (\Pr[t_r \ge T_r])^L = [1 - F_r(T_r)]^L \tag{5}$$

The GTP’ connection is considered disconnected after $K$ consecutive delivery timeouts, where each of the deliveries fails for $L$ attempts (see Step 5 of PFDA). Since the GTP’ path is connected during $t_f$, a false failure is detected if Step 5 of PFDA is executed when the $j$-th GTP’ message delivery is timed out, where $j \le N(t_f)$. Let $\theta(j)$ denote the probability that such a false failure is detected at the $j$-th delivery. Assume that the delivery results (i.e., a success or a failure) are independent. Based on the relationship between $j$ and $K$, $\theta(j)$ is derived in three cases:

Case I. $0 \le j < K$. It is clear that $\theta(j) = 0$.

Case II. $j = K$. It is clear that $\theta(j) = p^K$.

Case III. $j > K$. In this case, no false failure is detected up to the $(j-K-1)$-th delivery, the $(j-K)$-th delivery is a success, and the last $K$ deliveries are timed out. Therefore, $\theta(j) = \left[1 - \sum_{i=0}^{j-K-1}\theta(i)\right](1-p)\,p^K$.

From (5) and the three cases described above, we have

$$\theta(j) = \begin{cases} 0, & 0 \le j < K \\ p^K, & j = K \\ \left[1 - \sum_{i=0}^{j-K-1}\theta(i)\right](1-p)\,p^K, & j > K \end{cases} \tag{6}$$

For $K = 1$ and $j \ge 1$, (6) is simplified as $\theta(j) = (1-p)^{j-1}p$. In this case, $\theta(j)$ becomes a geometric distribution. Let $\bar{\theta}(j)$ be the probability that no false failure is detected before (and including) the $j$-th GTP’ message delivery. Then

$$\bar{\theta}(j) = 1 - \sum_{i=0}^{j}\theta(i) \tag{7}$$

From (4) and (7), the probability $\alpha$ of false failure detection is

$$\alpha = 1 - \int_{t_f=0}^{\infty}\sum_{n=0}^{\infty}\bar{\theta}\left(\lfloor t_f/T_e \rfloor + n\right)\Pr[N(t_f) = \lfloor t_f/T_e \rfloor + n]\, f_f(t_f)\, dt_f \tag{8}$$
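The recursion in (6) and the complement in (7) are straightforward to evaluate numerically. The following is a minimal sketch; the function names `false_detection_pmf` and `no_false_detection_prob` are hypothetical, and $K \ge 1$ is assumed.

```python
def false_detection_pmf(p, K, j_max):
    """Return [theta(0), ..., theta(j_max)] following Eq. (6)."""
    theta = [0.0] * (j_max + 1)
    prefix = 0.0  # running value of sum_{i=0}^{j-K-1} theta(i)
    for j in range(j_max + 1):
        if j < K:
            theta[j] = 0.0                      # Case I
        elif j == K:
            theta[j] = p ** K                   # Case II
        else:
            prefix += theta[j - K - 1]          # extend the sum by one term
            theta[j] = (1.0 - prefix) * (1.0 - p) * p ** K  # Case III
    return theta


def no_false_detection_prob(theta):
    """Return [bar_theta(0), ..., bar_theta(j_max)] following Eq. (7)."""
    out, acc = [], 0.0
    for value in theta:
        acc += value
        out.append(1.0 - acc)
    return out
```

For $K = 1$ the output reproduces the geometric form $(1-p)^{j-1}p$ noted in the text, which gives a quick sanity check of the recursion.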

The derivation for (8) can be extended by assuming that the lifetime $t_f$ has an exponential distribution with rate $\lambda_f$. The exponential distribution is chosen because it has often been used in reliability and lifetime modeling [10]. We note that our result can be generalized to $t_f$ with a mixed-Erlang distribution through a tedious but routine derivation. Eq. (8) is re-written as

$$\alpha = 1 - \lambda_f\sum_{k=0}^{\infty}\sum_{n=0}^{\infty}\bar{\theta}(k+n)\,\frac{\lambda_c^{\,n}}{(\lambda_c+\lambda_f)^{n+1}} \times \sum_{j=0}^{n}\frac{e^{-(\lambda_c+\lambda_f)kT_e}\,[(\lambda_c+\lambda_f)T_e]^j}{j!} \times \left[k^j - e^{-(\lambda_c+\lambda_f)T_e}(k+1)^j\right] \tag{9}$$
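As a cross-check on the closed form, $\alpha = 1 - E[\bar{\theta}(N(t_f))]$ from (8) can also be estimated by sampling. The sketch below is an assumption-laden illustration rather than the simulation used in this paper: the function names are hypothetical, $t_f$ is taken to be exponential with rate $\lambda_f$, and the charging packet arrivals are generated as a Poisson stream with rate $\lambda_c$.

```python
import math
import random


def theta_bar_table(p, K, j_max):
    """bar_theta(j) for j = 0..j_max, from Eqs. (6) and (7)."""
    theta = [0.0] * (j_max + 1)
    prefix, acc, bar = 0.0, 0.0, []
    for j in range(j_max + 1):
        if j == K:
            theta[j] = p ** K
        elif j > K:
            prefix += theta[j - K - 1]
            theta[j] = (1.0 - prefix) * (1.0 - p) * p ** K
        acc += theta[j]
        bar.append(1.0 - acc)
    return bar


def estimate_alpha(p, K, lam_c, lam_f, T_e, runs=20000, seed=1):
    """Monte Carlo estimate of Eq. (8) with an exponential lifetime t_f."""
    rng = random.Random(seed)
    samples = []
    for _ in range(runs):
        t_f = rng.expovariate(lam_f)           # connection lifetime
        n_c = 0                                # charging packets arriving before t_f
        t = rng.expovariate(lam_c)
        while t < t_f:
            n_c += 1
            t += rng.expovariate(lam_c)
        samples.append(math.floor(t_f / T_e) + n_c)   # N(t_f), Eq. (3) plus N_c
    bar = theta_bar_table(p, K, max(samples))
    return 1.0 - sum(bar[n] for n in samples) / runs
```

Because the same seed yields the same sampled lifetimes, two calls that differ only in `p` isolate the effect of the delivery timeout probability on the estimate.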

IV. EXPECTED TRUE FAILURE DETECTION TIME

This section derives the expected detection time of a “true” failure. Consider the timing diagram in Fig. 1(a), where a failure occurs at time $t_f$ and is detected at time $t_d$. The detection time for the failure is $\tau_d = t_d - t_f$. Let random variable $N_K(t)$ represent the $N_K$ value at time $t$. If $N_K(t_f) = K - n$ (for $0 < n \le K$), then the GTP’ connection failure is detected when $n$ more GTP’ message deliveries are timed out. Consider a GTP’ message sent from the GSN to the CG. The GSN either receives an acknowledgement from the CG or the delivery (i.e., the $L$-th transmission for this message) is timed out at time $t^*$. This time $t^*$ is denoted as the departure time of the GTP’ message delivery. For $1 \le i \le n$, let $t_{d,i}$ be the departure time of the $i$-th failed GTP’ message delivery after $t_f$. Note that $t_d = t_{d,n}$. In Fig. 1(b), the arrival times $t_{a,i}$ (for $1 \le i \le n$) correspond to the GTP’ message deliveries with the departure times $t_{d,i}$ in Fig. 1(a). It is apparent that $t_{a,i} = t_{d,i} - LT_r$. Note that these arrivals may occur before or after $t_f$. In Fig. 1(b), the first $j'$ deliveries arrive before $t_f$.

If

$$t_{a,n} > t_f \tag{10}$$

then the true failure detection time $\tau_d$ is

$$\tau_d = t_{d,n} - t_f = t_{a,n} + LT_r - t_f \tag{11}$$

In this section, we compute the probability that $N_K(t_f) = K - n$ (for $0 < n \le K$). This probability is used to derive $E[\tau_d|t_{a,n} > t_f]$. Then $E[\tau_d]$ is computed from $E[\tau_d|t_{a,n} > t_f]$ derived in the following subsections and $E[\tau_d|t_{a,n} \le t_f]$ derived in [12].

A. Derivation for the $N_K(t_f)$ distribution

We first compute $\Pr[N_K(t_f) = 0]$. Then we use this result to derive $\Pr[N_K(t_f) = j]$ (for $1 \le j \le K-1$). It is clear that $t_f$ lies between two consecutive Echo message arrivals. Suppose that these two Echo messages arrive at times $t_0$ and $t_0 + T_e$, respectively (see Fig. 2). Since $t_f$ is a random observer, it is uniformly distributed over $[t_0, t_0+T_e)$. Let random variable $N_{K\to\infty}(t)$ be the $N_K$ value at time $t$ when $K \to \infty$. In the interval $[t_0, t_0+T_e)$, $\{N_{K\to\infty}(t);\ t \in [t_0, t_0+T_e)\}$ is a continuous-time, discrete-state stochastic process (the state space is $0, 1, 2, \ldots$). There exists $j$ such that the interval $[t_0, t_0+T_e)$ consists of $j$ alternating periods $(x_i, y_i)$, $1 \le i \le j$, where

$$N_{K\to\infty}(t)\ \begin{cases} = 0, & \text{for } t \text{ in one of the } x_i \text{ periods} \\ > 0, & \text{for } t \text{ in one of the } y_i \text{ periods} \end{cases}$$

If $N_{K\to\infty}(t_0) > 0$, then $x_1 = 0$. Similarly, if $N_{K\to\infty}(t_0+T_e) = 0$, then $y_j = 0$. Let $X = \sum_{i=1}^{j} x_i$ and $Y = \sum_{i=1}^{j} y_i$. Then

$$\Pr[N_{K\to\infty}(t) = 0] = \frac{E[X]}{E[X]+E[Y]} = \frac{E[X]}{T_e} \tag{12}$$

From (12), $\Pr[N_{K\to\infty}(t) = j]$ (for $j > 0$) is expressed as

$$\Pr[N_{K\to\infty}(t) = j] = (1-p)\,p^{j-1}\left(1 - \frac{E[X]}{T_e}\right) \tag{13}$$

In (13), the last GTP’ message arrival before $t$ is timed out with probability $(1 - E[X]/T_e)$, and the probability that there are exactly $j-1$ delivery timeouts before this last GTP’ message delivery is $(1-p)p^{j-1}$. Suppose that no false failure is detected before $t_f$. Under this condition, $N_K(t_f)$ ranges from 0 to $K-1$. From (12) and (13), we have

$$\Pr[N_K(t_f) = j] = \begin{cases} \dfrac{E[X]}{T_e - p^{K-1}(T_e - E[X])}, & j = 0 \\[2ex] \dfrac{(1-p)\,p^{j-1}(T_e - E[X])}{T_e - p^{K-1}(T_e - E[X])}, & 0 < j < K \end{cases} \tag{14}$$

In (14), $E[X]$ is derived as follows. Let $t_l$ ($0 < t_l \le LT_r$) be the delivery delay for a GTP’ message delivery (including retries). In Fig. 2, $k > 0$ departures occur in $[t_0, t_0+T_e)$, where the $i$-th departure occurs at $t_i$ (for $1 \le i \le k$). Let $t_{k+1} = t_0 + T_e$ be the arrival time of the next Echo message. According to (1), the departure of the previous Echo message must occur in $(t_0, t_0+T_e)$. Suppose that this departure is the $j$-th departure where $j \le k$. By considering whether the previous Echo message delivery fails or succeeds, we express $E[X]$ as

$$E[X] = E[X|t_l = LT_r]\Pr[t_l = LT_r] + E[X|t_l < LT_r]\Pr[t_l < LT_r] \tag{15}$$
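Once a value of $E[X]$ is available, the distribution in (14) is a direct computation. The sketch below treats $E[X]$ as a given input (its derivation follows in the text); the function name `nk_distribution` and the numerical values in the usage note are illustrative assumptions only.

```python
def nk_distribution(E_X, T_e, p, K):
    """Return [Pr[N_K(t_f) = j] for j = 0..K-1] following Eq. (14)."""
    # Common denominator: T_e - p^(K-1) (T_e - E[X]).
    denom = T_e - p ** (K - 1) * (T_e - E_X)
    probs = [E_X / denom]                                   # j = 0
    for j in range(1, K):
        probs.append((1.0 - p) * p ** (j - 1) * (T_e - E_X) / denom)  # 0 < j < K
    return probs
```

For example, `nk_distribution(12.0, 18.0, 0.2, 6)` returns six probabilities that sum to 1, which confirms that the normalizing denominator in (14) accounts for conditioning on no false failure being detected before $t_f$.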

$E[X|t_l = LT_r]$ is derived as follows. When $t_l = LT_r$, the previous Echo message delivery fails. That is, $t_j = t_0 + LT_r$ and $N_{K\to\infty}(t_j) > 0$. Let $z_i = t_{i+1} - t_i$ for $0 \le i \le k$. Since the $N_K$ value is only changed at times when departures occur, $z_i$ contributes to $E[X|t_l = LT_r]$ if $N_{K\to\infty}(t_i) = 0$. Let $C = \Pr[N_{K\to\infty}(t_0) = 0]$. For $j \le k$, we have

$$E[X|t_l = LT_r] = (1-p)\left\{E\left[\sum_{i=1}^{j-1} z_i\right] + E\left[\sum_{i=j+1}^{k} z_i\right]\right\} + C\,E[z_0] \tag{16}$$

Since $\sum_{i=1}^{j-1} z_i = LT_r - z_0$ and $\sum_{i=j+1}^{k} z_i = T_e - LT_r - z_j$, (16) is re-written as

$$E[X|t_l = LT_r] = (1-p)(T_e - E[z_j]) + (C+p-1)\,E[z_0] \tag{17}$$

In (17), $C = \Pr[N_{K\to\infty}(t_0) = 0]$ is derived in [12].

Fig. 1. Timing Diagram for Detecting True Failure ($n \le K$). (a) Departures after a true failure. (b) Arrivals corresponding to the departures in (a) where $t_{a,n} > t_f$.

Fig. 2. Timing Diagram for Deriving $E[X]$.

$E[z_0]$ is derived as follows. If the first charging packet departure occurs before $t_0 + LT_r$, then $z_0$ is exponentially distributed under the condition that $z_0 < LT_r$. That is,

$$E[z_0|z_0 < LT_r]\Pr[z_0 < LT_r] = \frac{1}{\lambda_c}\left(1 - e^{-\lambda_c LT_r}\right) - LT_r\,e^{-\lambda_c LT_r} \tag{18}$$

If the first charging packet departure occurs after $t_0 + LT_r$, then $z_0 = LT_r$. In this case,

$$E[z_0|z_0 = LT_r]\Pr[z_0 = LT_r] = LT_r\,e^{-\lambda_c LT_r} \tag{19}$$

Combining (18) and (19) yields

$$E[z_0] = \frac{1}{\lambda_c}\left(1 - e^{-\lambda_c LT_r}\right) \tag{20}$$

Following a similar derivation, $E[z_j]$ can be expressed as

$$E[z_j] = \frac{1}{\lambda_c}\left[1 - e^{-\lambda_c(T_e - LT_r)}\right] \tag{21}$$
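Eqs. (20) and (21) both have the form of the mean of an exponential interval truncated at a cap ($LT_r$ for $z_0$, $T_e - LT_r$ for $z_j$). The sketch below compares that closed form with a numerical integration of the exponential survival function over the same range; the function names and the parameter values used in the usage note are assumptions chosen for illustration.

```python
import math


def truncated_exp_mean(lam_c, cap):
    """Closed form shared by Eqs. (20) and (21): (1 - exp(-lam_c * cap)) / lam_c."""
    return (1.0 - math.exp(-lam_c * cap)) / lam_c


def truncated_exp_mean_numeric(lam_c, cap, steps=20000):
    """Midpoint-rule integral of exp(-lam_c * t) over [0, cap] as a cross-check."""
    h = cap / steps
    return h * sum(math.exp(-lam_c * (i + 0.5) * h) for i in range(steps))


def z0_parts(lam_c, L, T_r):
    """Return the two contributions in Eqs. (18) and (19)."""
    cap = L * T_r
    part_18 = (1.0 - math.exp(-lam_c * cap)) / lam_c - cap * math.exp(-lam_c * cap)
    part_19 = cap * math.exp(-lam_c * cap)
    return part_18, part_19
```

With, say, `lam_c = 1/18`, `L = 3`, `T_r = 2` and `T_e = 18`, `truncated_exp_mean(lam_c, L * T_r)` gives $E[z_0]$ and `truncated_exp_mean(lam_c, T_e - L * T_r)` gives $E[z_j]$; the two parts returned by `z0_parts` add up to the former.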

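From (17), (20) and (21), we have

$$E[X|t_l = LT_r]\Pr[t_l = LT_r] = p\left\{(1-p)T_e + \frac{1}{\lambda_c}\left[(C+p-1)\left(1 - e^{-\lambda_c LT_r}\right) - (1-p)\left(1 - e^{-\lambda_c(T_e - LT_r)}\right)\right]\right\} \tag{22}$$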

$E[X|t_l < LT_r]$ is derived as follows. When $0 < t_l < LT_r$, the previous Echo message delivery succeeds. That is, $t_j = t_0 + t_l < t_0 + LT_r$ and $N_{K\to\infty}(t_j) = 0$. Let $z_i(t_l)$ be the $z_i$ value for a specific $t_l < LT_r$. Then for $t_l < LT_r$,

$$E[X|t_l] = (1-p)\left\{E\left[\sum_{i=1}^{j-1} z_i(t_l)\right] + E\left[\sum_{i=j+1}^{k} z_i(t_l)\right]\right\} + C\,E[z_0(t_l)] + E[z_j(t_l)] \tag{23}$$

Following a derivation similar to that for (22), for $t_l < LT_r$,

$$E[X|t_l] = \frac{1}{\lambda_c}\left[(C+2p-1) - (C+p-1)\,e^{-\lambda_c t_l} - p\,e^{-\lambda_c T_e}e^{\lambda_c t_l}\right] + (1-p)T_e \tag{24}$$

Suppose that $t_l$ has the density function $f_l(t_l)$ and the distribution function $F_l(t_l)$. If the previous Echo message is successfully delivered, the delivery delay is $0 < t_l < LT_r$ with probability $f_l(t_l)\,dt_l$. Therefore,

$$E[X|t_l < LT_r]\Pr[t_l < LT_r] = \int_{t_l=0}^{LT_r} E[X|t_l]\, f_l(t_l)\, dt_l \tag{25}$$

where $E[X|t_l]$ is expressed in (24), and $f_l(t_l)$ is derived in [12]. Then $E[X]$ can be obtained from (15), (22) and (25). Finally, $\Pr[N_K(t_f) = j]$ can be computed by using (14) and (15).

B. Derivation for E[τd]

For $t_{a,n} > t_f$, let $m > 0$ denote the number of failed GTP’ message arrivals occurring after $t_f$. Note that $m$ is not necessarily equal to $K - N_K(t_f)$ because some GTP’ message arrivals may occur before $t_f$ and be timed out after $t_f$. Such messages are denoted as cross messages (“cross” means that the delivery delay “crosses” the time point $t_f$). Therefore, the departures of cross messages are not accurately counted in $N_K(t_f)$. Fortunately, we know that these departures must occur by $t_f + LT_r$, and therefore $m = K - N_K(t_f + LT_r)$. $N_K(t_f + LT_r)$ can be derived from $N_K(t_f)$ as follows. Let $n_c$ and $n_e$ denote the numbers of cross charging packets and cross Echo messages, respectively (in Fig. 1(b), $j' = n_c + n_e$). It can be observed that

$$N_K(t_f + LT_r) = \min\{N_K(t_f) + n_c + n_e,\ K\} \tag{26}$$

Note that when $m = K - N_K(t_f + LT_r) = 0$, we have $t_{a,n} \le t_f$. In this special case, $E[\tau_d|m = 0]$ is derived in [12]. Now assume that $m > 0$. Since the deliveries of charging packets can be modeled by the $M/G/\infty$ system and $t_f$ is a random observer of the system, $n_c$ can be represented by a Poisson random variable with parameter $\rho$ (see Chapter 2.4 in [9]), where

$$\rho = \lambda_c\int_{t_l=0}^{LT_r}[1 - F_L(t_l)]\, dt_l \tag{27}$$

and the probability mass function of $n_c$ is given by

$$\Pr[n_c = i] = \frac{\rho^i}{i!}\, e^{-\rho} \tag{28}$$
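Eqs. (27) and (28) can be evaluated numerically once a delivery-delay distribution is supplied. In the sketch below the distribution function $F_L$ is passed in as a callable because its actual form is derived in [12] and is not reproduced here; the exponential CDF used in the usage note is purely a stand-in assumption, and the function names are hypothetical.

```python
import math


def cross_packet_load(lam_c, L, T_r, F_L, steps=10000):
    """rho of Eq. (27): lam_c times the integral of 1 - F_L over [0, L*T_r] (midpoint rule)."""
    h = L * T_r / steps
    integral = h * sum(1.0 - F_L((i + 0.5) * h) for i in range(steps))
    return lam_c * integral


def cross_packet_pmf(rho, i):
    """Pr[n_c = i] of Eq. (28): Poisson probability mass with parameter rho."""
    return rho ** i / math.factorial(i) * math.exp(-rho)
```

For instance, with the stand-in `F_L = lambda t: 1 - math.exp(-t)`, `cross_packet_load(0.5, 2, 1.5, F_L)` is close to $0.5\,(1 - e^{-3})$, and the values `cross_packet_pmf(rho, i)` sum to 1 over $i$.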

In Fig. 1(b), let $t_{a,j}$ (for $n_c + n_e < j$) be the arrival time of the first Echo message occurring after $t_f$, and $\tau_0 = t_{a,j} - t_f$. Let $\Pr[n_e = 1|\tau_0]$ be the probability that $n_e = 1$ for a specific $\tau_0$. Then $\Pr[n_e = 1|\tau_0]$ can be expressed as

$$\Pr[n_e = 1|\tau_0] = \begin{cases} 0, & \tau_0 \le T_e - LT_r \\ 1 - F_L(T_e - \tau_0), & \tau_0 > T_e - LT_r \end{cases} \tag{29}$$

where $F_L(t)$ is derived in [12]. In (29), when $\tau_0 \le T_e - LT_r$, there is no undelivered Echo message before $t_f$. When $\tau_0 > T_e - LT_r$, an Echo message arrival occurs in period $[t_f - LT_r, t_f)$. This Echo message is a cross message (i.e., its delivery is not completed before $t_f$) with probability $\Pr[n_e = 1|\tau_0] = 1 - F_L(T_e - \tau_0)$. From (28) and (29), $\Pr[n_c + n_e = j'|\tau_0]$ can be expressed as

$$\Pr[n_c + n_e = j'|\tau_0] = \begin{cases} e^{-\rho}\,(1 - \Pr[n_e = 1|\tau_0]), & j' = 0 \\[1ex] e^{-\rho}\left\{\dfrac{\rho^{j'-1}}{(j'-1)!}\Pr[n_e = 1|\tau_0] + \dfrac{\rho^{j'}}{j'!}\,(1 - \Pr[n_e = 1|\tau_0])\right\}, & j' > 0 \end{cases} \tag{30}$$

Therefore, for $i \le j < K$, $\Pr[N_K(t_f + LT_r) = j|\tau_0]$ can be computed from $\Pr[N_K(t_f) = i]$ and (30) as

$$\Pr[N_K(t_f + LT_r) = j|\tau_0] = \sum_{i=0}^{j}\Pr[N_K(t_f) = i]\,\Pr[n_c + n_e = j - i|\tau_0] \tag{31}$$

For $m > 0$, let $\tau(m) = t_{a,n} - t_f$ (see Fig. 1(b)). $E[\tau(m)]$ is derived as follows. Let $m_c$ and $m_e$ denote the numbers of charging packet arrivals and Echo message arrivals occurring in period $\tau(m)$. That is, $m = m_c + m_e = n - (n_c + n_e) > 0$. We have

$$m_e = \lfloor(\tau(m) - \tau_0)/T_e\rfloor + 1 \tag{32}$$

If $\tau_0 > \tau(m)$, then $m_e = 0$. Let $\tau_e$ be the interval between $t_f$ and the arrival time of the $m_e$-th Echo message after $t_f$. By convention, $\tau_e = 0$ for $m_e = 0$. Let $\tau_c$ be the interval between $t_f$ and the arrival time of the $m_c$-th charging packet after $t_f$. Then $\tau(m) = \max\{\tau_c, \tau_e\}$. Note that $m_e$ is determined by $\tau(m)$ and $\tau_0$ (see (32)), and therefore $\tau_e$ and $\tau_c$ are dependent on each other. Since the arrivals of charging packets are a Poisson stream, $\tau_c$ has the Erlang distribution with mean $m_c/\lambda_c$ and shape parameter $m_c$. For $m > 0$, the distribution function $F_c(\tau_c)$ of $\tau_c$ is

$$F_c(\tau_c) = 1 - \sum_{i=0}^{m_c-1}\frac{(\lambda_c\tau_c)^i}{i!}\, e^{-\lambda_c\tau_c} \tag{33}$$
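The Erlang distribution function in (33) can be evaluated with a short loop that builds the partial Poisson sum term by term. This is a minimal sketch; the function name `erlang_cdf` is an assumption.

```python
import math


def erlang_cdf(tau_c, m_c, lam_c):
    """F_c(tau_c) of Eq. (33): Erlang CDF with shape m_c and rate lam_c."""
    if tau_c <= 0.0:
        return 0.0
    x = lam_c * tau_c
    term, total = 1.0, 0.0          # term holds x^i / i!, starting at i = 0
    for i in range(m_c):
        total += term
        term *= x / (i + 1)         # advance to x^(i+1) / (i+1)!
    return 1.0 - math.exp(-x) * total
```

With `m_c = 1` the function reduces to the exponential distribution function $1 - e^{-\lambda_c\tau_c}$, as expected for the first arrival of a Poisson stream.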

For $m > 0$, let $F_m(\tau(m))$ be the distribution function of $\tau(m)$. From (32) and (33), we have

$$F_m(\tau(m)|\tau_0) = F_c(\tau(m)|\tau_0) = 1 - \sum_{i=0}^{m - \lfloor(\tau(m)-\tau_0)/T_e\rfloor - 2}\frac{[\lambda_c\tau(m)]^i}{i!}\, e^{-\lambda_c\tau(m)} \tag{34}$$

Note that $F_m(\tau(m)|\tau_0)$ is discontinuous at points $\tau(m) = \tau_0 + jT_e$, for $j = 0, 1, \ldots, m_e - 1$. From (34) we have

$$\Pr[\tau(m) = \tau_0 + jT_e|\tau_0] = F_m(\tau_0 + jT_e|\tau_0) - F_m\left((\tau_0 + jT_e)^-|\tau_0\right) = \frac{[\lambda_c(\tau_0 + jT_e)]^{m-j-1}}{(m-j-1)!}\, e^{-\lambda_c(\tau_0 + jT_e)} \tag{35}$$

Fig. 3. Effects of $T_r$, $L$ and $\lambda_f$ on $\alpha$ ($\lambda_c = \mu/18$). (a) Effect of $L$. (b) Effect of $\lambda_f$.

Eq. (35) says that the $m$-th GTP’ message arrival is the $(j+1)$-th Echo message, and there are $m-j-1$ charging packets occurring in period $\tau(m)$, whose arrivals form a Poisson stream with rate $\lambda_c$. For a given $\tau_0$ and $m > 0$, the expected value of $\tau(m)$ is

$$E[\tau(m)|\tau_0] = \int_{\tau(m)=0}^{\infty}[1 - F_m(\tau(m)|\tau_0)]\, d\tau(m) = \frac{1}{\lambda_c}\sum_{i=0}^{m-1}\left(1 - e^{-\lambda_c[\tau_0 + (m-i-1)T_e]}\sum_{j=0}^{i}\frac{\{\lambda_c[\tau_0 + (m-i-1)T_e]\}^j}{j!}\right) \tag{36}$$
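The closed form in (36) is a finite double sum and can be evaluated directly. The sketch below is illustrative; the function name `expected_tau_m` and the parameter values in the usage note are assumptions.

```python
import math


def expected_tau_m(m, tau_0, lam_c, T_e):
    """E[tau(m) | tau_0] following Eq. (36), for m > 0."""
    total = 0.0
    for i in range(m):
        # Time by which the (m - i)-th Echo message after t_f has arrived.
        horizon = tau_0 + (m - i - 1) * T_e
        x = lam_c * horizon
        partial = sum(x ** j / math.factorial(j) for j in range(i + 1))
        total += 1.0 - math.exp(-x) * partial
    return total / lam_c
```

For `m = 1` the result is $(1 - e^{-\lambda_c\tau_0})/\lambda_c$, the mean of the earlier of the first charging packet arrival and the first Echo arrival at $\tau_0$. For larger `m`, the value is bounded by both $m/\lambda_c$ and $\tau_0 + (m-1)T_e$, since the $m$-th arrival can come no later than the $m$-th Echo message.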

Since $t_f$ is a random observer of the inter-Echo arrival times, $\tau_0$ is uniformly distributed over $(0, T_e]$. From (11), (31) and (36), the expected value $E[\tau_d]$ is expressed as

$$E[\tau_d] = E[\tau_d|m > 0]\Pr[m > 0] + E[\tau_d|m = 0]\Pr[m = 0]$$

$$= \frac{1}{T_e}\sum_{m=1}^{K}\int_{\tau_0=0}^{T_e}\left(E[\tau(m)|\tau_0] + LT_r\right)\Pr[N_K(t_f + LT_r) = K - m|\tau_0]\, d\tau_0 + E[\tau_d|m = 0]\Pr[m = 0] \tag{37}$$

where $E[\tau_d|m = 0]$ and $\Pr[m = 0]$ are derived in [12].

The analytic model developed in this paper is validated against simulation experiments. The discrepancies between the analytic results (specifically, Eqs. (9) and (37)) and the simulation results are within 3% in most cases. The simulation technique used in this paper is similar to the one described in [6], and the details are omitted.

V. NUMERICAL EXAMPLES

Based on the analytic model developed in the previous section, we show how $K$, $L$ and $T_r$ affect the probability $\alpha$ of false failure detection and the expected time $E[\tau_d]$ of true failure detection. We assume that the round-trip transmission delay $t_r$ between a GSN and a CG has a hyper-Erlang distribution with the distribution function

$$F_r(t_r) = 1 - \sum_{i=1}^{M}\beta_i\left\{\sum_{j=0}^{m_i-1}\frac{(m_i\mu_i t_r)^j}{j!}\, e^{-m_i\mu_i t_r}\right\} \tag{38}$$

where $M, m_1, m_2, \ldots, m_M$ are nonnegative integers, $\mu_i > 0$, $\beta_i > 0$, and $\sum_{i=1}^{M}\beta_i = 1$. The hyper-Erlang distribution is selected because it has been proven to be a good approximation to many distributions as well as measured data [4], [5]. From (5) and (38),

$$p = \left\{\sum_{i=1}^{M}\beta_i\left\{\sum_{j=0}^{m_i-1}\frac{(m_i\mu_i T_r)^j}{j!}\, e^{-m_i\mu_i T_r}\right\}\right\}^{L} \tag{39}$$

In our study, the input parameters $\lambda_c$, $\lambda_f$, $T_r$ and the output measure $E[\tau_d]$ are normalized by the mean $1/\mu$ of the round-trip transmission delay. For purposes of demonstration, we consider $t_r$ with a 2-Erlang distribution and $KL = 6$. The Echo message arrivals form a deterministic stream with fixed interval $T_e = 18/\mu$.
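For concreteness, (38) and (39) can be evaluated as follows. This is a sketch under the stated model only: the function names are hypothetical, the three lists hold $\beta_i$, $m_i$ and $\mu_i$, and time is measured in units of $1/\mu$ so that the 2-Erlang case used in this section corresponds to `betas=[1.0]`, `shapes=[2]`, `mus=[1.0]`.

```python
import math


def hyper_erlang_cdf(t, betas, shapes, mus):
    """F_r(t) of Eq. (38) for a hyper-Erlang round-trip delay."""
    tail = 0.0
    for beta, m, mu in zip(betas, shapes, mus):
        x = m * mu * t
        tail += beta * math.exp(-x) * sum(x ** j / math.factorial(j) for j in range(m))
    return 1.0 - tail


def delivery_timeout_prob(T_r, L, betas, shapes, mus):
    """p of Eq. (39): all L independent attempts exceed the wait time T_r."""
    return (1.0 - hyper_erlang_cdf(T_r, betas, shapes, mus)) ** L
```

With the 2-Erlang delay and $T_r = 2/\mu$, a single attempt times out with probability $5e^{-4}$, so `delivery_timeout_prob(2.0, 1, [1.0], [2], [1.0])` returns that value and `L = 6` raises it to the sixth power.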

A. Effects of input parameters on α

Based on (9), Fig. 3(a) plots $\alpha$ against $T_r$ and the $(K, L)$ pair, where $\lambda_c = \mu/18$ and $\lambda_f = 1 \times 10^{-5}\mu$. It is trivial that $\alpha$ is a decreasing function of $T_r$. The non-trivial result is that Fig. 3(a) quantitatively indicates how the $T_r$ value affects $\alpha$. When $T_r < 2/\mu$, increasing $T_r$ significantly reduces $\alpha$. On the other hand, when $T_r > 2/\mu$, increasing $T_r$ does not improve the performance. Also, for small $T_r$, $L = 1$ outperforms other $L$ setups. The same effect is observed for other $\lambda_c$ values. When $T_r$ is large, the $L$ (and thus $K$) values have the same impact on $\alpha$.

Fig. 3(b) plots $\alpha$ as a function of $T_r$ and $\lambda_f$, where $K = 6$, $L = 1$ and $\lambda_c = \mu/18$. This figure shows that $\alpha$ increases as $\lambda_f$ decreases. When $\lambda_f$ decreases (i.e., the system reliability improves but the transmission delay distribution remains the same as before), the GTP’ connection lifetime becomes longer. Therefore, the opportunity for false failure detection increases. For $T_r = 1.6/\mu$, when the system reliability increases from $\lambda_f = 1 \times 10^{-5}\mu$ to $\lambda_f = 1 \times 10^{-6}\mu$, $\alpha$ increases by 2.72 times. This effect becomes insignificant when $T_r$ is large (e.g., $T_r > 2.2/\mu$).

Fig. 4(a) plots $\alpha$ as a function of $T_r$ and $\lambda_c$, where $K = 6$, $L = 1$ and $\lambda_f = 1 \times 10^{-5}\mu$. This figure shows that $\alpha$ increases as $\lambda_c$ increases. When there are more GTP’ message arrivals, it is more likely that false failure detection occurs. This effect is insignificant when $T_r$ becomes large (e.g., $T_r > 2/\mu$).

B. Effects of input parameters on E[τd]

Based on (37), Fig. 4(b) plots $E[\tau_d]$ as a function of $T_r$ and $\lambda_c$, where $K = 6$ and $L = 1$. This figure shows that $E[\tau_d]$ significantly increases as $\lambda_c$ decreases.

Fig. 4. Effects of $T_r$ and $\lambda_c$ ($K = 6$, $L = 1$). (a) $\alpha$. (b) $E[\tau_d]$ (unit: $1/\mu$).

Fig. 5. Effects of $T_r$ and $L$ on $E[\tau_d]$. (a) $\lambda_c = \mu$. (b) $\lambda_c = \mu/36$.

Figs. 5(a) and 5(b) plot $E[\tau_d]$ as functions of $T_r$ and the $(K, L)$ pair, where $\lambda_c = \mu$ and $\lambda_c = \mu/36$, respectively. These figures show that $E[\tau_d]$ is an increasing function of $T_r$, and that $E[\tau_d]$ is more sensitive to the change of $T_r$ when $L$ is large than when $L$ is small. When $\lambda_c = \mu$, $E[\tau_d]$ is larger for $L = 6$ than for $L = 1$. When $\lambda_c = \mu/36$, the opposite results are observed. This phenomenon can be explained as follows. Without loss of generality, assume that $t_{a,1} \ge t_f$. Consider an extreme case where $\lambda_c$ is very large, and many GTP’ charging packets arrive in a very short period $(t, t+dt)$ where $t \ge t_f$. For $L = 1$ ($K = 6$), $t_{a,6} \approx t$ and $t_{d,6} \approx t + T_r$. Therefore, the true failure detection time is $t_d \approx t + T_r$. For $L = 6$ ($K = 1$), we have $t_{a,1} \approx t$, but the true failure detection time is $t_d = t_{d,1} \approx t + 6T_r$. Therefore, $E[\tau_d]$ is larger for $L = 6$ than for $L = 1$ in Fig. 5(a).

On the other hand, when $\lambda_c$ is small, the charging packets rarely occur in a short period, and it is likely that $t_{a,i+1} - t_{a,i} > T_r$ (for $i > 0$). For $L = 1$, the failure is detected at $t_{a,6} + T_r$. For $L = 6$, the failure is detected at $t_{a,1} + 6T_r$. Under the situation that $t_{a,i+1} - t_{a,i} > T_r$, we have $t_{a,6} - t_{a,1} > 5T_r$. Therefore, we expect that $E[\tau_d]$ is smaller for $L = 6$ than for $L = 1$ in Fig. 5(b).
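The two extreme cases above can be reproduced with a small deterministic calculation. The sketch below rests on simplifying assumptions that are not part of the analytic model: $N_K(t_f) = 0$, there are no cross messages, and every listed arrival occurs after the failure, so the failure is detected when the $K$-th arrival times out after $L$ tries. The function name and the sample arrival times are hypothetical.

```python
def detection_time(arrivals_after_failure, K, L, T_r):
    """Detection time measured from t_f: K-th arrival time plus L * T_r.

    Assumes N_K(t_f) = 0, no cross messages, and at least K arrivals.
    """
    k_th_arrival = sorted(arrivals_after_failure)[K - 1]
    return k_th_arrival + L * T_r


# Burst of arrivals right after the failure (large lambda_c), with T_r = 1.
burst = [0.0] * 6
# Sparse arrivals spaced more than T_r apart (small lambda_c), with T_r = 1.
sparse = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```

For the burst, `detection_time(burst, 6, 1, 1.0)` is 1.0 while `detection_time(burst, 1, 6, 1.0)` is 6.0, so $L = 1$ detects faster. For the sparse arrivals the order reverses: 11.0 for $(K, L) = (6, 1)$ against 6.0 for $(K, L) = (1, 6)$.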

VI. CONCLUSIONS

In UMTS, the GTP’ protocol is used to deliver the CDRs from GSNs to CGs. To ensure that the mobile operator receives the charging information, availability of the charging system is essential. One of the most important issues for GTP’ availability is connection failure detection. This paper studied the GTP’ connection failure detection mechanism specified in 3GPP TS 29.060 and 3GPP TS 32.215. The output measures considered are the false failure detection probability $\alpha$ and the expected time $E[\tau_d]$ of true failure detection. We proposed an analytic model to investigate how these two output measures are affected by input parameters including the Charging Packet Ack Wait Time $T_r$, the Maximum Number $L$ of Charging Packet Tries and the Maximum Number $K$ of Unsuccessful Deliveries. We make the following observations.

• When $T_r$ is small, increasing $T_r$ reduces $\alpha$ significantly. When $T_r$ is sufficiently large, increasing $T_r$ has only an insignificant impact on $\alpha$. On the other hand, increasing $T_r$ always non-negligibly increases $E[\tau_d]$.

• $\alpha$ increases as the charging packet arrival rate $\lambda_c$ increases. This effect is insignificant when $T_r$ becomes large. On the other hand, the effects of $\lambda_c$ on $E[\tau_d]$ are not the same for different $(K, L)$ setups. In our examples, when $\lambda_c$ is large, $E[\tau_d]$ is larger for $L = 6$ than for $L = 1$. When $\lambda_c$ is small, $E[\tau_d]$ is smaller for $L = 6$ than for $L = 1$. Therefore, the effects of $\lambda_c$ should be considered when we select the $L$ value.

In summary, the network operator can select the appropriate $T_r$, $L$ and $K$ values for various traffic conditions based on our study.

REFERENCES

[1] 3rd Generation Partnership Project, Technical Specification Group Services and Systems Aspects, “Architectural Requirements for Release 1999” (Release 1999), 3G TS 23.121 version 3.6.0 (2002-06), 2002.

[2] 3rd Generation Partnership Project, Technical Specification Group Core Network, General Packet Radio Service (GPRS), “GPRS Tunneling Protocol (GTP) across the Gn and Gp Interface” (Release 5), 3G TS 29.060 version 5.9.0 (2004-03), 2004.

[3] 3rd Generation Partnership Project, Technical Specification Group Services and Systems Aspects, Telecommunication management, Charging management, “Charging data description for the Packet Switched (PS) domain” (Release 5), 3G TS 32.215 version 5.5.0 (2003-12), 2003.

[4] Y. Fang and I. Chlamtac, “Teletraffic analysis and mobility modeling for PCS networks,” IEEE Trans. Commun., vol. 47, no. 7, pp. 1062-1072, July 1999.

[5] F. P. Kelly, Reversibility and Stochastic Networks. John Wiley & Sons, 1979.

[6] Y.-B. Lin and Y.-K. Chen, “Reducing authentication signaling traffic in third generation mobile network,” IEEE Trans. Wireless Commun., vol. 2, no. 3, pp. 493-501, May 2003.

[7] Y.-B. Lin and I. Chlamtac, Wireless and Mobile Network Architectures. John Wiley & Sons, 2001.

[8] Y.-B. Lin, Y.-R. Haung, A.-C. Pang, and I. Chlamtac, “All-IP approach for UMTS third generation mobile networks,” IEEE Network, vol. 16, no. 5, pp. 8-19, Sept. 2002.

[9] R. G. Gallager, Discrete Stochastic Processes. Kluwer Academic Publishers, 1999.

[10] S. M. Ross, A First Course in Probability. Prentice Hall, 2001.

[11] S. M. Ross, Stochastic Processes. John Wiley & Sons, 1996.

[12] H.-N. Hung, Y.-B. Lin, N.-F. Peng, and S.-I. Sou, Connection Failure Detection Mechanism of UMTS Charging Protocol. Technical Report, 2004.

Hui-Nien Hung received the B.S.Math. degree from National Taiwan University, Taiwan, in 1989, the M.S.Math. degree from National Tsin-Hua University, Taiwan, in 1991, and the Ph.D. degree in Statistics from The University of Chicago in 1996. He is a Professor at the Institute of Statistics, National Chiao Tung University, Taiwan. His current research interests include applied probability, financial calculus, bioinformatics, statistical inference, statistical computing and industrial statistics.

Yi-Bing Lin (M’95-SM’95-F’03) received the B.S.E.E. degree from the National Cheng Kung University in 1983 and the Ph.D. degree in computer science from the University of Washington in 1990. He is chair professor in the Department of Computer Science and Information Engineering (CSIE), National Chiao Tung University (NCTU). Dr. Lin is a fellow of the IEEE and the ACM.

Nan-Fu Peng received the B.S. degree in applied mathematics from National Taiwan University, Hsinchu, Taiwan, R.O.C., in 1981, and the Ph.D. degree in statistics from The Ohio State University, Columbus, in 1989. He is currently an Associate Professor with the Institute of Statistics, National Chiao Tung University. His research interests include Markov chains, population dynamics, and queueing theory.

Sok-Ian Sou received the B.S.CSIE. and M.S.CSIE. degrees from National Chiao Tung University (NCTU), Taiwan, in 1997 and 2004, respectively. She is currently working toward the Ph.D. degree at NCTU. Her current research interests include personal communications services networks, Voice over IP technology and performance modeling.