Advanced Computer Network
Prof. Ai-Chun Pang
Graduate Institute of Networking and Multimedia,
Department of Computer Science and Information Engineering,
National Taiwan University, Taiwan
Outline
• Preliminaries
• Poisson Processes
• Renewal Processes
• Discrete-Time Markov Chains (Optional)
Preliminaries
• Applied Probability and Performance Modeling
– Prototype
– System Simulation
– Probabilistic Model
• Introduction to Stochastic Processes
– Random Variable (R.V.)
– Stochastic Process
• Probability and Expectations
– Expectation
– Generating Functions for Discrete R.V.s
Preliminaries
• Probability Inequalities
– Markov’s Inequality (mean)
– Chebyshev’s Inequality (mean and variance)
– Chernoff’s Bound (moment generating function)
– Jensen’s Inequality
• Limit Theorems
– Strong Law of Large Numbers
– Weak Law of Large Numbers
– Central Limit Theorem
Applied Probability and Performance Modeling
• Prototyping
– complex and expensive
– provides information on absolute performance measures but little on the relative performance of different designs
• System Simulation
– requires a large amount of execution time
– can provide both absolute and relative performance measures, depending on the level of detail that is modeled
• Probabilistic Model
– may be mathematically intractable or unsolvable
A Single Server Queue
• Arrivals: Poisson process, renewal process, etc.
• Queue length: Markov process, semi-Markov process, etc.
• . . .
Random Variable
• A “random variable” is a real-valued function whose domain is a sample space.
• Example. Suppose that our experiment consists of tossing 3 fair coins. If we let ỹ denote the number of heads appearing, then ỹ is a random variable taking on one of the values 0, 1, 2, 3 with respective probabilities
P{ỹ = 0} = P{(T, T, T)} = 1/8
P{ỹ = 1} = P{(T, T, H), (T, H, T), (H, T, T)} = 3/8
P{ỹ = 2} = P{(T, H, H), (H, T, H), (H, H, T)} = 3/8
P{ỹ = 3} = P{(H, H, H)} = 1/8
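The pmf above can be verified by brute-force enumeration of the sample space. A minimal sketch (all names are our own):

```python
from itertools import product
from fractions import Fraction

# Enumerate all 2^3 equally likely outcomes of tossing 3 fair coins
# and tabulate the pmf of y = number of heads.
outcomes = list(product("HT", repeat=3))
pmf = {}
for o in outcomes:
    y = o.count("H")
    pmf[y] = pmf.get(y, Fraction(0)) + Fraction(1, len(outcomes))

for y in sorted(pmf):
    print(y, pmf[y])  # 0 1/8, 1 3/8, 2 3/8, 3 1/8
```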
Random Variable
• A random variable x̃ is said to be “discrete” if it can take on only a finite number—or a countable infinity—of possible values x.
• A random variable x̃ is said to be “continuous” if there exists a nonnegative function f, defined for all real x ∈ (−∞, ∞), having the property that for any set B of real numbers
P{x̃ ∈ B} = ∫_B f(x) dx
Stochastic Process
• A “stochastic process” X = {˜x(t), t ∈ T } is a collection of random variables.
That is, for each t ∈ T , ˜x(t) is a random variable.
• The index t is often interpreted as “time” and, as a result, we refer to ˜x(t) as the “state” of the process at time t.
• When the index set T of the process X is
– a countable set → X is a discrete-time process
– an interval of the real line → X is a continuous-time process
• When the state space S of the process X is
– a countable set → X has a discrete state space
– an interval of the real line → X has a continuous state space
Stochastic Process
• Four types of stochastic processes
– discrete time and discrete state space
– continuous time and discrete state space
– discrete time and continuous state space
– continuous time and continuous state space
Discrete Time with Discrete State Space
Continuous Time with Discrete State Space
Discrete Time with Continuous State Space
Continuous Time with Continuous State Space
Two Structural Properties of Stochastic Processes
a. Independent increment: if for all t0 < t1 < t2 < . . . < tn in the process X = {x̃(t), t ≥ 0}, the random variables
x̃(t1) − x̃(t0), x̃(t2) − x̃(t1), . . . , x̃(tn) − x̃(tn−1) are independent,
⇒ the magnitudes of state change over non-overlapping time intervals are mutually independent
b. Stationary increment: if the random variable ˜x(t + s) − ˜x(t) has the same probability distribution for all t and any s > 0,
⇒ the probability distribution governing the magnitude of state change depends only on the difference in the lengths of the time indices and is independent of the time origin used for the indexing variable
⇓
X = {˜x1, ˜x2, ˜x3, . . . , ˜x∞}
limiting behavior of the stochastic process
Two Structural Properties of Stochastic Processes
• both independent and stationary increments,
• neither independent nor stationary increments,
• independent but not stationary increments, and
• stationary but not independent increments.
Expectations by Conditioning
Denote by E[x̃|ỹ] the function of the random variable ỹ whose value at ỹ = y is E[x̃|ỹ = y].
⇒ E[x̃] = E[E[x̃|ỹ]]
If ỹ is a discrete random variable, then
E[x̃] = Σ_y E[x̃|ỹ = y] P{ỹ = y}
If ỹ is continuous with density f_ỹ(y), then
E[x̃] = ∫_{−∞}^{∞} E[x̃|ỹ = y] f_ỹ(y) dy
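The tower property E[x̃] = E[E[x̃|ỹ]] can be sanity-checked on a toy two-stage experiment of our own construction (a die roll followed by coin tosses):

```python
import random
random.seed(0)

# Toy two-stage experiment (our own choice): roll a fair die to get
# y in {1,...,6}, then toss y fair coins; x = number of heads.
# Conditioning gives E[x | y] = y/2, so E[x] = E[y]/2 = 3.5/2 = 1.75.
exact = sum((y / 2) * (1 / 6) for y in range(1, 7))  # tower property, discrete form

# Monte Carlo estimate of E[x] for comparison
trials = 200_000
total = 0
for _ in range(trials):
    y = random.randint(1, 6)
    total += sum(random.random() < 0.5 for _ in range(y))
mc = total / trials
print(round(exact, 3), round(mc, 3))  # both ≈ 1.75
```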
Expectations by Complementary Distribution
For any non-negative random variable x̃,
E[x̃] = Σ_{k=0}^{∞} P(x̃ > k)      (discrete)
E[x̃] = ∫_0^{∞} [1 − F_x̃(x)] dx   (continuous)
Expectations by Complementary Distribution
Discrete case:
E[x̃] = 0 · P(x̃ = 0) + 1 · P(x̃ = 1) + 2 · P(x̃ = 2) + . . .   (horizontal sum)
= [1 − P(x̃ < 1)] + [1 − P(x̃ < 2)] + . . .   (vertical sum)
= P(x̃ ≥ 1) + P(x̃ ≥ 2) + . . .
= Σ_{k=1}^{∞} P(x̃ ≥ k)   (or Σ_{k=0}^{∞} P(x̃ > k))
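As an illustration, for a geometric random variable on {1, 2, . . .} with success probability p, P(x̃ > k) = (1 − p)^k, so the tail sum gives E[x̃] = 1/p. A quick numerical sketch (truncation at 500 terms is an arbitrary choice):

```python
# Check E[x] = sum_{k>=0} P(x > k) for a geometric r.v. on {1,2,...}
# with success probability p: P(x > k) = (1-p)^k, so the sum is 1/p.
p = 0.3
tail_sum = sum((1 - p) ** k for k in range(500))  # remaining tail is negligible
print(round(tail_sum, 4))  # ≈ 1/p ≈ 3.3333
```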
Expectations by Complementary Distribution
Continuous case:
E[x̃] = ∫_0^{∞} x · f_x̃(x) dx
= ∫_0^{∞} (∫_0^{x} dz) · f_x̃(x) dx
= ∫_0^{∞} [∫_z^{∞} f_x̃(x) dx] dz   (interchanging the order of integration)
= ∫_0^{∞} [1 − F_x̃(z)] dz
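For example, for an exponential random variable with rate λ, 1 − F(x) = e^{−λx} and the integral evaluates to 1/λ. A rough numerical check (step size and cutoff are our own arbitrary choices):

```python
import math

# Check E[x] = ∫_0^∞ [1 - F(x)] dx for an exponential r.v. with rate lam:
# 1 - F(x) = exp(-lam*x), so the integral is 1/lam.
lam = 2.0
dx = 1e-4
# left-Riemann sum of the complementary cdf up to x = 20
integral = sum(math.exp(-lam * k * dx) * dx for k in range(200_000))
print(round(integral, 4))  # ≈ 1/lam = 0.5
```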
Compound Random Variable
S̃_ñ = x̃1 + x̃2 + x̃3 + . . . + x̃_ñ, where ñ ≥ 1 and the x̃i are i.i.d. random variables.
⇒ E[S̃_ñ] = ?   Var[S̃_ñ] = ?
E[S̃_ñ] = E[E[S̃_ñ | ñ]]
= Σ_{n=1}^{∞} E[S̃_ñ | ñ = n] · P(ñ = n)
= Σ_{n=1}^{∞} E[x̃1 + x̃2 + . . . + x̃n] · P(ñ = n)
= Σ_{n=1}^{∞} n · E[x̃1] · P(ñ = n)
= E[ñ] · E[x̃1]
Compound Random Variable
Since Var[x̃] = E[Var[x̃|ỹ]] + Var[E[x̃|ỹ]], we have
Var[S̃_ñ] = E[Var[S̃_ñ | ñ]] + Var[E[S̃_ñ | ñ]]
= E[ñ Var[x̃1]] + Var[ñ E[x̃1]]
= Var[x̃1] E[ñ] + (E[x̃1])² Var[ñ]
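Both compound-sum formulas can be checked by simulation. A sketch using distributions of our own choosing (ñ geometric on {1, 2, . . .}, x̃i uniform(0, 1)):

```python
import math, random
random.seed(1)

# Monte Carlo check of E[S] = E[n]E[x] and
# Var[S] = Var[x]E[n] + E[x]^2 Var[n], with n geometric(p) on {1,2,...}
# and x_i uniform(0,1) (our own choice of distributions).
p = 0.5
def geom():  # inversion: P(n = j) = (1-p)^(j-1) * p
    return int(math.log(random.random()) / math.log(1 - p)) + 1

samples = []
for _ in range(200_000):
    n = geom()
    samples.append(sum(random.random() for _ in range(n)))

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
e_n, v_n, e_x, v_x = 1 / p, (1 - p) / p**2, 0.5, 1 / 12
print(round(mean, 2), round(var, 2))  # ≈ e_n*e_x = 1.0 and v_x*e_n + e_x**2*v_n ≈ 0.67
```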
Probability Generating Functions for Discrete R.V.s
• Define the generating function (or Z-transform) for a sequence of numbers {a_n} as a_g(z) = Σ_{n=0}^{∞} a_n zⁿ.
• Let x̃ denote a discrete random variable and a_n = P[x̃ = n]. Then P_x̃(z) = a_g(z) = Σ_{n=0}^{∞} a_n zⁿ = E[z^x̃] is called the probability generating function for the random variable x̃.
• Define the kth derivative of P_x̃(z) by P_x̃^(k)(z) = d^k/dz^k P_x̃(z). Then, we see that
P_x̃^(1)(z) = Σ_{n=0}^{∞} n a_n z^{n−1} → P_x̃^(1)(1) = E[x̃]
and
P_x̃^(2)(1) = E[x̃(x̃ − 1)], so Var[x̃] = P_x̃^(2)(1) + P_x̃^(1)(1) − [P_x̃^(1)(1)]²
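The first-derivative identity can be sanity-checked numerically. A sketch for a Poisson(λ) random variable, with the pgf summed termwise from the pmf (all names are ours):

```python
import math

# Numerical check that P'(1) = E[x] for the pgf of a Poisson(lam) r.v.;
# the pgf is summed termwise from the pmf a_n = e^{-lam} lam^n / n!.
lam = 1.7
def pgf(z, terms=80):
    return sum(math.exp(-lam) * lam**n / math.factorial(n) * z**n
               for n in range(terms))

h = 1e-6
deriv_at_1 = (pgf(1 + h) - pgf(1 - h)) / (2 * h)  # central difference
print(round(deriv_at_1, 4))  # ≈ E[x] = lam = 1.7
```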
Probability Generating Functions for Discrete R.V.s
• See Table 1.1 [Kao] for the properties of generating functions.
• <Homework>. Derive the probability generating functions for “Binomial”,
“Poisson”, “Geometric” and “Negative Binomial” random variables. Then, derive the expected value and variance of each random variable via the
probability generating function.
Laplace Transforms for Continuous R.V.s
• Let f be any real-valued function defined on [0, ∞). The Laplace transform of f is defined as
F*(s) = ∫_0^{∞} e^{−st} f(t) dt.
• When f is the probability density of a nonnegative continuous random variable x̃, we have
F*_x̃(s) = E[e^{−s x̃}]
• Define the nth derivative of the Laplace transform F*_x̃(s) with respect to s by
F*_x̃^(n)(s) = dⁿ/dsⁿ F*_x̃(s) → F*_x̃^(n)(s) = (−1)ⁿ E[x̃ⁿ e^{−s x̃}].
Then, we see that
E[x̃ⁿ] = (−1)ⁿ F*_x̃^(n)(0)
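The moment identity can be illustrated numerically for an exponential density: E[x̃] = −F*′(0) should equal 1/λ. A rough sketch where the transform is computed by direct quadrature (step size and cutoff are our own choices):

```python
import math

# Numerical check that E[x] = -dF*/ds at s = 0, for an exponential
# density f(t) = lam*exp(-lam*t); F*(s) is computed by quadrature.
lam = 2.0
def laplace(s, upper=20.0, dt=1e-4):
    # left-Riemann approximation of ∫_0^upper e^{-st} f(t) dt
    return sum(math.exp(-s * k * dt) * lam * math.exp(-lam * k * dt) * dt
               for k in range(int(upper / dt)))

h = 1e-3
e_x = -(laplace(h) - laplace(-h)) / (2 * h)  # central difference at s = 0
print(round(e_x, 4))  # ≈ 1/lam = 0.5
```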
Laplace Transforms for Continuous R.V.s
• <Homework>. Derive the Laplace transforms for “Uniform”, “Exponential”, and “Erlang” random variables. Then, derive the expected value and variance of each random variable via the Laplace transform.
Moment Generating Functions
• The moment generating function M_x̃(θ) of the random variable x̃ is defined for all values θ by
M_x̃(θ) = E[e^{θx̃}]
= Σ_x e^{θx} p(x),   if x̃ is discrete
= ∫_{−∞}^{∞} e^{θx} f(x) dx,   if x̃ is continuous
• The nth derivative of M_x̃(θ) evaluated at θ = 0 equals the nth moment of x̃, E[x̃ⁿ]; that is,
M_x̃^(n)(0) = E[x̃ⁿ],   n ≥ 1
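As a concrete check of M^(n)(0) = E[x̃ⁿ], the following sketch uses a fair six-sided die (our own example) and finite-difference derivatives of the mgf:

```python
import math

# Check M'(0) = E[x] and M''(0) = E[x^2] for a fair six-sided die,
# using central finite differences on M(t) = E[e^{t x}].
faces = [1, 2, 3, 4, 5, 6]
def mgf(t):
    return sum(math.exp(t * x) for x in faces) / len(faces)

h = 1e-4
m1 = (mgf(h) - mgf(-h)) / (2 * h)            # ≈ E[x] = 3.5
m2 = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h**2  # ≈ E[x^2] = 91/6
print(round(m1, 3), round(m2, 3))
```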
Markov’s Inequality
• Let h be a nonnegative and nondecreasing function and let ˜x be a random variable. If the expectation of h(˜x) exists then it is given by
E[h(x̃)] = ∫_{−∞}^{∞} h(z) f_x̃(z) dz.   (1)
• By the assumptions on h it easily follows that
∫_{−∞}^{∞} h(z) f_x̃(z) dz ≥ ∫_t^{∞} h(z) f_x̃(z) dz ≥ h(t) ∫_t^{∞} f_x̃(z) dz = h(t) P[x̃ ≥ t].   (2)
• Combining (1) and (2) yields Markov’s inequality:
P[x̃ ≥ t] ≤ E[h(x̃)] / h(t),   Markov’s Inequality.
Markov’s Inequality
• The simple Markov’s inequality is a first-order inequality since only knowledge of E[˜x] is required.
• The simple Markov’s inequality is quite weak but can be used to quickly check statements made about the tail of a distribution of a random variable when the expectation is known.
• Example. If the expected response time of a computer system is 1 second, then the simple Markov’s inequality shows that P[x̃ ≥ 10] ≤ .1, and thus at most 10 percent of response times can reach 10 seconds or more.
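A simulation makes the slack of this first-order bound visible. A sketch under the (hypothetical) assumption that response times are exponential with mean 1 second:

```python
import random
random.seed(2)

# Empirical check of Markov's inequality P[x >= t] <= E[x]/t for a
# hypothetical response-time model: x exponential with mean 1 second.
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]
t = 10
frac = sum(s >= t for s in samples) / n
bound = 1.0 / t  # E[x]/t = 0.1
print(frac, "<=", bound)  # true tail e^{-10} ≈ 4.5e-5, far below the 0.1 bound
```

The bound holds, but for this distribution the actual tail probability is orders of magnitude smaller, which is exactly the weakness discussed above.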
Chebyshev’s Inequality – second-order bound
If x̃ is a random variable with mean µ and variance σ², then for any k > 0,
P(|x̃ − µ| ≥ k) ≤ σ²/k²
Proof:
Since (x̃ − µ)² is a non-negative random variable, applying Markov’s inequality yields
P((x̃ − µ)² ≥ k²) ≤ E[(x̃ − µ)²] / k²
i.e., P(|x̃ − µ| ≥ k) ≤ σ²/k²
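A sketch comparing the second-order bound with empirical deviation frequencies, using a standard normal as our own test case:

```python
import random
random.seed(3)

# Compare Chebyshev's bound P(|x - mu| >= k) <= sigma^2/k^2 with the
# empirical deviation frequency for a standard normal (mu = 0, sigma = 1).
n = 200_000
samples = [random.gauss(0, 1) for _ in range(n)]
emp = {k: sum(abs(s) >= k for s in samples) / n for k in (1, 2, 3)}
for k, freq in emp.items():
    print(k, round(freq, 4), "<=", round(1 / k**2, 4))
```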
Chernoff’s Bound
If x̃ is a random variable with moment generating function M_x̃(t) = E[e^{tx̃}], then, for a > 0, we have
P(x̃ ≥ a) ≤ inf_{t≥0} e^{−ta} M_x̃(t) ≤ e^{−ta} M_x̃(t)   ∀t > 0
(P(x̃ ≤ a) ≤ e^{−ta} M_x̃(t)   ∀t < 0) → exercise
Proof:
For t > 0: P(x̃ ≥ a) = P(e^{tx̃} ≥ e^{ta})   (∵ t > 0)
≤ E[e^{tx̃}] / e^{ta} = e^{−ta} M_x̃(t)
<Homework>. Derive the tightest Chernoff’s bound for a Poisson random variable.
Jensen’s Inequality
Lemma. Let h be a convex function. Define the linear function g that is tangent to h at the point a as follows:
g(x, a) ≝ h(a) + h′(a)(x − a).
Then,
g(x, a) ≤ h(x), for all x.
Jensen’s Inequality
Jensen’s Inequality. If h is a differentiable convex function, defined on real variables, then
E[h(˜x)] ≥ h(E[˜x]).
Proof:
From the previous lemma, we have
h(x̃) ≥ h(a) + h′(a)(x̃ − a)
Let a = E[x̃]. Taking E[·] on both sides yields
E[h(x̃)] ≥ h(E[x̃]) + h′(a)(E[x̃] − E[x̃])
= h(E[x̃])
Limit Theorems
Theorem (Weak Law of Large Numbers): Let S̃n = x̃1 + x̃2 + . . . + x̃n, where x̃1, x̃2, . . . , x̃n, . . . are i.i.d. random variables with finite mean E[x̃]. Then for any ε > 0,
lim_{n→∞} P(|S̃n/n − E[x̃]| ≥ ε) = 0
Theorem (Strong Law of Large Numbers): Let S̃n = x̃1 + x̃2 + . . . + x̃n, where x̃1, x̃2, . . . , x̃n, . . . are i.i.d. random variables with finite mean E[x̃]. Then for any ε > 0,
P(lim_{n→∞} |S̃n/n − E[x̃]| ≥ ε) = 0
Limit Theorems
Theorem (Central Limit Theorem): Let S̃n = x̃1 + x̃2 + . . . + x̃n, where x̃1, x̃2, . . . , x̃n are i.i.d. random variables with finite mean E[x̃] and finite variance σ²_x̃ < ∞. Then,
lim_{n→∞} P((S̃n − n E[x̃]) / (√n σ_x̃) ≤ y) = ∫_{−∞}^{y} (1/√(2π)) e^{−x²/2} dx ∼ N(0, 1)
(the normalized Gaussian distribution)
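The CLT can be visualized by simulation: standardized sums of uniform(0, 1) variables (our own choice; E[x] = 1/2, σ² = 1/12) should look standard normal. A sketch comparing two cdf values against Φ(0) = 0.5 and Φ(1) ≈ 0.8413:

```python
import random, math
random.seed(4)

# CLT demo: standardized sums of n uniform(0,1) r.v.s should be
# approximately N(0,1); mean 1/2 and variance 1/12 per summand.
n, reps = 30, 50_000
mu, sigma = 0.5, math.sqrt(1 / 12)
zs = []
for _ in range(reps):
    s = sum(random.random() for _ in range(n))
    zs.append((s - n * mu) / (math.sqrt(n) * sigma))

p0 = sum(z <= 0 for z in zs) / reps  # should be ≈ Phi(0) = 0.5
p1 = sum(z <= 1 for z in zs) / reps  # should be ≈ Phi(1) ≈ 0.8413
print(round(p0, 3), round(p1, 3))
```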
Probability
Q1. The number of packets departing from a network switch is assumed to follow a Poisson distribution. The mean interdeparture time is 10 seconds.
P[x̃ = k] = e^{−λt}(λt)^k / k!   (3)
f_t̃(t) = λe^{−λt}   (4)
(a) What is the probability that the switch has no packet departure within three minutes? Derive your exact answer from two possible distributions.
(b) What is the probability of having less than 5 packets departing from the switch within three minutes? Derive your answer from two possible distributions.
Probability
Q2. Let ˜x and ˜y be independent exponential distributed random variables, with parameters α and β, respectively. Find P [˜x < ˜y] using conditional probability.
Q3. Prove that E[˜x] = E[E[˜x|˜y]]
Q4. A message requires ñ time units to be transmitted, where ñ is a geometric random variable with pmf p_j = (1 − a)a^{j−1}, j = 1, 2, . . . A single new message arrives during a time unit with probability q, and no messages arrive with probability 1 − q. Let x̃ be the number of new messages that arrive during the transmission of a single message.
(a) Find the pmf of x̃. (Hint: (1 − β)^{−(k+1)} = Σ_{n=k}^{∞} C(n, k) β^{n−k}.)
(b) Find E[x̃] and V[x̃] using conditional expectation.