Advanced Computer Network
Prof. Ai-Chun Pang
Graduate Institute of Networking and Multimedia,
Department of Computer Science and Information Engineering,
National Taiwan University, Taiwan
Outline
• Preliminaries
• Poisson Processes
• Renewal Processes
• Discrete-Time Markov Chains (Optional)
Preliminaries
• Applied Probability and Performance Modeling
– Prototype
– System Simulation
– Probabilistic Model
• Introduction to Stochastic Processes
– Random Variable (R.V.)
– Stochastic Process
• Probability and Expectations
– Expectation
– Generating Functions for Discrete R.V.s
Preliminaries
• Probability Inequalities
– Markov’s Inequality (mean)
– Chebyshev’s Inequality (mean and variance)
– Chernoff’s Bound (moment generating function)
– Jensen’s Inequality
• Limit Theorems
– Strong Law of Large Numbers
– Weak Law of Large Numbers
– Central Limit Theorem
Applied Probability and Performance Modeling
• Prototyping
– complex and expensive
– provides information on absolute performance measures but little on the relative performance of different designs
• System Simulation
– requires a large amount of execution time
– can provide both absolute and relative performance measures, depending on the level of detail that is modeled
• Probabilistic Model
– may be mathematically intractable or unsolvable
A Single Server Queue
• Arrivals: Poisson process, renewal process, etc.
• Queue length: Markov process, semi-Markov process, etc.
• . . .
Random Variable
• A “random variable” is a real-valued function whose domain is a sample space.
• Example. Suppose that our experiment consists of tossing 3 fair coins. If we let ỹ denote the number of heads appearing, then ỹ is a random variable taking on one of the values 0, 1, 2, 3 with respective probabilities
P{ỹ = 0} = P{(T, T, T)} = 1/8
P{ỹ = 1} = P{(T, T, H), (T, H, T), (H, T, T)} = 3/8
P{ỹ = 2} = P{(T, H, H), (H, T, H), (H, H, T)} = 3/8
P{ỹ = 3} = P{(H, H, H)} = 1/8
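The pmf above can be verified by brute-force enumeration of the sample space. A minimal sketch (all names are our own):

```python
from itertools import product
from fractions import Fraction

# Enumerate all 2^3 equally likely outcomes of tossing 3 fair coins
# and tabulate the pmf of y = number of heads.
outcomes = list(product("HT", repeat=3))
pmf = {}
for o in outcomes:
    y = o.count("H")
    pmf[y] = pmf.get(y, Fraction(0)) + Fraction(1, len(outcomes))

for y in sorted(pmf):
    print(y, pmf[y])  # 0 1/8, 1 3/8, 2 3/8, 3 1/8
```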
Random Variable
• A random variable x̃ is said to be “discrete” if it can take on only a finite number—or a countable infinity—of possible values x.
• A random variable x̃ is said to be “continuous” if there exists a nonnegative function f, defined for all real x ∈ (−∞, ∞), having the property that for any set B of real numbers
P{x̃ ∈ B} = ∫_B f(x) dx
Stochastic Process
• A “stochastic process” X = {˜x(t), t ∈ T } is a collection of random variables.
That is, for each t ∈ T , ˜x(t) is a random variable.
• The index t is often interpreted as “time” and, as a result, we refer to ˜x(t) as the “state” of the process at time t.
• When the index set T of the process X is
– a countable set → X is a discrete-time process
– an interval of the real line → X is a continuous-time process
• When the state space S of the process X is
– a countable set → X has a discrete state space
– an interval of the real line → X has a continuous state space
Stochastic Process
• Four types of stochastic processes
– discrete time and discrete state space
– continuous time and discrete state space
– discrete time and continuous state space
– continuous time and continuous state space
Discrete Time with Discrete State Space
Continuous Time with Discrete State Space
Discrete Time with Continuous State Space
Continuous Time with Continuous State Space
Two Structural Properties of Stochastic Processes
a. Independent increment: if for all t0 < t1 < t2 < . . . < tn in the process X = {x̃(t), t ≥ 0}, the random variables
x̃(t1) − x̃(t0), x̃(t2) − x̃(t1), . . . , x̃(tn) − x̃(tn−1) are independent,
⇒ the magnitudes of state change over non-overlapping time intervals are mutually independent
b. Stationary increment: if the random variable ˜x(t + s) − ˜x(t) has the same probability distribution for all t and any s > 0,
⇒ the probability distribution governing the magnitude of state change depends only on the difference in the lengths of the time indices and is independent of the time origin used for the indexing variable
⇓
X = {˜x1, ˜x2, ˜x3, . . . , ˜x∞}
limiting behavior of the stochastic process
Two Structural Properties of Stochastic Processes
• both independent and stationary increments,
• neither independent nor stationary increments,
• independent but not stationary increments, and
• stationary but not independent increments.
Expectations by Conditioning
Denote by E[x̃|ỹ] the function of the random variable ỹ whose value at ỹ = y is E[x̃|ỹ = y].
⇒ E[x̃] = E[E[x̃|ỹ]]
If ỹ is a discrete random variable, then
E[x̃] = Σ_y E[x̃|ỹ = y] P{ỹ = y}
If ỹ is continuous with density f_ỹ(y), then
E[x̃] = ∫_{−∞}^{∞} E[x̃|ỹ = y] f_ỹ(y) dy
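The tower property E[x̃] = E[E[x̃|ỹ]] can be sanity-checked on a toy two-stage experiment of our own construction (a die roll followed by coin tosses):

```python
import random
random.seed(0)

# Toy two-stage experiment (our own choice): roll a fair die to get
# y in {1,...,6}, then toss y fair coins; x = number of heads.
# Conditioning gives E[x | y] = y/2, so E[x] = E[y]/2 = 3.5/2 = 1.75.
exact = sum((y / 2) * (1 / 6) for y in range(1, 7))  # tower property, discrete form

# Monte Carlo estimate of E[x] for comparison
trials = 200_000
total = 0
for _ in range(trials):
    y = random.randint(1, 6)
    total += sum(random.random() < 0.5 for _ in range(y))
mc = total / trials
print(round(exact, 3), round(mc, 3))  # both ≈ 1.75
```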
Expectations by Complementary Distribution
For any non-negative random variable x̃,
E[x̃] = Σ_{k=0}^{∞} P(x̃ > k)      (discrete)
E[x̃] = ∫_0^{∞} [1 − F_x̃(x)] dx   (continuous)
Expectations by Complementary Distribution
Discrete case:
E[x̃] = 0 · P(x̃ = 0) + 1 · P(x̃ = 1) + 2 · P(x̃ = 2) + . . .   (horizontal sum)
= [1 − P(x̃ < 1)] + [1 − P(x̃ < 2)] + . . .   (vertical sum)
= P(x̃ ≥ 1) + P(x̃ ≥ 2) + . . .
= Σ_{k=1}^{∞} P(x̃ ≥ k)   (or Σ_{k=0}^{∞} P(x̃ > k))
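As an illustration, for a geometric random variable on {1, 2, . . .} with success probability p, P(x̃ > k) = (1 − p)^k, so the tail sum gives E[x̃] = 1/p. A quick numerical sketch (truncation at 500 terms is an arbitrary choice):

```python
# Check E[x] = sum_{k>=0} P(x > k) for a geometric r.v. on {1,2,...}
# with success probability p: P(x > k) = (1-p)^k, so the sum is 1/p.
p = 0.3
tail_sum = sum((1 - p) ** k for k in range(500))  # remaining tail is negligible
print(round(tail_sum, 4))  # ≈ 1/p ≈ 3.3333
```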
Expectations by Complementary Distribution
Continuous case:
E[x̃] = ∫_0^{∞} x · f_x̃(x) dx
= ∫_0^{∞} (∫_0^{x} dz) · f_x̃(x) dx
= ∫_0^{∞} [∫_z^{∞} f_x̃(x) dx] dz   (interchanging the order of integration)
= ∫_0^{∞} [1 − F_x̃(z)] dz
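For example, for an exponential random variable with rate λ, 1 − F(x) = e^{−λx} and the integral evaluates to 1/λ. A rough numerical check (step size and cutoff are our own arbitrary choices):

```python
import math

# Check E[x] = ∫_0^∞ [1 - F(x)] dx for an exponential r.v. with rate lam:
# 1 - F(x) = exp(-lam*x), so the integral is 1/lam.
lam = 2.0
dx = 1e-4
# left-Riemann sum of the complementary cdf up to x = 20
integral = sum(math.exp(-lam * k * dx) * dx for k in range(200_000))
print(round(integral, 4))  # ≈ 1/lam = 0.5
```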
Compound Random Variable
S̃_ñ = x̃1 + x̃2 + x̃3 + . . . + x̃_ñ, where ñ ≥ 1 and the x̃i are i.i.d. random variables.
⇒ E[S̃_ñ] = ?   Var[S̃_ñ] = ?
E[S̃_ñ] = E[E[S̃_ñ | ñ]]
= Σ_{n=1}^{∞} E[S̃_ñ | ñ = n] · P(ñ = n)
= Σ_{n=1}^{∞} E[x̃1 + x̃2 + . . . + x̃n] · P(ñ = n)
= Σ_{n=1}^{∞} n · E[x̃1] · P(ñ = n)
= E[ñ] · E[x̃1]
Compound Random Variable
Since Var[x̃] = E[Var[x̃|ỹ]] + Var[E[x̃|ỹ]], we have
Var[S̃_ñ] = E[Var[S̃_ñ | ñ]] + Var[E[S̃_ñ | ñ]]
= E[ñ Var[x̃1]] + Var[ñ E[x̃1]]
= Var[x̃1] E[ñ] + (E[x̃1])² Var[ñ]
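Both compound-sum formulas can be checked by simulation. A sketch using distributions of our own choosing (ñ geometric on {1, 2, . . .}, x̃i uniform(0, 1)):

```python
import math, random
random.seed(1)

# Monte Carlo check of E[S] = E[n]E[x] and
# Var[S] = Var[x]E[n] + E[x]^2 Var[n], with n geometric(p) on {1,2,...}
# and x_i uniform(0,1) (our own choice of distributions).
p = 0.5
def geom():  # inversion: P(n = j) = (1-p)^(j-1) * p
    return int(math.log(random.random()) / math.log(1 - p)) + 1

samples = []
for _ in range(200_000):
    n = geom()
    samples.append(sum(random.random() for _ in range(n)))

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
e_n, v_n, e_x, v_x = 1 / p, (1 - p) / p**2, 0.5, 1 / 12
print(round(mean, 2), round(var, 2))  # ≈ e_n*e_x = 1.0 and v_x*e_n + e_x**2*v_n ≈ 0.67
```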
Probability Generating Functions for Discrete R.V.s
• Define the generating function (or Z-transform) for a sequence of numbers {a_n} as a_g(z) = Σ_{n=0}^{∞} a_n zⁿ.
• Let x̃ denote a discrete random variable and a_n = P[x̃ = n]. Then P_x̃(z) = a_g(z) = Σ_{n=0}^{∞} a_n zⁿ = E[z^x̃] is called the probability generating function for the random variable x̃.
• Define the kth derivative of P_x̃(z) by P_x̃^(k)(z) = d^k/dz^k P_x̃(z). Then, we see that
P_x̃^(1)(z) = Σ_{n=0}^{∞} n a_n z^{n−1} → P_x̃^(1)(1) = E[x̃]
and
P_x̃^(2)(1) = E[x̃(x̃ − 1)], so Var[x̃] = P_x̃^(2)(1) + P_x̃^(1)(1) − [P_x̃^(1)(1)]²
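The first-derivative identity can be sanity-checked numerically. A sketch for a Poisson(λ) random variable, with the pgf summed termwise from the pmf (all names are ours):

```python
import math

# Numerical check that P'(1) = E[x] for the pgf of a Poisson(lam) r.v.;
# the pgf is summed termwise from the pmf a_n = e^{-lam} lam^n / n!.
lam = 1.7
def pgf(z, terms=80):
    return sum(math.exp(-lam) * lam**n / math.factorial(n) * z**n
               for n in range(terms))

h = 1e-6
deriv_at_1 = (pgf(1 + h) - pgf(1 - h)) / (2 * h)  # central difference
print(round(deriv_at_1, 4))  # ≈ E[x] = lam = 1.7
```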
Probability Generating Functions for Discrete R.V.s
• See Table 1.1 [Kao] for the properties of generating functions.
• <Homework>. Derive the probability generating functions for “Binomial”,
“Poisson”, “Geometric” and “Negative Binomial” random variables. Then, derive the expected value and variance of each random variable via the
probability generating function.
Laplace Transforms for Continuous R.V.s
• Let f be any real-valued function defined on [0, ∞). The Laplace transform of f is defined as
F*(s) = ∫_0^{∞} e^{−st} f(t) dt.
• When f is the probability density of a nonnegative continuous random variable x̃, we have
F*_x̃(s) = E[e^{−s x̃}]
• Define the nth derivative of the Laplace transform F*_x̃(s) with respect to s by
F*_x̃^(n)(s) = dⁿ/dsⁿ F*_x̃(s) → F*_x̃^(n)(s) = (−1)ⁿ E[x̃ⁿ e^{−s x̃}].
Then, we see that
E[x̃ⁿ] = (−1)ⁿ F*_x̃^(n)(0)
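The moment identity can be illustrated numerically for an exponential density: E[x̃] = −F*′(0) should equal 1/λ. A rough sketch where the transform is computed by direct quadrature (step size and cutoff are our own choices):

```python
import math

# Numerical check that E[x] = -dF*/ds at s = 0, for an exponential
# density f(t) = lam*exp(-lam*t); F*(s) is computed by quadrature.
lam = 2.0
def laplace(s, upper=20.0, dt=1e-4):
    # left-Riemann approximation of ∫_0^upper e^{-st} f(t) dt
    return sum(math.exp(-s * k * dt) * lam * math.exp(-lam * k * dt) * dt
               for k in range(int(upper / dt)))

h = 1e-3
e_x = -(laplace(h) - laplace(-h)) / (2 * h)  # central difference at s = 0
print(round(e_x, 4))  # ≈ 1/lam = 0.5
```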
Laplace Transforms for Continuous R.V.s
• <Homework>. Derive the Laplace transforms for “Uniform”, “Exponential”, and “Erlang” random variables. Then, derive the expected value and variance of each random variable via the Laplace transform.
Moment Generating Functions
• The moment generating function M_x̃(θ) of the random variable x̃ is defined for all values θ by
M_x̃(θ) = E[e^{θx̃}]
= Σ_x e^{θx} p(x),   if x̃ is discrete
= ∫_{−∞}^{∞} e^{θx} f(x) dx,   if x̃ is continuous
• The nth derivative of M_x̃(θ) evaluated at θ = 0 equals the nth moment of x̃, E[x̃ⁿ]; that is,
M_x̃^(n)(0) = E[x̃ⁿ],   n ≥ 1
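As a concrete check of M^(n)(0) = E[x̃ⁿ], the following sketch uses a fair six-sided die (our own example) and finite-difference derivatives of the mgf:

```python
import math

# Check M'(0) = E[x] and M''(0) = E[x^2] for a fair six-sided die,
# using central finite differences on M(t) = E[e^{t x}].
faces = [1, 2, 3, 4, 5, 6]
def mgf(t):
    return sum(math.exp(t * x) for x in faces) / len(faces)

h = 1e-4
m1 = (mgf(h) - mgf(-h)) / (2 * h)            # ≈ E[x] = 3.5
m2 = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h**2  # ≈ E[x^2] = 91/6
print(round(m1, 3), round(m2, 3))
```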
Markov’s Inequality
• Let h be a nonnegative and nondecreasing function and let ˜x be a random variable. If the expectation of h(˜x) exists then it is given by
E[h(x̃)] = ∫_{−∞}^{∞} h(z) f_x̃(z) dz.   (1)
• By the assumptions on h it easily follows that
∫_{−∞}^{∞} h(z) f_x̃(z) dz ≥ ∫_t^{∞} h(z) f_x̃(z) dz ≥ h(t) ∫_t^{∞} f_x̃(z) dz = h(t) P[x̃ ≥ t].   (2)
• Combining (1) and (2) yields Markov’s inequality:
P[x̃ ≥ t] ≤ E[h(x̃)] / h(t),   Markov’s Inequality.
Markov’s Inequality
• The simple Markov’s inequality is a first-order inequality since only knowledge of E[˜x] is required.
• The simple Markov’s inequality is quite weak but can be used to quickly check statements made about the tail of a distribution of a random variable when the expectation is known.
• Example. If the expected response time of a computer system is 1 second, then the simple Markov’s inequality shows that P[x̃ ≥ 10] ≤ .1, and thus at most 10 percent of response times can reach 10 seconds or more.
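A simulation makes the slack of this first-order bound visible. A sketch under the (hypothetical) assumption that response times are exponential with mean 1 second:

```python
import random
random.seed(2)

# Empirical check of Markov's inequality P[x >= t] <= E[x]/t for a
# hypothetical response-time model: x exponential with mean 1 second.
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]
t = 10
frac = sum(s >= t for s in samples) / n
bound = 1.0 / t  # E[x]/t = 0.1
print(frac, "<=", bound)  # true tail e^{-10} ≈ 4.5e-5, far below the 0.1 bound
```

The bound holds, but for this distribution the actual tail probability is orders of magnitude smaller, which is exactly the weakness discussed above.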
Chebyshev’s Inequality – second-order bound
If x̃ is a random variable with mean µ and variance σ², then for any k > 0,
P(|x̃ − µ| ≥ k) ≤ σ²/k²
Proof:
Since (x̃ − µ)² is a non-negative random variable, applying Markov’s inequality yields
P((x̃ − µ)² ≥ k²) ≤ E[(x̃ − µ)²] / k²
i.e., P(|x̃ − µ| ≥ k) ≤ σ²/k²
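A sketch comparing the second-order bound with empirical deviation frequencies, using a standard normal as our own test case:

```python
import random
random.seed(3)

# Compare Chebyshev's bound P(|x - mu| >= k) <= sigma^2/k^2 with the
# empirical deviation frequency for a standard normal (mu = 0, sigma = 1).
n = 200_000
samples = [random.gauss(0, 1) for _ in range(n)]
emp = {k: sum(abs(s) >= k for s in samples) / n for k in (1, 2, 3)}
for k, freq in emp.items():
    print(k, round(freq, 4), "<=", round(1 / k**2, 4))
```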
Chernoff’s Bound
If x̃ is a random variable with moment generating function M_x̃(t) = E[e^{tx̃}], then, for a > 0, we have
P(x̃ ≥ a) ≤ inf_{t≥0} e^{−ta} M_x̃(t) ≤ e^{−ta} M_x̃(t)   ∀t > 0
(P(x̃ ≤ a) ≤ e^{−ta} M_x̃(t)   ∀t < 0) → exercise
Proof:
For t > 0: P(x̃ ≥ a) = P(e^{tx̃} ≥ e^{ta})   (∵ t > 0)
≤ E[e^{tx̃}] / e^{ta} = e^{−ta} M_x̃(t)
<Homework>. Derive the tightest Chernoff’s bound for a Poisson random variable.
Jensen’s Inequality
Lemma. Let h be a convex function. Define the linear function g that is tangent to h at the point a as follows:
g(x, a) ≝ h(a) + h′(a)(x − a).
Then,
g(x, a) ≤ h(x), for all x.
Jensen’s Inequality
Jensen’s Inequality. If h is a differentiable convex function, defined on real variables, then
E[h(˜x)] ≥ h(E[˜x]).
Proof:
From the previous lemma, we have
h(x̃) ≥ h(a) + h′(a)(x̃ − a)
Let a = E[x̃]. Taking E[·] on both sides yields
E[h(x̃)] ≥ h(E[x̃]) + h′(a)(E[x̃] − E[x̃])
= h(E[x̃])
Limit Theorems
Theorem (Weak Law of Large Numbers): Let S̃n = x̃1 + x̃2 + . . . + x̃n, where x̃1, x̃2, . . . , x̃n, . . . are i.i.d. random variables with finite mean E[x̃]. Then for any ε > 0,
lim_{n→∞} P(|S̃n/n − E[x̃]| ≥ ε) = 0
Theorem (Strong Law of Large Numbers): Let S̃n = x̃1 + x̃2 + . . . + x̃n, where x̃1, x̃2, . . . , x̃n, . . . are i.i.d. random variables with finite mean E[x̃]. Then for any ε > 0,
P(lim_{n→∞} |S̃n/n − E[x̃]| ≥ ε) = 0
Limit Theorems
Theorem (Central Limit Theorem): Let S̃n = x̃1 + x̃2 + . . . + x̃n, where x̃1, x̃2, . . . , x̃n are i.i.d. random variables with finite mean E[x̃] and finite variance σ²_x̃ < ∞. Then,
lim_{n→∞} P((S̃n − n E[x̃]) / (√n σ_x̃) ≤ y) = ∫_{−∞}^{y} (1/√(2π)) e^{−x²/2} dx ∼ N(0, 1)
(the normalized Gaussian distribution)
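The CLT can be visualized by simulation: standardized sums of uniform(0, 1) variables (our own choice; E[x] = 1/2, σ² = 1/12) should look standard normal. A sketch comparing two cdf values against Φ(0) = 0.5 and Φ(1) ≈ 0.8413:

```python
import random, math
random.seed(4)

# CLT demo: standardized sums of n uniform(0,1) r.v.s should be
# approximately N(0,1); mean 1/2 and variance 1/12 per summand.
n, reps = 30, 50_000
mu, sigma = 0.5, math.sqrt(1 / 12)
zs = []
for _ in range(reps):
    s = sum(random.random() for _ in range(n))
    zs.append((s - n * mu) / (math.sqrt(n) * sigma))

p0 = sum(z <= 0 for z in zs) / reps  # should be ≈ Phi(0) = 0.5
p1 = sum(z <= 1 for z in zs) / reps  # should be ≈ Phi(1) ≈ 0.8413
print(round(p0, 3), round(p1, 3))
```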
Probability
Q1. The number of packets departing from a network switch is assumed to follow a Poisson distribution. The mean interdeparture time is 10 seconds.
P[x̃ = k] = e^{−λt}(λt)^k / k!   (3)
f_t̃(t) = λe^{−λt}   (4)
(a) What is the probability that the switch has no packet departure within three minutes? Derive your exact answer from two possible distributions.
(b) What is the probability of having less than 5 packets departing from the switch within three minutes? Derive your answer from two possible distributions.
Probability
Q2. Let ˜x and ˜y be independent exponential distributed random variables, with parameters α and β, respectively. Find P [˜x < ˜y] using conditional probability.
Q3. Prove that E[˜x] = E[E[˜x|˜y]]
Q4. A message requires ñ time units to be transmitted, where ñ is a geometric random variable with pmf p_j = (1 − a)a^{j−1}, j = 1, 2, . . . A single new message arrives during a time unit with probability q, and no messages arrive with probability 1 − q. Let x̃ be the number of new messages that arrive during the transmission of a single message.
(a) Find the pmf of x̃. (Hint: (1 − β)^{−(k+1)} = Σ_{n=k}^{∞} C(n, k) β^{n−k}.)
(b) Find E[x̃] and V[x̃] using conditional expectation.