Statistics
Random Variables
Shiu-Sheng Chen
Department of Economics National Taiwan University
Fall 2019
Section 1
Random Variables
Random Variables
In practice we are interested in certain numerical measurements pertaining to a random experiment.
For example,
Ω= {H, T}
X = 1, if H; X = −1, if T
Then X is called a random variable.
Random Variables: A Formal Definition
Definition
A random variable X is a real-valued function from the sample space to the real numbers:
X: Ω → R
and it assigns to each element ω∈ Ω one and only one real number X(ω) = x.
Small letter x denotes the possible value of a random variable X.
Example: Flip a Fair Coin Twice
The sample space is
Ω= {HH, HT, TH, TT}
Let X be the number of heads.
The mapping is
ω X(ω)
{HH} 2
{HT} 1
{TH} 1
{TT} 0
Example: Flip a Fair Coin Twice
How to assign probability P(X = x)?
For example,
P(X = 1) = P({ω ∶ X(ω) = 1})
= P({HT, TH})
= P({HT} ∪ {TH})
= P({HT}) + P({TH}) = 1/4 + 1/4 = 1/2
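The computation above can be reproduced by enumerating the sample space. The sketch below is illustrative (the names `omega`, `prob`, and `X` are our own, not from the slides); exact fractions avoid floating-point noise.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips of a fair coin; each of the four
# outcomes is equally likely with probability 1/4.
omega = [''.join(w) for w in product('HT', repeat=2)]
prob = {w: Fraction(1, 4) for w in omega}

# The random variable X maps each outcome to its number of heads.
X = {w: w.count('H') for w in omega}

# P(X = 1) = P({w : X(w) = 1}) = P({HT, TH}) = 1/4 + 1/4 = 1/2
prob_X_eq_1 = sum(prob[w] for w in omega if X[w] == 1)
```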
Random Variables
In the previous example, the set of all possible values that X can assume is finite.
According to the set of all possible values that a random variable can assume, we define two types of random variables.
(1) Discrete random variables
(2) Continuous random variables
Section 2
Discrete Random Variables
Discrete Random Variables
A random variable X is a discrete random variable if:
there are a finite number of possible values of X, or
there are a countably infinite number of possible values of X.
Recall that a countably infinite number of possible values means that there is a one-to-one correspondence between the values and the set of positive integers.
Examples
The number of defective light bulbs in a box of six.
Set of all possible values of X= {0, 1, 2, 3, 4, 5, 6}
The number of tails until the first heads comes up.
Set of all possible values of X= {0, 1, 2, 3, . . .}
We use the probability distribution to describe the likelihood of obtaining the possible values that a random variable can assume.
Probability Distribution
Definition (Probability Distribution)
Let X be a random variable. The probability distribution of X is a specification of all probabilities involving X.
One way to specify the probability distribution of discrete random variables is the probability mass function.
Probability Mass Function
Definition (Probability Mass Function)
Given a discrete random variable X, the probability mass function (pmf) f(x): R ↦ [0, 1] is defined by
f(x) = P(X = x)
A probability mass function is also called a discrete probability density function (discrete pdf).
A preferable notation: fX(x)
Probability Mass Function
Definition (Support)
The support of a random variable X is defined as:
supp(X) = {x ∶ f(x) > 0}
Properties:
∑_{x ∈ supp(X)} f(x) = 1
An Example of pmf
[Figure: an example pmf f(x) over a support {x1, x2, . . . , x6}]
Example 1: Flip a Fair Coin Twice
The sample space is Ω= {HH, HT, TH, TT}
Let X be the number of heads.
Table:Mapping and Probability Distribution
ω P({ω}) X(ω)
T T 1/4 0
TH 1/4 1
HT 1/4 1
HH 1/4 2
x f(x) = P(X = x)
0 1/4
1 1/2
2 1/4
Clearly, supp(X) = {0, 1, 2}, and ∑_{x ∈ supp(X)} f(x) = 1.
Q: let A = {X ≤ 1}, what is P(X ∈ A)?
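The pmf, its support, and the question about A = {X ≤ 1} can all be checked by enumeration. This is an illustrative sketch (the variable names are our own):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Build the pmf of X = number of heads from the mapping table above.
pmf = Counter()
for w in product('HT', repeat=2):
    pmf[w.count('H')] += Fraction(1, 4)

support = sorted(x for x, fx in pmf.items() if fx > 0)
total = sum(pmf[x] for x in support)      # should be 1

# A = {X <= 1}: P(X in A) = f(0) + f(1) = 1/4 + 1/2 = 3/4
prob_A = sum(pmf[x] for x in support if x <= 1)
```

So the answer to the question is P(X ∈ A) = 3/4.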
Example 1: Flip a Fair Coin Twice
Probability mass function (alternative notation)
f(x) = P(X = x) =
  1/4, if x = 0
  1/2, if x = 1
  1/4, if x = 2
Example 1: Flip a Fair Coin Twice
[Figure: pmf of X, with f(0) = 1/4, f(1) = 1/2, f(2) = 1/4]
Example 2: Bernoulli Random Variable
Definition (Bernoulli Random Variable)
A random variable X is said to have a Bernoulli distribution with success probability p if X can only assume the values 0 and 1, with probabilities
f(1) = P(X = 1) = p, f(0) = P(X = 0) = 1 − p
We write X ∼ Bernoulli(p).
Find its support, and probability mass function.
Example 3: Binomial Random Variables
Definition (Binomial Random Variable)
A random variable Y has the binomial distribution with parameters n and p if the probability mass function is
f(y) = (n choose y) p^y (1 − p)^(n−y), supp(Y) = {y ∣ y = 0, 1, 2, . . . , n}
It is denoted by Y ∼ Binomial(n, p)
By the binomial theorem, (a + b)^n = ∑_{y=0}^{n} (n choose y) a^y b^(n−y), so
∑_{y=0}^{n} f(y) = 1
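The binomial pmf and the fact that it sums to one can be verified directly; this sketch uses the standard-library `math.comb` for the binomial coefficient (the function name `binom_pmf` and the sample parameters are our own):

```python
from math import comb

def binom_pmf(y, n, p):
    """f(y) = C(n, y) * p**y * (1 - p)**(n - y), for y = 0, 1, ..., n."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

# The pmf sums to one over the support, as the binomial theorem guarantees.
n, p = 10, 0.3
total = sum(binom_pmf(y, n, p) for y in range(n + 1))
```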
Example 3: Binomial Random Variables
Bernoulli vs. Binomial
Bernoulli(p) is used to model a single coin toss experiment.
Binomial(n, p) is used to model the number of heads in a sequence of n independent coin toss experiments.
Clearly, they are linked by
Y = X1+ X2+ ⋯ + Xn,
where Y ∼Binomial(n, p), and X1,X2, . . . ,Xn are independent Bernoulli(p) variables.
Probability Distribution
Consider X∼ Bernoulli(0.5)
[Figure: pmf of Bernoulli(0.5); f(0) = f(1) = 0.5]
Probability Distribution
Consider X∼ Binomial(10, 0.5)
[Figure: pmf of Binomial(10, 0.5), plotted for x = 0, 1, . . . , 10]
Section 3
Continuous Random Variables
Continuous Random Variables
A random variable is called continuous if it can take on an uncountably infinite number of possible values.
The percentage of exam complete after 1 hour
The weight of a randomly selected quarter-pound burger
Continuous Random Variables
Though a continuous variable can take any possible value in an interval, its measured value cannot. This is because no measuring device has infinite precision.
Nevertheless, continuous random variables offer reasonable approximations to the underlying process of interest even though virtually all phenomena are, at some level, ultimately discrete.
How to Assign Probability?
Discrete: flip a coin or roll a die.
How about spinning a spinner?
Let X be the result of the spin.
Warning! Impossible to assign each outcome positive probability.
Why?
How to Assign Probability?
Suppose NOT, and let the spinner be fair.
Each outcome has probability p> 0.
Let A⊂ Ω be an event that contains n distinct outcomes.
⇒ Choose n large enough s.t. p > 1/n. Then P(X ∈ A) = np > 1 (Big Trouble!)
How to Assign Probability?
Hence p must be zero!
That is, if X is a continuous random variable, P(X = c) = 0
How can P(X = c) = 0 make sense? Can many nothings make something?
Think about the length of a point vs. the length of an interval.
A zero-probability event is NOT an impossible event.
Continuous Random Variables
Definition (Continuous Random Variables)
A random variable X is continuous if there exists a function f: R ↦ R such that for any numbers a ≤ b,
P(a ≤ X ≤ b) = ∫_a^b f(x) dx
The function f(⋅) is called the probability density function (pdf).
Probability Density Function
In general, if the support for a continuous random variable is not specified, we assume that
supp(X) = {x ∶ −∞ < x < ∞}
The pdf is nonnegative
f(x) ≥ 0, ∀x
The integral over the support of X is one:
∫_{−∞}^{∞} f(x) dx = 1
Probability Density Function
Since P(X = c) = 0 for any real value c, P(a ≤ X ≤ b)
= P(a < X < b)
= P(a ≤ X < b)
= P(a < X ≤ b)
Probability Mass Function vs. Probability Density Function
pmf (discrete pdf):
f(x): R ↦ [0, 1], f(x) = P(X = x)
pdf:
f(x): R ↦ R₊, f(x) ≠ P(X = x)
That is, density is not probability.
Example 1: Uniform Random Variable
Definition (Uniform Random Variable)
A random variable X is said to have a uniform distribution on the interval [l, h] if its pdf is given by
f(x) = 1/(h − l), l ≤ x ≤ h
We write X ∼ U[l, h].
The probability that X falls in the sub-interval [a, b] ⊆ [l, h] is
P(a ≤ X ≤ b) = (b − a)/(h − l).
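The interval probability can be checked against the defining integral by integrating the constant pdf numerically. A minimal sketch, assuming a U[0, 4] variable and the interval [1, 2] (all names and parameters here are illustrative):

```python
def uniform_pdf(x, l, h):
    """pdf of U[l, h]: constant 1/(h - l) on [l, h], zero elsewhere."""
    return 1.0 / (h - l) if l <= x <= h else 0.0

def prob_interval(a, b, l, h, steps=10_000):
    """Midpoint-rule approximation of P(a <= X <= b) = integral of f over [a, b]."""
    dx = (b - a) / steps
    return sum(uniform_pdf(a + (i + 0.5) * dx, l, h) for i in range(steps)) * dx

l, h, a, b = 0.0, 4.0, 1.0, 2.0
approx = prob_interval(a, b, l, h)
exact = (b - a) / (h - l)     # (2 - 1)/(4 - 0) = 0.25
```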
Uniform Distribution
[Figure: pdf of U[0, 1], plotted with dunif(x, min = 0, max = 1)]
Section 4
Cumulative Distribution Function
Cumulative Distribution Function
An alternative way to specify the probability distribution is to give the probabilities of all events of the form
{X ≤ x}, x ∈ R
For example, what is the probability that the number resulting from rolling a die is smaller than 3.8?
This leads to the following definition of cumulative distribution function.
Cumulative Distribution Function
Definition (Cumulative Distribution Function)
Given any real variable x, a function F(x) ∶ R ↦ [0, 1]:
F(x) = P(X ≤ x)
is called a cumulative distribution function (CDF), or distribution function.
A preferable notation: FX(x)
It should be emphasized that the cumulative distribution function is defined as above for every random variable X, regardless of whether the distribution of X is discrete or continuous.
Cumulative Distribution Function
If X is discrete,
F(x) = P(X ≤ x) = ∑_{u ≤ x} P(X = u) = ∑_{u ≤ x} f(u)
If X is continuous,
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du
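In the discrete case the CDF is a finite sum, which is easy to sketch in code. The example below builds the CDF of X = number of heads in two fair-coin flips (the pmf from the earlier example; function and variable names are our own), and evaluates it at several points to show the step-function behavior:

```python
from fractions import Fraction

# pmf of X = number of heads in two fair-coin flips
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def F(x):
    """Discrete CDF: F(x) = sum of f(u) over all u <= x."""
    return sum(fu for u, fu in pmf.items() if u <= x)

# F is a step function: flat between support points, jumping at each one.
values = [F(-1), F(0), F(0.5), F(1), F(3.8)]
```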
Cumulative Distribution Function
Theorem (Properties of CDF)
Let F(x) be the CDF of a random variable X. Then,
If a < b, then F(a) ≤ F(b), and P(a < X ≤ b) = F(b) − F(a).
lim_{x→−∞} F(x) = 0, and lim_{x→∞} F(x) = 1
F(x) = lim_{δ→0⁺} F(x + δ) (right-continuity)
CDF: Discrete Random Variable
supp(X) = {x1, x2, x3}
[Figure: step-function CDF rising from 0 to 1, with jumps of P(X = x1), P(X = x2), P(X = x3) at the support points; the plotted levels are P(X = x1) and P(X = x1) + P(X = x2)]
CDF: Continuous Random Variable
[Figure: a continuous CDF F(x) increasing smoothly from 0 to 1]
Example 1: Bernoulli(p)
Given the pmf of X∼Bernoulli(p),
f(x) =
  p, x = 1
  1 − p, x = 0
The CDF is
F(x) = P(X ≤ x) =
  0, x < 0
  1 − p, 0 ≤ x < 1
  1, 1 ≤ x
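This piecewise CDF translates directly into code; a minimal sketch (the function name is our own):

```python
def bernoulli_cdf(x, p):
    """CDF of Bernoulli(p): a step function with jumps at x = 0 and x = 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return 1.0 - p
    return 1.0
```

Note that the jumps occur exactly at the support points, and their sizes are the pmf values f(0) = 1 − p and f(1) = p.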
Example 1: Bernoulli(p)
[Figure: CDF of Bernoulli(p): 0 for x < 0, jumps to 1 − p at x = 0, and to 1 at x = 1]
Example 2: Uniform[l, h]
Given X ∼ Uniform[l, h], the pdf is f(x) = 1/(h − l).
The CDF is
F(x) = ∫_l^x f(u) du = ∫_l^x 1/(h − l) du = (x − l)/(h − l), l ≤ x ≤ h
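The full uniform CDF (including the pieces outside [l, h]) can be sketched as below; it also gives an easy check of the property P(a < X ≤ b) = F(b) − F(a) from the theorem above. The function name and the U[0, 4] parameters are illustrative:

```python
def uniform_cdf(x, l, h):
    """CDF of Uniform[l, h]: 0 below l, (x - l)/(h - l) on [l, h], 1 above h."""
    if x < l:
        return 0.0
    if x > h:
        return 1.0
    return (x - l) / (h - l)

# P(1 < X <= 3) = F(3) - F(1) for X ~ U[0, 4]
p_1_to_3 = uniform_cdf(3, 0, 4) - uniform_cdf(1, 0, 4)
```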
Example 2: Uniform[l, h]
[Figure: CDF of Uniform[l, h]: 0 for x < l, linear from 0 to 1 on [l, h], 1 for x > h]
Section 5
Quantiles
Quantiles
Definition (Quantiles)
Let F denote the CDF of a random variable X. The function
π_p = F^{−1}(p) = inf{x ∣ F(x) ≥ p}
is called the 100p-th quantile of X. F^{−1}(⋅) is called the inverse distribution function.
Given p= 0.5, the 50-th quantile, π0.5, is called the median.
If F is strictly increasing,
π_p = F^{−1}(p) = {x ∣ F(x) = p}
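For a discrete random variable F is not strictly increasing, so the infimum definition matters: the quantile is the smallest support point x with F(x) ≥ p. A sketch using the two-coin-flip example from earlier (all names here are our own):

```python
def quantile(p, support, F):
    """pi_p = inf{x : F(x) >= p}, taken over a finite support (smallest such x)."""
    for x in sorted(support):
        if F(x) >= p:
            return x
    raise ValueError("p must lie in (0, 1]")

# X = number of heads in two fair-coin flips
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def F(x):
    return sum(fu for u, fu in pmf.items() if u <= x)

median = quantile(0.5, pmf, F)   # smallest x with F(x) >= 0.5: F(1) = 0.75
p90 = quantile(0.9, pmf, F)      # F(1) = 0.75 < 0.9, F(2) = 1, so pi_0.9 = 2
```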