Statistics
The Normal Distribution and its Applications
Shiu-Sheng Chen
Department of Economics National Taiwan University
Fall 2019
Section 1
Normal Distributions
Shiu-Sheng Chen (NTU Econ) Statistics Fall 2019 2 / 47
Normal Random Variables
The normal distribution is also called the Gaussian distribution, named after the German mathematician Johann Carl Friedrich Gauss (1777–1855).
Normal Random Variables
Definition (Normal Distribution)
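A random variable $X$ has the normal distribution with two parameters $\mu$ and $\sigma^2$ if $X$ has a continuous distribution with the following pdf:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
where $\mathrm{supp}(X) = \{x \mid -\infty < x < \infty\}$ and $\pi \approx 3.14159$. It is denoted by $X \sim N(\mu, \sigma^2)$.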
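Converting from Cartesian to polar coordinates (search for “Gaussian integral”) shows that the pdf integrates to one:
$$\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\, dx = 1$$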
Standard Normal Random Variables
Definition (Standard Normal Distribution)
A random variable $Z$ is called a standard normal random variable if it is normally distributed with $\mu = 0$ and $\sigma = 1$, that is, with pdf
$$\frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}$$
It is denoted by $Z \sim N(0, 1)$.
By convention, we let $\phi$ and $\Phi$ denote the pdf and CDF of a standard normal random variable:
$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}, \qquad \Phi(z) = \int_{-\infty}^{z} \phi(w)\, dw.$$
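The standard normal pdf and CDF can be evaluated numerically. The following is a minimal sketch, not part of the original slides: the function names `phi` and `Phi` are chosen here for illustration, and the CDF uses the identity $\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}(z/\sqrt{2})\right)$. Python's standard-library `statistics.NormalDist` is used only as a cross-check.

```python
import math
from statistics import NormalDist


def phi(z: float) -> float:
    """Standard normal pdf (illustrative helper, name chosen for this sketch)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    std = NormalDist()  # N(0, 1) from the standard library
    for z in (-1.0, 0.0, 1.96):
        # Each row: z, our pdf, library pdf, our CDF, library CDF
        print(z, phi(z), std.pdf(z), Phi(z), std.cdf(z))
```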
Normal Distributions
[Figure: pdfs of N(0, 1) (solid line) and N(0, 9) (dashed line); x-axis: x, y-axis: f(x).]
Skewness $\gamma_3 = 0$; kurtosis $\gamma_4 = 3$.
Stock Returns (S&P500)
[Figure: histogram of stock returns; x-axis: r, y-axis: density.]
Exchange Rate Changes (British Pound)
[Figure: histogram of exchange rate returns; x-axis: rs, y-axis: density.]
Normal vs. Standard Normal
Theorem
Let $Z \sim N(0, 1)$ and $X = \sigma Z + \mu$; then $X \sim N(\mu, \sigma^2)$.
Proof: by the CDF method.
In the same vein, you can show that if $X \sim N(\mu, \sigma^2)$ and $Z = \frac{X-\mu}{\sigma}$, then $Z \sim N(0, 1)$.
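As a sketch of how the CDF method works for this theorem (assuming $\sigma > 0$, which the slide leaves implicit), write the CDF of $X$ in terms of $\Phi$ and differentiate:

```latex
\begin{aligned}
F_X(x) &= P(\sigma Z + \mu \le x)
        = P\!\left(Z \le \frac{x-\mu}{\sigma}\right)
        = \Phi\!\left(\frac{x-\mu}{\sigma}\right), \\
f_X(x) &= \frac{d}{dx} F_X(x)
        = \frac{1}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)
        = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}.
\end{aligned}
```

The last expression is the $N(\mu, \sigma^2)$ pdf from the definition.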
Moment Generating Function
Theorem (MGF)
Let $Z \sim N(0, 1)$; the MGF of $Z$ is
$$M_Z(t) = e^{\frac{1}{2}t^2}$$
Proof: by definition.
It follows that if $X \sim N(\mu, \sigma^2)$, the MGF of $X$ is
$$M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$$
Hence, $E(X) = \mu$ and $Var(X) = \sigma^2$.
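The step from the MGF to the moments can be spelled out as follows. This is a worked sketch using $X = \sigma Z + \mu$ and the usual rule that the $r$-th derivative of the MGF at $t = 0$ gives $E(X^r)$:

```latex
\begin{aligned}
M_X(t)   &= E\!\left[e^{t(\sigma Z + \mu)}\right] = e^{\mu t} M_Z(\sigma t)
          = e^{\mu t + \frac{1}{2}\sigma^2 t^2}, \\
M_X'(t)  &= (\mu + \sigma^2 t)\, M_X(t)
          \;\Rightarrow\; E(X) = M_X'(0) = \mu, \\
M_X''(t) &= \left[\sigma^2 + (\mu + \sigma^2 t)^2\right] M_X(t)
          \;\Rightarrow\; E(X^2) = M_X''(0) = \sigma^2 + \mu^2, \\
Var(X)   &= E(X^2) - [E(X)]^2 = \sigma^2 .
\end{aligned}
```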
Properties
Theorem (Invariance Under Linear Transformations)
If $X \sim N(\mu, \sigma^2)$, then
$$aX + b \sim N(a\mu + b,\ a^2\sigma^2), \quad a \neq 0.$$
Proof: by MGF
Example: a portfolio consisting of a stock and a risk-free asset
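One way to read the portfolio example is the following sketch. It assumes, purely for illustration, that a fraction `w` of wealth is held in a stock with return $R \sim N(\mu, \sigma^2)$ and the rest in a risk-free asset with constant return `rf`; the portfolio return $wR + (1-w)\,rf$ is then a linear transformation with $a = w$ and $b = (1-w)\,rf$. The function name and all numbers are hypothetical and do not come from the slides.

```python
def portfolio_return_distribution(w: float, mu: float, sigma2: float, rf: float):
    """Return (mean, variance) of w*R + (1 - w)*rf when R ~ N(mu, sigma2).

    Applies aX + b ~ N(a*mu + b, a^2 * sigma2) with a = w and b = (1 - w) * rf.
    The result is a normal distribution only when w != 0.
    """
    a = w
    b = (1.0 - w) * rf
    return a * mu + b, a * a * sigma2


if __name__ == "__main__":
    # Hypothetical inputs: 60% in the stock, stock mean 8%, variance 0.04, risk-free 2%.
    mean, var = portfolio_return_distribution(w=0.6, mu=0.08, sigma2=0.04, rf=0.02)
    print(mean, var)
```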
Properties
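Theorem (Sum of I.I.D. Normal Random Variables)
If $\{X_i\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$ and $W = \alpha_1 X_1 + \alpha_2 X_2 + \cdots + \alpha_n X_n$, then
$$W \sim N\left(\mu \sum_{i=1}^{n} \alpha_i,\ \sigma^2 \sum_{i=1}^{n} \alpha_i^2\right).$$
Proof: by MGF
Consider two special cases:
$\alpha_i = 1$ for all $i$
$\alpha_i = \frac{1}{n}$ for all $i$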
Properties
Theorem
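If $\{X_i\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$, then $Y = \sum_{i=1}^{n} X_i \sim N(n\mu,\ n\sigma^2)$.
If $\{X_i\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$, then $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \sim N\left(\mu,\ \frac{\sigma^2}{n}\right)$.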
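The two special cases (the sum and the sample mean) follow from the general weighted-sum result by plugging in the weights. A small sketch, with a hypothetical helper name and illustrative numbers:

```python
def linear_combination_params(alphas, mu: float, sigma2: float):
    """Mean and variance of W = sum(alpha_i * X_i) for i.i.d. X_i ~ N(mu, sigma2)."""
    mean = mu * sum(alphas)
    variance = sigma2 * sum(a * a for a in alphas)
    return mean, variance


if __name__ == "__main__":
    n, mu, sigma2 = 4, 2.0, 9.0  # illustrative values, not from the slides
    # alpha_i = 1: the sum Y, expected to give (n*mu, n*sigma2)
    print(linear_combination_params([1.0] * n, mu, sigma2))
    # alpha_i = 1/n: the sample mean, expected to give (mu, sigma2/n)
    print(linear_combination_params([1.0 / n] * n, mu, sigma2))
```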
Finding Normal Probabilities
When calculating a probability for a normal distribution, we need to transform it to the standard normal distribution, because what we usually have is the N(0, 1) table.
Then calculate $\Phi(a) = P(Z \le a)$.
For instance, if $X \sim N(5, 16)$,
$$P(X \le 3) = P\left(\frac{X-5}{4} \le \frac{3-5}{4}\right) = P(Z \le -0.5) = \Phi(-0.5)$$
The following properties will be helpful:
$P(Z \le 0) = P(Z \ge 0) = 0.5$
$P(Z \le -a) = P(Z \ge a)$
$P(-a \le Z \le 0) = P(0 \le Z \le a)$
The TA will teach you how to use the N(0, 1) table.
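The standardization step can be checked numerically. This sketch uses Python's standard-library `statistics.NormalDist` in place of a printed table (a substitution made here, not something the slides use); note that `NormalDist` takes the standard deviation, so N(5, 16) is `NormalDist(5, 4)`.

```python
from statistics import NormalDist

X = NormalDist(mu=5, sigma=4)  # N(5, 16): sigma is the standard deviation
Z = NormalDist()               # N(0, 1)

p_direct = X.cdf(3)                  # P(X <= 3) computed directly
p_standardized = Z.cdf((3 - 5) / 4)  # Phi(-0.5) after standardizing

if __name__ == "__main__":
    print(p_direct, p_standardized)
    # Symmetry property: P(Z <= -a) = P(Z >= a)
    a = 0.5
    print(Z.cdf(-a), 1 - Z.cdf(a))
```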
Z Table
Using Normal Probabilities to Find Quantiles
Example: Suppose the final grade, X, is normally distributed with mean 70 and standard deviation 10. The instructor wants to give 10% of the class an A+. What cutoff should the instructor use to determine who gets an A+?
Clearly, X ∼ N(70, 100), and we want to find the constant q such that P(X > q) = 0.10 or P(X ≤ q) = 0.90.
Using Z Table
Find $q$ such that
$$0.90 = P(X \le q) = P\left(\frac{X-70}{10} \le \frac{q-70}{10}\right) = P\left(Z \le \frac{q-70}{10}\right)$$
According to the Z Table, $P(Z \le 1.28) \approx 0.90$, so we have
$$\frac{q-70}{10} = 1.28$$
That is,
$$q = 82.8$$
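The same cutoff can be obtained without a table by inverting the CDF. This is a sketch using the standard-library `statistics.NormalDist.inv_cdf`; it gives a slightly more precise quantile than the two-decimal table value 1.28, so the result differs from 82.8 in the second decimal place.

```python
from statistics import NormalDist

# 90th percentile of N(0, 1), the value looked up as 1.28 in the Z table
z90 = NormalDist().inv_cdf(0.90)

# Undo the standardization: q = mu + sigma * z
q = 70 + 10 * z90

# Equivalent one-step computation on N(70, 100)
q_direct = NormalDist(mu=70, sigma=10).inv_cdf(0.90)

if __name__ == "__main__":
    print(z90, q, q_direct)
```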
Example: Mean-Variance Utility
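Suppose that the utility function from wealth $W$ is given by
$$U(W) = c - e^{-bW}, \quad b > 0$$
This utility function is increasing and concave:
$$U'(W) = b e^{-bW} > 0, \qquad U''(W) = -b^2 e^{-bW} < 0$$
We further assume that $W \sim N(\mu, \sigma^2)$; then
$$E[U(W)] = c - E\left[e^{-bW}\right] = c - e^{-b\left(\mu - \frac{b}{2}\sigma^2\right)} = g(\mu, \sigma^2)$$
Hence,
$$\frac{\partial E[U(W)]}{\partial \mu} = g_{\mu} > 0, \qquad \frac{\partial E[U(W)]}{\partial (\sigma^2)} = g_{\sigma^2} < 0$$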
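The closed form for expected utility comes from the normal MGF evaluated at $t = -b$. The sketch below compares it with a direct numerical integration of $U(w) f(w)$ and checks the signs of the two partial effects. The function names, the trapezoid rule, and the parameter values are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def expected_utility(mu: float, sigma2: float, b: float, c: float = 0.0) -> float:
    """Closed form g(mu, sigma2) = c - exp(-b * (mu - (b / 2) * sigma2))."""
    return c - math.exp(-b * (mu - 0.5 * b * sigma2))


def expected_utility_numeric(mu, sigma2, b, c=0.0, n=20001, width=10.0):
    """Trapezoid-rule approximation of E[c - exp(-b W)] for W ~ N(mu, sigma2)."""
    sd = math.sqrt(sigma2)
    lo, hi = mu - width * sd, mu + width * sd
    h = (hi - lo) / (n - 1)
    dist = NormalDist(mu, sd)
    total = 0.0
    for i in range(n):
        w = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0  # half weight at the endpoints
        total += weight * (c - math.exp(-b * w)) * dist.pdf(w)
    return total * h


if __name__ == "__main__":
    mu, sigma2, b = 1.0, 0.25, 2.0  # illustrative values
    print(expected_utility(mu, sigma2, b), expected_utility_numeric(mu, sigma2, b))
    # Higher mean raises expected utility; higher variance lowers it.
    print(expected_utility(mu + 0.1, sigma2, b) > expected_utility(mu, sigma2, b))
    print(expected_utility(mu, sigma2 + 0.05, b) < expected_utility(mu, sigma2, b))
```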
Bivariate Normal Random Variables
Definition (Bivariate Normal Distribution)
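$X$ and $Y$ are bivariate normally distributed if the joint pdf is
$$f_{XY}(x, y) = \frac{e^{\varpi}}{2\pi\sqrt{1-\rho^2}\,\sigma_X \sigma_Y}$$
where
$$\varpi = -\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right) + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right]$$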
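Bivariate Normal Random Variables
It is denoted by
$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim N\left(\begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix},\ \begin{bmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{XY} & \sigma_Y^2 \end{bmatrix}\right)$$
[Figure: bivariate normal density; axes x, y, z.]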
Independent vs. Uncorrelated
Theorem
Suppose that $X$ and $Y$ have a bivariate normal distribution. Then $X$ and $Y$ are independent if and only if
$$Cov(X, Y) = 0.$$
Proof: we only need to show the “if” part. When $Cov(X, Y) = 0$, we have $\rho = 0$, and it can be shown that
$$f_{XY}(x, y) = f_X(x)\, f_Y(y)$$
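The factorization in the proof can be illustrated numerically: with $\rho = 0$ the cross term in the exponent vanishes and the joint pdf equals the product of the two marginal normal pdfs. The sketch below implements the joint pdf from the definition; the function name and test point are hypothetical.

```python
import math
from statistics import NormalDist


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Joint pdf of a bivariate normal with correlation rho (requires |rho| < 1)."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    exponent = -(zx * zx - 2.0 * rho * zx * zy + zy * zy) / (2.0 * (1.0 - rho * rho))
    normalizer = 2.0 * math.pi * math.sqrt(1.0 - rho * rho) * sigma_x * sigma_y
    return math.exp(exponent) / normalizer


if __name__ == "__main__":
    x, y = 0.3, -1.2  # arbitrary evaluation point
    marginal_product = NormalDist(1.0, 2.0).pdf(x) * NormalDist(-1.0, 0.5).pdf(y)
    # rho = 0: joint pdf matches the product of the marginals
    print(bivariate_normal_pdf(x, y, 1.0, -1.0, 2.0, 0.5, 0.0), marginal_product)
    # rho != 0: the two generally differ
    print(bivariate_normal_pdf(x, y, 1.0, -1.0, 2.0, 0.5, 0.5), marginal_product)
```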
Symmetry vs. Asymmetry
The normal distribution is a symmetric distribution.
In some instances, we require a skewed distribution to characterize the data.
For example,
the size of insurance claims
time-until-default (survival time)
We thus introduce the Chi-square random variable to capture such asymmetry.
Section 3
Chi-square Distribution
Chi-square Distribution
Definition (Chi-square Random Variables)
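A random variable $X$ has the Chi-square distribution if the pdf is
$$f(x) = \frac{x^{\frac{k}{2}-1}}{2^{\frac{k}{2}}\,\Gamma\left(\frac{k}{2}\right)}\, e^{-\frac{1}{2}x}, \qquad \mathrm{supp}(X) = \{x \mid 0 < x < \infty\}$$
where $k$ is a positive integer called the degrees of freedom.
$\Gamma(\cdot)$ is called the Gamma function:
$$\Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha-1} e^{-x}\, dx$$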
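The Chi-square pdf can be coded directly from the definition using `math.gamma`. The sketch below also integrates it with a simple trapezoid rule to check that it integrates to about one and has mean about $k$. The helper names and the choice $k = 3$ are illustrative; the crude integrator is not accurate for $k = 1$, where the pdf is unbounded near zero.

```python
import math


def chi2_pdf(x: float, k: int) -> float:
    """Chi-square pdf with k degrees of freedom; zero outside the support x > 0."""
    if x <= 0:
        return 0.0
    half_k = k / 2
    return x ** (half_k - 1) * math.exp(-x / 2) / (2 ** half_k * math.gamma(half_k))


def trapezoid(f, lo: float, hi: float, n: int) -> float:
    """Composite trapezoid rule with n equal subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


if __name__ == "__main__":
    k = 3
    print(trapezoid(lambda x: chi2_pdf(x, k), 0.0, 80.0, 80000))      # close to 1
    print(trapezoid(lambda x: x * chi2_pdf(x, k), 0.0, 80.0, 80000))  # close to k
```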
Chi-square distributions (k = 3)
[Figure: pdf of the Chi-square distribution with k = 3, plotted as dchisq(x, df = 3) for x from 0 to 20.]
Chi-square Distribution
Theorem
The MGF of a Chi-square random variable is
$$M_X(t) = \left(\frac{1}{1-2t}\right)^{\frac{k}{2}}$$
Proof: by the definition of MGF, and let $y = (1-2t)x$.
Hence,
$$E(X) = k, \qquad Var(X) = 2k$$
Chi-square Distribution
Theorem
If $X_i \sim \chi^2(k_i)$ for $i = 1, 2, \ldots, n$, and they are independent, then
$$\sum_{i=1}^{n} X_i \sim \chi^2\left(\sum_{i=1}^{n} k_i\right)$$
That is, the sum $X_1 + X_2 + \cdots + X_n$ has the $\chi^2$ distribution with $k_1 + k_2 + \cdots + k_n$ degrees of freedom.
Proof: by MGF.
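A sketch of the MGF argument: because the $X_i$ are independent, the MGF of the sum is the product of the individual Chi-square MGFs, valid for $t < \frac{1}{2}$:

```latex
M_{\sum_i X_i}(t)
  = \prod_{i=1}^{n} M_{X_i}(t)
  = \prod_{i=1}^{n} \left(\frac{1}{1-2t}\right)^{\frac{k_i}{2}}
  = \left(\frac{1}{1-2t}\right)^{\frac{1}{2}\sum_{i=1}^{n} k_i},
```

which is the MGF of a $\chi^2\left(\sum_{i=1}^{n} k_i\right)$ random variable.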
Chi-square Distribution
The following theorem links the normal distribution and Chi-square distribution.
Theorem
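Let $Z \sim N(0, 1)$. Then the random variable $Y = Z^2 \sim \chi^2(1)$.
Proof:
$$M_{Z^2}(t) = E\left(e^{tZ^2}\right) = \int_{-\infty}^{\infty} e^{tz^2}\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\, dz = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(1-2t)z^2}\, dz = \left(\frac{1}{1-2t}\right)^{\frac{1}{2}}$$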
Chi-square Distributions
Corollary
Suppose $\{Z_1, Z_2, \ldots, Z_k\} \overset{\text{i.i.d.}}{\sim} N(0, 1)$. Let $X = \sum_{i=1}^{k} Z_i^2$. Then $X \sim \chi^2(k)$.
Proof: by the above two theorems.
Degrees of freedom: the number of values in the final calculation of a statistic that are free to vary.
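The corollary suggests a simple simulation check: sum $k$ squared standard normal draws many times and compare the sample mean and variance with the theoretical values $k$ and $2k$. The sketch below is illustrative only; the function name, seed, and sample size are arbitrary choices, and the sample moments will match the theory only approximately.

```python
import random
import statistics


def simulate_chi2(k: int, n_draws: int, seed: int = 0):
    """Draw n_draws values of X = Z_1^2 + ... + Z_k^2 with Z_i i.i.d. N(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n_draws)]


if __name__ == "__main__":
    k = 3
    draws = simulate_chi2(k, 20000)
    print("sample mean:", statistics.fmean(draws), "theory:", k)
    print("sample variance:", statistics.pvariance(draws), "theory:", 2 * k)
```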
Section 4
Student’s t Distribution
Student’s t Distributions
Recall that the kurtosis of a normal random variable is 3.
Monthly S&P 500 stock returns (1957:1–2013:9): kurtosis = 5.51
Daily S&P 500 stock returns (1957/1/2–2013/9/30): kurtosis = 30.75
Fat-tailed/heavy-tailed
Student’s t distributions
The Student’s t distribution was actually published in 1908 by a British statistician, William Sealy Gosset (1876–1937).
William Sealy Gosset
Gosset was employed at the Guinness Brewing Co., which forbade its staff from publishing scientific papers because an earlier paper had contained trade secrets.
To circumvent this restriction, Gosset used the name “Student”, and consequently the distribution was named Student’s t distribution.
Guinness and the 1908 Biometrika Paper
Student’s t distributions
Definition (Student’s t distribution)
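If a random variable $X$ has the following pdf
$$f(x) = \frac{\Gamma\left(\frac{k+1}{2}\right)}{\Gamma\left(\frac{k}{2}\right)}\, \frac{1}{\sqrt{k\pi}} \left(1 + \frac{x^2}{k}\right)^{-\frac{k+1}{2}}$$
with support $\mathrm{supp}(X) = \{x \mid -\infty < x < \infty\}$ and a parameter $k$, then it is said to have a Student’s t distribution, denoted by
$$X \sim t(k)$$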
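The pdf translates directly into code with `math.gamma`. The sketch below (function name chosen here) also checks the $k = 1$ case against the standard Cauchy density $\frac{1}{\pi(1+x^2)}$, a well-known special case. Very large $k$ would overflow `math.gamma`, so this version is only meant for moderate degrees of freedom.

```python
import math


def t_pdf(x: float, k: float) -> float:
    """Student's t pdf with k degrees of freedom (moderate k only)."""
    coef = math.gamma((k + 1) / 2) / (math.gamma(k / 2) * math.sqrt(k * math.pi))
    return coef * (1 + x * x / k) ** (-(k + 1) / 2)


if __name__ == "__main__":
    x = 1.5
    # k = 1 compared with the standard Cauchy density
    print(t_pdf(x, 1), 1 / (math.pi * (1 + x * x)))
    # Density at zero approaches the N(0, 1) value as k grows
    print(t_pdf(0.0, 5), t_pdf(0.0, 200), 1 / math.sqrt(2 * math.pi))
```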
Student’s t distributions
Theorem
Given two independent random variables $Z \sim N(0, 1)$ and $W \sim \chi^2(k)$,
$$U = \frac{Z}{\sqrt{W/k}} \sim t(k)$$
Proof: beyond the scope of this course.
Student’s t distribution
Given $U \sim t(k)$
Moments:
$$E(U) = 0 \ \text{when } k > 1, \qquad Var(U) = E(U^2) = \frac{k}{k-2} \ \text{when } k > 2.$$
Note that given $W \sim \chi^2(k)$,
$$E\left(\frac{1}{W}\right) = \frac{1}{k-2}$$
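The construction $U = Z / \sqrt{W/k}$ and the moment formulas can be checked by simulation. In this sketch, $W$ is generated as a sum of $k$ squared standard normals, and the sample mean and variance of $U$ are compared with $0$ and $\frac{k}{k-2}$. The function name, seed, $k = 10$, and sample size are illustrative assumptions.

```python
import math
import random
import statistics


def simulate_t(k: int, n_draws: int, seed: int = 0):
    """Draw U = Z / sqrt(W / k) with Z ~ N(0, 1) and W ~ chi-square(k), independent."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z = rng.gauss(0.0, 1.0)
        w = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))  # chi-square(k) draw
        draws.append(z / math.sqrt(w / k))
    return draws


if __name__ == "__main__":
    k = 10
    draws = simulate_t(k, 20000)
    print("sample mean:", statistics.fmean(draws), "theory:", 0.0)
    print("sample variance:", statistics.pvariance(draws), "theory:", k / (k - 2))
```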
Student’s t distributions
The t distribution is a symmetric distribution.
Limiting distribution:
$$t(k) \longrightarrow N(0, 1) \ \text{as } k \longrightarrow \infty.$$
Special case: for $k = 1$, $E[t(1)] = \infty - \infty$ (undefined).
$t(1)$ is called a standard Cauchy distribution.
Comparison: t(2) vs. N(0,1)
Clearly, the Student’s t distribution has a fat tail.
[Figure: pdfs of N(0, 1) and t(2); y-axis: dnorm(x).]
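The fat tail can be quantified by comparing upper-tail probabilities. The sketch below relies on a closed-form CDF for $t(2)$, $F(x) = \frac{1}{2} + \frac{x}{2\sqrt{2 + x^2}}$, which is a standard result not stated in the slides; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def t2_upper_tail(x: float) -> float:
    """P(T > x) for T ~ t(2), from the closed-form CDF 1/2 + x / (2 * sqrt(2 + x^2))."""
    return 0.5 - x / (2.0 * math.sqrt(2.0 + x * x))


def normal_upper_tail(x: float) -> float:
    """P(Z > x) for Z ~ N(0, 1)."""
    return 1.0 - NormalDist().cdf(x)


if __name__ == "__main__":
    for x in (1.0, 2.0, 3.0, 4.0):
        # The t(2) tail probability stays far larger than the normal one as x grows
        print(x, t2_upper_tail(x), normal_upper_tail(x))
```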
Section 5
F Distributions
F Distribution
Definition (F Distribution)
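A random variable $X$ has the F distribution with $n_1$ and $n_2$ degrees of freedom if the probability density function is
$$f(x) = \frac{\Gamma\left(\frac{n_1+n_2}{2}\right)}{\Gamma\left(\frac{n_1}{2}\right)\Gamma\left(\frac{n_2}{2}\right)} \left(\frac{n_1}{n_2}\right)^{\frac{n_1}{2}} x^{\frac{n_1}{2}-1} \left(1 + \frac{n_1}{n_2}x\right)^{-\frac{n_1+n_2}{2}}$$
with $\mathrm{supp}(X) = \{x \mid 0 \le x < \infty\}$. It is denoted by $X \sim F(n_1, n_2)$.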
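A direct transcription of the F pdf into code is sketched below, again with `math.gamma`. The function name is chosen here, and for simplicity the sketch only evaluates the density for $x > 0$, returning zero otherwise; that simplification ignores the boundary value at $x = 0$, which depends on $n_1$.

```python
import math


def f_pdf(x: float, n1: float, n2: float) -> float:
    """F(n1, n2) pdf for x > 0; returns 0.0 for x <= 0 (boundary simplified)."""
    if x <= 0:
        return 0.0
    coef = math.gamma((n1 + n2) / 2) / (math.gamma(n1 / 2) * math.gamma(n2 / 2))
    return (coef
            * (n1 / n2) ** (n1 / 2)
            * x ** (n1 / 2 - 1)
            * (1 + n1 / n2 * x) ** (-(n1 + n2) / 2))


if __name__ == "__main__":
    # Density of F(10, 10) at a few points
    for x in (0.5, 1.0, 2.0):
        print(x, f_pdf(x, 10, 10))
```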
R.A. Fisher and G. W. Snedecor
Sir Ronald A. Fisher (1890–1962), British statistician, evolutionary biologist, eugenicist, and geneticist
George W. Snedecor (1881–1974), American mathematician and statistician
F Distribution (n1 = 10, n2 = 10)
[Figure: pdf of F(10, 10), plotted as df(x, df1 = 10, df2 = 10).]
F Distribution
Theorem
Let $W_1$ and $W_2$ be independent Chi-square random variables, $W_1 \sim \chi^2(n_1)$ and $W_2 \sim \chi^2(n_2)$; then
$$X = \frac{W_1/n_1}{W_2/n_2} \sim F(n_1, n_2)$$
Proof: beyond the scope of this course.
F Distribution
Theorem
If $t \sim t(k)$, then
$$t^2 \sim F(1, k)$$
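A short sketch of why this holds, using the representations stated in the earlier theorems (with $Z \sim N(0,1)$ and $W \sim \chi^2(k)$ independent):

```latex
t = \frac{Z}{\sqrt{W/k}}
\quad\Longrightarrow\quad
t^2 = \frac{Z^2 / 1}{W / k},
\qquad Z^2 \sim \chi^2(1),\; W \sim \chi^2(k) \text{ independent}
\quad\Longrightarrow\quad
t^2 \sim F(1, k).
```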