Machine Learning (NTU, Fall 2009) instructor: Hsuan-Tien Lin

Homework #0

TA in charge: Chun-Wei Liu RELEASE DATE: 09/14/2009

DUE DATE: NONE

1 Probability and Statistics

(1) (combinatorics)

Let C(N, K) = 1 for K = 0 or K = N, and C(N, K) = C(N − 1, K) + C(N − 1, K − 1) for N ≥ 1.

Prove that C(N, K) = N! / (K!(N − K)!) for N ≥ 1 and 0 ≤ K ≤ N.

(2) (counting)

What is the probability of getting exactly 6 heads when flipping 10 fair coins?

What is the probability of getting a full house (XXXYY) when randomly drawing 5 cards out of a deck of 52 cards?
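A short sketch for sanity-checking both counting questions with exact fractions. It assumes the standard reading of the questions (each coin flip and each 5-card hand is equally likely); the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

# Exactly 6 heads in 10 fair flips: choose which 6 flips are heads,
# out of 2**10 equally likely outcomes.
p_six_heads = Fraction(comb(10, 6), 2 ** 10)

# Full house (XXXYY): pick the rank and suits of the triple, then the
# rank and suits of the pair, out of all 5-card hands.
full_house_hands = 13 * comb(4, 3) * 12 * comb(4, 2)
p_full_house = Fraction(full_house_hands, comb(52, 5))

print(p_six_heads, float(p_six_heads))
print(p_full_house, float(p_full_house))
```

Working with `Fraction` keeps the results exact, so they can be compared directly with a hand-derived answer.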

(3) (conditional probability)

If your friend flipped a fair coin three times and tells you that one of the tosses resulted in heads, what is the probability that all three tosses resulted in heads?

(4) (Bayes theorem)

A program selects a random integer X like this: a random bit is first generated uniformly. If the bit is 0, X is drawn uniformly from {0, 1, . . . , 7}; otherwise, X is drawn uniformly from {0, −1, −2, −3}.

If we get an X from the program with |X| = 1, what is the probability that X is negative?
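One way to check a hand computation here is to enumerate the joint distribution of the bit and X, then condition on |X| = 1. The sketch below does that with exact fractions; the dictionary layout and names are assumptions made for illustration.

```python
from fractions import Fraction

# Joint distribution P(bit, X) built directly from the description.
joint = {}
for bit, support in ((0, range(0, 8)), (1, (0, -1, -2, -3))):
    values = list(support)
    for x in values:
        joint[(bit, x)] = Fraction(1, 2) * Fraction(1, len(values))

# Condition on the event |X| = 1, then ask how much of that mass has X < 0.
evidence = sum(p for (_, x), p in joint.items() if abs(x) == 1)
negative = sum(p for (_, x), p in joint.items() if abs(x) == 1 and x < 0)
posterior = negative / evidence
print(posterior)
```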

(5) (union/intersection)

If P (A) = 0.3 and P (B) = 0.4,

what is the maximum possible value of P (A ∩ B)?

what is the minimum possible value of P (A ∩ B)?

what is the maximum possible value of P (A ∪ B)?

what is the minimum possible value of P (A ∪ B)?

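(6) (mean/variance)

Let the mean be $\bar{X} = \frac{1}{N} \sum_{n=1}^{N} X_n$ and the variance be $\sigma_X^2 = \frac{1}{N-1} \sum_{n=1}^{N} (X_n - \bar{X})^2$. Prove that

$\sigma_X^2 = \frac{N}{N-1} \left( \frac{1}{N} \sum_{n=1}^{N} X_n^2 - \bar{X}^2 \right)$.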
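The identity can be checked numerically before attempting the proof. The sketch below computes the variance both from the definition and from the rearranged form on a random sample; the sample size and seed are arbitrary choices, and agreement up to rounding is of course not a proof.

```python
import random

def variance_two_ways(xs):
    """Return (definition, rearranged form) of the sample variance."""
    n = len(xs)
    mean = sum(xs) / n
    by_definition = sum((x - mean) ** 2 for x in xs) / (n - 1)
    rearranged = n / (n - 1) * (sum(x * x for x in xs) / n - mean ** 2)
    return by_definition, rearranged

random.seed(0)  # arbitrary seed, only for reproducibility
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
lhs, rhs = variance_two_ways(sample)
print(abs(lhs - rhs) < 1e-9)
```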

(7) (Gaussian distribution)

If X₁ and X₂ are independent random variables, where p(X₁) is Gaussian with mean 2 and variance 1 and p(X₂) is Gaussian with mean −3 and variance 4, let Z = X₁ + X₂. Prove that p(Z) is Gaussian, and determine its mean and variance.

2 Linear Algebra

(1) (rank)

What is the rank of

$\begin{bmatrix} 1 & 2 & 1 \\ 1 & 0 & 3 \\ 1 & 1 & 2 \end{bmatrix}$?

(2) (inverse)

What is the inverse of

$\begin{bmatrix} 0 & 2 & 4 \\ 2 & 4 & 2 \\ 3 & 3 & 1 \end{bmatrix}$?

(3) (eigenvalues/eigenvectors)

What are the eigenvalues and eigenvectors of

$\begin{bmatrix} 3 & 1 & 1 \\ 2 & 4 & 2 \\ -1 & -1 & 1 \end{bmatrix}$?

(4) (singular value decomposition)

For a real matrix M, let M = UΣV^T be its singular value decomposition. Define M^† = VΣ^†U^T, where Σ^†[i][j] = 1/Σ[i][j] when Σ[i][j] is nonzero, and 0 otherwise. Prove that MM^†M = M.

(5) (PD/PSD)

A symmetric real matrix A is positive definite (PD) iff x^TAx > 0 for all x ≠ 0, and positive semi-definite (PSD) if “>” is changed to “≥”. Prove:

(a) For any real matrix Z, ZZ^T is PSD.

(b) A is PD iff all eigenvalues of A are strictly positive.

(6) (inner product)

Consider x ∈ R^d and some u ∈ R^d with ‖u‖ = 1.

What is the maximum value of u^Tx?

What is the minimum value of u^Tx?

What is the minimum value of |u^Tx|?

(7) (distance)

Consider two parallel hyperplanes in R^d:

H₁: w^Tx = +3, H₂: w^Tx = −2,

where w is the normal vector. What is the distance between H₁ and H₂?
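For two parallel hyperplanes w^Tx = b₁ and w^Tx = b₂, the distance is |b₁ − b₂| / ‖w‖, which follows from projecting the difference of any two points (one on each plane) onto w/‖w‖. The sketch below evaluates that formula; the function name and the concrete `w` are hypothetical, since the question leaves w symbolic.

```python
import math

def hyperplane_distance(w, b1, b2):
    """Distance between {x : w.x = b1} and {x : w.x = b2}."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(b1 - b2) / norm

# Hypothetical normal vector; the question leaves w symbolic.
w = [3.0, 4.0]
print(hyperplane_distance(w, 3.0, -2.0))
```

Note that the answer depends on ‖w‖, so it stays symbolic unless w is normalized.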

3 Calculus

(1) (differential)

Let $f(x) = \ln(1 + e^{-2x})$. What is $\frac{df(x)}{dx}$?

(2) (partial differential)

Let $f(x, y) = e^{x} + e^{2y} + e^{3xy^2}$. What is $\frac{\partial f(x, y)}{\partial y}$?

(3) (chain rule)

Let $f(x, y) = xy$, $x(u, v) = \cos(u + v)$, $y(u, v) = \sin(u - v)$. What is $\frac{\partial f}{\partial v}$?

(4) (integral)

What is $\int_{5}^{10} \frac{2}{x - 3} \, dx$?
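A numerical quadrature gives a quick check on the antiderivative. The sketch below assumes the integrand is 2/(x − 3) on [5, 10] and compares composite Simpson's rule with the value 2 ln((10 − 3)/(5 − 3)) obtained from the antiderivative 2 ln|x − 3|; `simpson` is a helper written for this illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

numeric = simpson(lambda x: 2.0 / (x - 3.0), 5.0, 10.0)
closed_form = 2.0 * math.log((10.0 - 3.0) / (5.0 - 3.0))
print(numeric, closed_form)
```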

(5) (gradient and Hessian)

Let $E(u, v) = (u e^{v} - 2 v e^{-u})^2$. Calculate the gradient $\nabla E$ and the Hessian $\nabla^2 E$ at u = 1 and v = 1.
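Analytic derivatives for this problem are easy to get wrong, so a finite-difference check can be useful. The sketch below approximates the gradient and Hessian of E at (1, 1) with central differences, assuming E(u, v) = (u e^v − 2 v e^(−u))²; the step sizes are rough choices, so expect agreement with a correct analytic result only to a few decimal places.

```python
import math

def E(u, v):
    return (u * math.exp(v) - 2.0 * v * math.exp(-u)) ** 2

def gradient(f, u, v, h=1e-5):
    """Central-difference approximation of (df/du, df/dv)."""
    du = (f(u + h, v) - f(u - h, v)) / (2 * h)
    dv = (f(u, v + h) - f(u, v - h)) / (2 * h)
    return du, dv

def hessian(f, u, v, h=1e-4):
    """Central-difference approximation of the 2x2 Hessian."""
    f0 = f(u, v)
    uu = (f(u + h, v) - 2 * f0 + f(u - h, v)) / h ** 2
    vv = (f(u, v + h) - 2 * f0 + f(u, v - h)) / h ** 2
    uv = (f(u + h, v + h) - f(u + h, v - h)
          - f(u - h, v + h) + f(u - h, v - h)) / (4 * h ** 2)
    return [[uu, uv], [uv, vv]]

grad_at_1_1 = gradient(E, 1.0, 1.0)
hess_at_1_1 = hessian(E, 1.0, 1.0)
print(grad_at_1_1)
print(hess_at_1_1)
```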

(6) (optimization)

For some given A > 0, B > 0, solve

$\min_{\alpha} \; A e^{\alpha} + B e^{-2\alpha}$.
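Since the objective is a sum of convex exponentials, it has a single minimum, which makes a numerical cross-check straightforward. The sketch below minimizes it by ternary search for hypothetical values of A and B and compares the result with the stationary point found by setting the derivative A e^α − 2B e^(−2α) to zero; the search bracket and iteration count are arbitrary assumptions.

```python
import math

def objective(alpha, A, B):
    return A * math.exp(alpha) + B * math.exp(-2.0 * alpha)

def ternary_search_min(f, lo, hi, iters=200):
    """Minimize a convex (unimodal) function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

A, B = 1.0, 4.0  # hypothetical values; the question keeps A and B symbolic
numeric = ternary_search_min(lambda a: objective(a, A, B), -10.0, 10.0)
stationary = math.log(2.0 * B / A) / 3.0  # root of A e^a - 2B e^(-2a) = 0
print(numeric, stationary)
```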
