Academic year: 2022

A Proof of Lemma 3

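Using the notation $\hat v = Y_{n-1}v/\|Y_{n-1}v\|$ and $A_n = x_n x_n^\top$, one can follow the analysis in Balsubramani et al. (2013) to show that $\Phi_n^{(v)} \le \Phi_{n-1}^{(v)} + \beta_n - Z_n$, with

• $\beta_n = 5\gamma_n^2 + 2\gamma_n^3$,

• $Z_n = 2\gamma_n\bigl(\hat v^\top U U^\top A_n \hat v - \|U^\top \hat v\|^2\, \hat v^\top A_n \hat v\bigr)$, and

• $\mathbb{E}[Z_n \mid \mathcal{F}_{n-1}] \ge 2\gamma_n(\lambda - \hat\lambda)\,\Phi_{n-1}^{(v)}\bigl(1 - \Phi_{n-1}^{(v)}\bigr) \ge 0$.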

We omit the proof here as the adaptation is straightforward.

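It remains to show our better bound on $|Z_n|$. For this, note that

\[ |Z_n| \le 2\gamma_n \bigl\| \hat v^\top U U^\top - \|U^\top \hat v\|^2\, \hat v^\top \bigr\| \cdot \|A_n \hat v\|, \]

where $\|A_n \hat v\| \le 1$ and

\[ \bigl\| \hat v^\top U U^\top - \|U^\top \hat v\|^2\, \hat v^\top \bigr\|^2 = \|U^\top \hat v\|^2 - 2\|U^\top \hat v\|^4 + \|U^\top \hat v\|^4 = \|U^\top \hat v\|^2 \bigl(1 - \|U^\top \hat v\|^2\bigr). \]

As $\|U^\top \hat v\|^2 \le 1$ and $1 - \|U^\top \hat v\|^2 = \Phi_{n-1}^{(v)}$, we have

\[ |Z_n| \le 2\gamma_n \sqrt{\Phi_{n-1}^{(v)}}. \]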
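The norm identity used in this bound can be checked numerically. The sketch below is an illustration only, not part of the original proof: it assumes that $U$ consists of the first $k$ standard basis vectors of $\mathbb{R}^d$ (one convenient matrix with orthonormal columns), and the helper names `norm_identity_sides` and `random_unit_vector` are hypothetical.

```python
import math
import random


def norm_identity_sides(v_hat, k):
    """Return (lhs, rhs) of ||UU^T v - c v||^2 = c (1 - c), with c = ||U^T v||^2.

    Assumption (for illustration): U holds the first k standard basis
    vectors, so U^T v is the first k coordinates of v and UU^T v zeroes
    out the remaining coordinates. v_hat must be a unit vector.
    """
    c = sum(x * x for x in v_hat[:k])  # c = ||U^T v_hat||^2
    proj = [x if j < k else 0.0 for j, x in enumerate(v_hat)]  # UU^T v_hat
    diff = [p - c * x for p, x in zip(proj, v_hat)]
    lhs = sum(y * y for y in diff)
    rhs = c * (1.0 - c)
    return lhs, rhs


def random_unit_vector(d, rng):
    """Draw a Gaussian vector in R^d and normalize it to unit length."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


rng = random.Random(0)
v_hat = random_unit_vector(6, rng)
lhs, rhs = norm_identity_sides(v_hat, k=2)
print(abs(lhs - rhs) < 1e-12)
```

Since $\|A_n \hat v\| \le 1$, the square root of the right-hand side is exactly the factor $\sqrt{\Phi_{n-1}^{(v)}(1 - \Phi_{n-1}^{(v)})} \le \sqrt{\Phi_{n-1}^{(v)}}$ that appears in the bound on $|Z_n|$.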

B Proof of Lemma 4

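Assume that the event $\Gamma_0$ holds and consider any $n \in [n_0, n_1)$. We need the following, which we prove in Appendix B.1.

Proposition 1. For any $n > m$ and any $v \in \mathbb{R}^k$,

\[ \frac{\|U^\top Y_n v\|}{\|Y_n\|} \ge \left(\frac{m}{n}\right)^{3c} \cdot \frac{\|U^\top Y_m v\|}{\|Y_m\|}. \]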

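From Proposition 1, we know that for any $v \in S$,

\[ \frac{\|U^\top Y_n v\|}{\|Y_n v\|} \ge \frac{\|U^\top Y_n v\|}{\|Y_n\|} \ge \left(\frac{n_0}{n}\right)^{3c} \frac{\|U^\top Y_0 v\|}{\|Y_0\|}, \]

where $(n_0/n)^{3c} \ge (n_0/n_1)^{3c} \ge (1/c_1)^{3c}$ for the constant $c_1$ given in Remark 1. As $Y_0 = Q_0$ and $\|Q_0\| = 1 = \|Q_0 v\|$, we obtain

\[ \frac{\|U^\top Y_n v\|}{\|Y_n v\|} \ge \frac{\|U^\top Q_0 v\|}{c_1^{3c}\,\|Q_0 v\|} \ge \frac{\sqrt{1 - \rho_0}}{c_1^{3c}} = \sqrt{\frac{\bar c}{c_1^{6c}\, k d}}. \]

Therefore, assuming $\Gamma_0$, we always have

\[ \Phi_n = \max_v \left(1 - \frac{\|U^\top Y_n v\|^2}{\|Y_n v\|^2}\right) \le 1 - \frac{\bar c}{c_1^{6c}\, k d} = \rho_1. \]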

B.1 Proof of Proposition 1

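Recall that for any $n$, $Y_n = Y_{n-1} + \gamma_n x_n x_n^\top Y_{n-1}$ and $\|x_n x_n^\top\| \le 1$. Then for any $v \in \mathbb{R}^k$,

\[ \frac{\|U^\top Y_n v\|}{\|Y_n\|} \ge \frac{\|U^\top Y_{n-1} v\| - \gamma_n \|U^\top Y_{n-1} v\|}{\|Y_{n-1}\| + \gamma_n \|Y_{n-1}\|}, \]

which is

\[ \frac{1 - \gamma_n}{1 + \gamma_n} \cdot \frac{\|U^\top Y_{n-1} v\|}{\|Y_{n-1}\|} \ge e^{-3\gamma_n}\, \frac{\|U^\top Y_{n-1} v\|}{\|Y_{n-1}\|}, \]

using the facts that $1 - x \ge e^{-2x}$ for $0 \le x \le 1/2$, that $1 + x \le e^{x}$, and that $\gamma_n \le 1/2$.

Then by induction, we have

\[ \frac{\|U^\top Y_n v\|}{\|Y_n\|} \ge e^{-3\sum_{t=m+1}^{n} \gamma_t} \cdot \frac{\|U^\top Y_m v\|}{\|Y_m\|}. \]

The Proposition follows as

\[ e^{-3\sum_{t=m+1}^{n} \gamma_t} = e^{-3c \sum_{t=m+1}^{n} \frac{1}{t}} \ge \left(\frac{m}{n}\right)^{3c}, \]

using the fact that $\sum_{t=m+1}^{n} \frac{1}{t} \le \int_m^n \frac{1}{x}\,dx = \ln\frac{n}{m}$.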
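The two elementary inequalities behind Proposition 1 are easy to test numerically. The following sketch is a sanity check under the step-size form $\gamma_t = c/t$ used in the proof; the function names `step_factor_ok` and `product_bound_ok` are hypothetical and the parameter values are arbitrary examples.

```python
import math


def step_factor_ok(gamma):
    """Check (1 - gamma) / (1 + gamma) >= exp(-3 * gamma) for one step size."""
    return (1.0 - gamma) / (1.0 + gamma) >= math.exp(-3.0 * gamma)


def product_bound_ok(m, n, c):
    """Check exp(-3c * sum_{t=m+1}^{n} 1/t) >= (m/n)^(3c), assuming gamma_t = c/t."""
    harmonic = sum(1.0 / t for t in range(m + 1, n + 1))
    return math.exp(-3.0 * c * harmonic) >= (m / n) ** (3.0 * c)


# Per-step inequality on a grid of step sizes in [0, 1/2].
print(all(step_factor_ok(g / 1000.0) for g in range(0, 501)))
# Telescoped inequality for one example choice of m, n, and c.
print(product_bound_ok(m=10, n=1000, c=2.0))
```

The second check relies only on the comparison of the harmonic sum with $\ln(n/m)$, so it holds for any $0 < m < n$ and $c > 0$.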

C Proof of Lemma 5

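According to Lemma 3, our $\Phi_n^{(v)}$'s satisfy the same recurrence relation as the functions $\Psi_n$'s of Balsubramani et al. (2013). We can therefore have the following, which we prove in Appendix C.1.

Lemma 9. Let $\hat\rho_i = \rho_i / \lceil e^{5/c_0} \rceil^{c_0(1-\rho_i)}$. Then for any $u \in S$ and $\alpha_i \ge 12c^2/n_{i-1}$,

\[ \Pr\left[\, \sup_{n \ge n_i} \Phi_n^{(u)} \ge \hat\rho_i + \alpha_i \;\middle|\; \Gamma_i \right] \le e^{-\Omega\left((\alpha_i^2/(c^2\rho_i))\, n_{i-1}\right)}. \]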

Our goal is to bound $\Pr[\neg\Gamma_{i+1} \mid \Gamma_i]$, which is

\[ \Pr\left[\, \exists v \in S : \sup_{n_i \le n < n_{i+1}} \Phi_n^{(v)} \ge \rho_{i+1} \;\middle|\; \Gamma_i \right]. \]

As discussed before, we cannot directly apply a union bound on the bound in Lemma 9 as there are infinitely many $v$'s in $S$. Instead, we look for a small "$\epsilon$-net" $D_i$ of $S$, with the property that any $v \in S$ has some $u \in D_i$ with $\|v - u\| \le \epsilon$. Such a $D_i$ with $|D_i| \le (1/\epsilon)^{O(k)}$ is known to exist (see e.g. Milman and Schechtman (1986)).

Then what we need is that when $v$ and $u$ are close, $\Phi_n^{(v)}$ and $\Phi_n^{(u)}$ are close as well. This is guaranteed by the following, which we prove in Appendix C.2.

Lemma 10. Suppose $\Gamma_i$ happens. Then for any $n \in [n_i, n_{i+1})$, any $\epsilon \le \sqrt{1 - \rho_i}/(2c_1^{6c})$, and any $u, v \in S$ with $\|u - v\| \le \epsilon$, we have

\[ \left| \Phi_n^{(v)} - \Phi_n^{(u)} \right| \le 16\, c_1^{6c}\, \epsilon / \sqrt{1 - \rho_i}. \]


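According to this, we can choose $\alpha_i = (\rho_{i+1} - \hat\rho_i)/2$ and $\epsilon = \alpha_i \sqrt{1 - \rho_i}/(16 c_1^{6c})$ so that with $\|u - v\| \le \epsilon$, we have $|\Phi_n^{(v)} - \Phi_n^{(u)}| \le \alpha_i$. This means that given any $v \in S$ with $\Phi_n^{(v)} \ge \rho_{i+1}$, there exists some $u \in D_i$ with $\Phi_n^{(u)} \ge \rho_{i+1} - \alpha_i = \hat\rho_i + \alpha_i$. As a result, we can now apply a union bound over $D_i$ and have

\[ \Pr[\neg\Gamma_{i+1} \mid \Gamma_i] \le \sum_{u \in D_i} \Pr\left[\, \sup_{n \ge n_i} \Phi_n^{(u)} \ge \hat\rho_i + \alpha_i \;\middle|\; \Gamma_i \right]. \qquad (7) \]

To bound this further, consider the following two cases.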

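First, for the case of $i < \pi_1$, we have $\rho_i \ge 3/4$ and $\eta_i = 1 - \rho_i \le 1/4$, so that

\[ \hat\rho_i \le \rho_i e^{-5(1-\rho_i)} = (1 - \eta_i) e^{-5\eta_i} \le e^{-6\eta_i} \le 1 - 3\eta_i. \]

Then $\alpha_i \ge ((1 - 2\eta_i) - (1 - 3\eta_i))/2 = \eta_i/2$, which is at least $12c^2/n_{i-1}$, as $\eta_i \ge \eta_1 \ge \bar c/(c_1^{6c} k d)$ and $n_{i-1} \ge n_0 = \hat c\, c\, k^3 d^2 \log d$ for a large enough constant $\hat c$. Therefore, we can apply Lemma 9 and the bound in (7) becomes

\[ \left(\frac{c_1^{c}}{\eta_i}\right)^{O(k)} e^{-\Omega\left((\eta_i^2/c^2)\, n_{i-1}\right)} \le \frac{\delta_0}{2(i+1)^2}. \]

Next, for the case of $i \ge \pi_1$, we have $\rho_i \le 3/4$ so that

\[ \hat\rho_i \le \rho_i / \lceil e^{5/c_0} \rceil^{c_0/4} \le \rho_i / \lceil e^{5/c_0} \rceil^{3}, \]

as $c_0 \ge 12$ by assumption. Since $\rho_{i+1} \ge \rho_i / \lceil e^{5/c_0} \rceil^{2}$, this gives us $\alpha_i \ge \rho_i \bigl(\lceil e^{5/c_0} \rceil^{-2} - \lceil e^{5/c_0} \rceil^{-3}\bigr)/2$, which is at least $12c^2/n_{i-1}$, as $\rho_i$, according to our choice, is about $c_2 (c^3 k \log n_{i-1})/(n_{i-1} + 1)$ for a large enough constant $c_2$. Thus, we can apply Lemma 9 and the bound in (7) becomes

\[ \left(\frac{c_1^{c}}{\rho_i}\right)^{O(k)} e^{-\Omega\left((\rho_i/c^2)\, n_{i-1}\right)} \le \frac{\delta_0}{2(i+1)^2}. \qquad (8) \]

This completes the proof of Lemma 5.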
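The chain of inequalities in the first case ($\rho_i \ge 3/4$) can be verified on a grid. This is a minimal sketch, assuming the definition $\hat\rho_i = \rho_i / \lceil e^{5/c_0} \rceil^{c_0(1-\rho_i)}$ from Lemma 9; the helper names `rho_hat` and `first_case_chain_ok` and the example value $c_0 = 12$ are illustrative choices.

```python
import math


def rho_hat(rho, c0):
    """rho_hat = rho / ceil(e^{5/c0})^{c0 * (1 - rho)}, as defined in Lemma 9."""
    base = math.ceil(math.exp(5.0 / c0))
    return rho / base ** (c0 * (1.0 - rho))


def first_case_chain_ok(eta, c0):
    """Check rho_hat <= (1-eta) e^{-5 eta} <= e^{-6 eta} <= 1 - 3 eta.

    Here rho = 1 - eta, and the chain is claimed for 0 <= eta <= 1/4.
    """
    rho = 1.0 - eta
    a = rho_hat(rho, c0)
    b = (1.0 - eta) * math.exp(-5.0 * eta)
    c = math.exp(-6.0 * eta)
    d = 1.0 - 3.0 * eta
    tol = 1e-12  # slack for floating-point rounding
    return a <= b + tol and b <= c + tol and c <= d + tol


# Grid over eta in [0, 1/4] with the smallest allowed c0.
print(all(first_case_chain_ok(j / 400.0, 12.0) for j in range(0, 101)))
```

The first inequality only uses $\lceil e^{5/c_0} \rceil \ge e^{5/c_0}$, the second uses $1 - \eta \le e^{-\eta}$, and the third is where the restriction $\eta_i \le 1/4$ matters.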

C.1 Proof of Lemma 9

By Lemma 3, the random variables $\Phi_n^{(v)}$'s satisfy the same recurrence relation of Balsubramani et al. (2013) for their random variables $\Phi_n$'s. Thus, we can follow their analysis (in particular, their proofs for Lemma 2.9 and Lemma 2.10), but use our better bound on $|Z_n|$, and have the following.

First, when given $\Gamma_i$, we have $|Z_n| \le 2\gamma_n \sqrt{\rho_i}$ for $n_{i-1} \le n < n_i$. Then one can easily modify the analysis in Balsubramani et al. (2013) to show that for any $t \ge 0$,

\[ \mathbb{E}\left[ e^{t\Phi_{n_i}^{(v)}} \;\middle|\; \Gamma_i \right] \le \exp\left( t\hat\rho_i + c^2 (6t + 2t^2\rho_i) \left( \frac{1}{n_{i-1}} - \frac{1}{n_i} \right) \right), \]

by noting that $(n_i + 1)/(n_{i-1} + 1) = \lceil e^{5/c_0} \rceil$ and $n \ge n_0 = \hat c\, c\, k^3 d^2 \log d$ according to our choice of parameters.

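Next, following Balsubramani et al. (2013) and applying Doob's martingale inequality, we obtain

\[ \Pr\left[\, \sup_{n \ge n_i} \Phi_n^{(v)} \ge \hat\rho_i + \alpha_i \;\middle|\; \Gamma_i \right] \le \mathbb{E}\left[ e^{t\Phi_{n_i}^{(v)}} \;\middle|\; \Gamma_i \right] \exp\left( -t(\hat\rho_i + \alpha_i) + \frac{c^2}{n_i}(6t + 2t^2\rho_i) \right) \]

\[ \le \exp\left( -t\alpha_i + \frac{c^2}{n_{i-1}}(6t + 2t^2\rho_i) \right) \le \exp\left( -\frac{t\alpha_i}{2} + \frac{2c^2 t^2 \rho_i}{n_{i-1}} \right), \]

as $\alpha_i \ge \frac{12c^2}{n_{i-1}}$. Finally, by choosing $t = \frac{\alpha_i n_{i-1}}{8c^2\rho_i}$, we have the lemma.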
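The final choice of $t$ is the minimizer of the quadratic exponent $-t\alpha_i/2 + 2c^2t^2\rho_i/n_{i-1}$, which evaluates to $-\alpha_i^2 n_{i-1}/(32c^2\rho_i)$ and so matches the $\Omega(\cdot)$ rate in Lemma 9. The sketch below checks this numerically; the function names and the example parameter values are hypothetical and chosen only for illustration.

```python
def exponent(t, alpha, rho, n_prev, c):
    """Exponent -t*alpha/2 + 2*c^2*t^2*rho/n_prev from the last bound in the proof."""
    return -t * alpha / 2.0 + 2.0 * c * c * t * t * rho / n_prev


def best_t(alpha, rho, n_prev, c):
    """The choice t = alpha * n_prev / (8 * c^2 * rho) made in the proof."""
    return alpha * n_prev / (8.0 * c * c * rho)


def closed_form(alpha, rho, n_prev, c):
    """Value of the exponent at best_t: -alpha^2 * n_prev / (32 * c^2 * rho)."""
    return -alpha * alpha * n_prev / (32.0 * c * c * rho)


# Example values (assumptions for illustration only).
alpha, rho, n_prev, c = 0.1, 0.5, 1000.0, 2.0
t_star = best_t(alpha, rho, n_prev, c)
value = exponent(t_star, alpha, rho, n_prev, c)
print(abs(value - closed_form(alpha, rho, n_prev, c)) < 1e-12)
# Nearby choices of t give a larger (worse) exponent.
print(all(exponent(t_star * s, alpha, rho, n_prev, c) >= value for s in (0.5, 0.9, 1.1, 2.0)))
```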

C.2 Proof of Lemma 10

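Assume without loss of generality that $\Phi_n^{(v)} \le \Phi_n^{(u)}$ (otherwise, we switch $v$ and $u$), so that

\[ \left| \Phi_n^{(v)} - \Phi_n^{(u)} \right| = \frac{\|U^\top Y_n v\|^2}{\|Y_n v\|^2} - \frac{\|U^\top Y_n u\|^2}{\|Y_n u\|^2}. \]

As $\|v - u\| \le \epsilon$, we have

\[ \frac{\|U^\top Y_n v\|}{\|Y_n v\|} \le \frac{\|U^\top Y_n u\| + \epsilon \|U^\top Y_n\|}{\|Y_n u\| - \epsilon \|Y_n\|}. \qquad (9) \]

To relate this to $\frac{\|U^\top Y_n u\|^2}{\|Y_n u\|^2}$, we would like to express $\epsilon\|U^\top Y_n\|$ in terms of $\|U^\top Y_n u\|$ and $\epsilon\|Y_n\|$ in terms of $\|Y_n u\|$. For this, note that both $\|U^\top Y_n u\|/\|U^\top Y_n\|$ and $\|Y_n u\|/\|Y_n\|$ are at least $\|U^\top Y_n u\|/\|Y_n\|$, which by Proposition 1 is at least

\[ \left(\frac{n_{i-1}}{n}\right)^{3c} \frac{\|U^\top Y_{n_{i-1}} u\|}{\|Y_{n_{i-1}}\|} \ge c_1^{-6c}\, \frac{\|U^\top Y_{n_{i-1}} u\|}{\|Y_{n_{i-1}}\|}, \qquad (10) \]

using the fact that $n_{i-1}/n \ge n_{i-1}/n_{i+1} \ge 1/c_1^2$. Then as $Y_{n_{i-1}} = Q_{n_{i-1}}$ and $\|Q_{n_{i-1}}\| = \|Q_{n_{i-1}} u\|$, the right-hand side of (10) becomes

\[ c_1^{-6c}\, \frac{\|U^\top Q_{n_{i-1}} u\|}{\|Q_{n_{i-1}} u\|} = c_1^{-6c} \sqrt{1 - \Phi_{n_{i-1}}^{(u)}} \ge c_1^{-6c} \sqrt{1 - \rho_i}, \]

given $\Gamma_i$. What we have obtained so far is a lower bound for both $\|U^\top Y_n u\|/\|U^\top Y_n\|$ and $\|Y_n u\|/\|Y_n\|$. Plugging this into (9), with $\hat\epsilon = c_1^{6c}\epsilon/\sqrt{1 - \rho_i}$, we get

\[ \frac{\|U^\top Y_n v\|}{\|Y_n v\|} \le \frac{\|U^\top Y_n u\|(1 + \hat\epsilon)}{\|Y_n u\|(1 - \hat\epsilon)}. \]

As a result, we have

\[ \left| \Phi_n^{(v)} - \Phi_n^{(u)} \right| \le \frac{\|U^\top Y_n u\|^2}{\|Y_n u\|^2} \left( \frac{(1 + \hat\epsilon)^2}{(1 - \hat\epsilon)^2} - 1 \right) \le 16\hat\epsilon, \]

since $\frac{(1+\hat\epsilon)^2}{(1-\hat\epsilon)^2} - 1 \le \frac{4\hat\epsilon}{(1-\hat\epsilon)^2} \le 16\hat\epsilon$ for $\hat\epsilon \le 1/2$.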
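The last step rests on the identity $(1+\hat\epsilon)^2 - (1-\hat\epsilon)^2 = 4\hat\epsilon$ together with $(1-\hat\epsilon)^2 \ge 1/4$ for $\hat\epsilon \le 1/2$. A short numerical check, written as a sketch with hypothetical function names, is:

```python
def ratio_gap(e):
    """(1 + e)^2 / (1 - e)^2 - 1 for 0 <= e < 1."""
    return (1.0 + e) ** 2 / (1.0 - e) ** 2 - 1.0


def middle_term(e):
    """4e / (1 - e)^2, the intermediate quantity in the final inequality."""
    return 4.0 * e / (1.0 - e) ** 2


def final_step_ok(e, tol=1e-12):
    """Check ratio_gap(e) <= middle_term(e) <= 16 e, with slack for rounding."""
    return ratio_gap(e) <= middle_term(e) + tol and middle_term(e) <= 16.0 * e + tol


# Grid over e in [0, 1/2]; the bound is tight at e = 1/2.
print(all(final_step_ok(j / 1000.0) for j in range(0, 501)))
```

The restriction $\hat\epsilon \le 1/2$ is exactly what the condition $\epsilon \le \sqrt{1-\rho_i}/(2c_1^{6c})$ in Lemma 10 guarantees.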


D Proof of Lemma 7

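As $\cos(U, Q_{i-1})^2 = \frac{1}{1 + \tan(U, Q_{i-1})^2} \ge \frac{1}{1 + \varepsilon_{i-1}^2} \ge \beta_i^2$, we have $\|G_i\| \le 4\beta_i \le 4\cos(U, Q_{i-1})$. Thus, we can apply Lemma 6 and have

\[ \tan(U, AQ_{i-1} + G_i) \le \max(\beta_i, \max(\beta_i, \gamma)\,\varepsilon_{i-1}), \]

which is at most $\max(\beta_i, \gamma\varepsilon_{i-1}) \le \gamma\varepsilon_{i-1} = \varepsilon_i$. The lemma follows as $\tan(U, Q_i) = \tan(U, AQ_{i-1} + G_i)$.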

E Proof of Lemma 8

Let $\rho = 4\beta_i$ and note that $\|G_i\| \le \|A - F_i\|$, where $F_i$ is the average of $|I_i|$ i.i.d. random matrices, each with mean $A$. Recall that $\|A\| \le 1$ by Assumption 1. Then from a matrix Chernoff bound, we have

\[ \Pr[\|G_i\| > \rho] \le \Pr[\|A - F_i\| > \rho] \le d\, e^{-\Omega(\rho^2 |I_i|)} \le \delta_i, \]

for $|I_i|$ given in (3).

F Proof of Lemma 9

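Let $L$ be the iteration number such that $\varepsilon_{L-1} > \varepsilon$ and $\varepsilon_L \le \varepsilon$. Note that with $\varepsilon_L = \varepsilon_0 \gamma^L = \varepsilon_0 (1 - (\lambda - \bar\lambda)/\lambda)^{L/4} \le \varepsilon_0 e^{-L(\lambda - \bar\lambda)/(4\lambda)}$, we can have

\[ L \le O\left( \frac{\lambda}{\lambda - \bar\lambda} \log\frac{\varepsilon_0}{\varepsilon} \right) \le O\left( \frac{\lambda}{\lambda - \bar\lambda} \log\frac{d}{\varepsilon} \right). \]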

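As the number of samples in iteration $i$ is

\[ |I_i| = O\left( \frac{\log(d/\delta_i)}{(\lambda - \bar\lambda)^2 \beta_i^2} \right) \le O\left( \frac{\log(di)}{(\lambda - \bar\lambda)^2 \beta_i^2} \right), \]

the total number of samples needed is

\[ \sum_{i=1}^{L} |I_i| \le O\left( \frac{\log(dL)}{(\lambda - \bar\lambda)^2} \right) \cdot \sum_{i=1}^{L} \frac{1}{\beta_i^2}. \]

With $\beta_i = \min\bigl(\gamma/\sqrt{1 + \varepsilon_{i-1}^2},\ \gamma\varepsilon_{i-1}\bigr)$, one sees that for some $i_0 \le O(\log d)$, $\beta_i = \gamma/\sqrt{1 + \varepsilon_{i-1}^2}$ when $i \le i_0$ and $\beta_i = \gamma\varepsilon_{i-1} = \varepsilon_i$ when $i > i_0$. This implies that

\[ \sum_{i=1}^{L} \frac{1}{\beta_i^2} = \sum_{i=1}^{i_0} \frac{1 + \varepsilon_{i-1}^2}{\gamma^2} + \sum_{i=i_0+1}^{L} \frac{1}{\varepsilon_i^2}, \qquad (11) \]

where the first sum in the right-hand side of (11) is

\[ \frac{i_0}{\gamma^2} + \sum_{i=1}^{i_0} \varepsilon_0^2 \gamma^{2i-4} \le \frac{O(\log d)}{\gamma^2} + \frac{\varepsilon_0^2}{\gamma^2 (1 - \gamma^2)}, \]

while the second sum is

\[ \sum_{i=i_0+1}^{L} \frac{\gamma^{2(L-i)}}{\varepsilon_L^2} \le \frac{1}{(1 - \gamma^2)\,\varepsilon_L^2} \le \frac{1}{\gamma^2 (1 - \gamma^2)\,\varepsilon^2}, \]

using the fact that $\varepsilon_L = \gamma\varepsilon_{L-1} \ge \gamma\varepsilon$. Since $\gamma^2 = \left(1 - \frac{\lambda - \bar\lambda}{\lambda}\right)^{1/2} \le 1 - \frac{\lambda - \bar\lambda}{2\lambda}$, we have $\frac{1}{1 - \gamma^2} \le \frac{2\lambda}{\lambda - \bar\lambda}$, and since $\lambda \le O(\bar\lambda)$, we also have $\frac{1}{\gamma^2} \le O(1)$. Moreover, as we assume that $\varepsilon \le 1/\sqrt{kd}$, we can conclude that the total number of samples needed is at most

\[ \sum_{i=1}^{L} |I_i| \le O\left( \frac{\log(dL)}{(\lambda - \bar\lambda)^2} \right) \cdot O\left( \frac{\lambda}{(\lambda - \bar\lambda)\,\varepsilon^2} \right) \le O\left( \frac{\lambda \log(dL)}{\varepsilon^2 (\lambda - \bar\lambda)^3} \right). \]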

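To make the counting argument concrete, the sketch below generates the sequences $\varepsilon_i = \varepsilon_0\gamma^i$ and $\beta_i$ for example parameters and compares $\sum_i 1/\beta_i^2$ with the sum of the two bounds derived from (11). It is an illustration under stated assumptions: $\gamma = (\bar\lambda/\lambda)^{1/4}$ as implied by the expression for $\varepsilon_L$, the cruder term $L/\gamma^2$ in place of $i_0/\gamma^2$ (a simplification made here, not in the proof), and hypothetical function names and parameter values.

```python
import math


def sample_schedule(lam, lam_bar, eps0, eps):
    """Return (gamma, L, inv_beta_sq) where inv_beta_sq[i-1] = 1 / beta_i^2.

    Assumptions taken from the proof: gamma = (lam_bar / lam)^(1/4),
    eps_i = eps0 * gamma^i, beta_i = min(gamma / sqrt(1 + eps_{i-1}^2),
    gamma * eps_{i-1}), and L is the first i with eps_i <= eps.
    Requires 0 < lam_bar < lam and eps > 0 so that the loop terminates.
    """
    gamma = (lam_bar / lam) ** 0.25
    inv_beta_sq = []
    eps_prev = eps0  # eps_{i-1} for the upcoming iteration i
    L = 0
    while eps_prev > eps:
        beta = min(gamma / math.sqrt(1.0 + eps_prev ** 2), gamma * eps_prev)
        inv_beta_sq.append(1.0 / beta ** 2)
        eps_prev *= gamma
        L += 1
    return gamma, L, inv_beta_sq


def sum_bound(gamma, L, eps0, eps):
    """L/gamma^2 + eps0^2/(gamma^2 (1-gamma^2)) + 1/(gamma^2 (1-gamma^2) eps^2)."""
    g2 = gamma * gamma
    return L / g2 + eps0 ** 2 / (g2 * (1.0 - g2)) + 1.0 / (g2 * (1.0 - g2) * eps ** 2)


# Example values (assumptions for illustration only).
gamma, L, inv_beta_sq = sample_schedule(lam=1.0, lam_bar=0.5, eps0=10.0, eps=0.1)
print(L, sum(inv_beta_sq) <= sum_bound(gamma, L, 10.0, 0.1))
```

Multiplying the resulting sum by the per-iteration factor $\log(dL)/(\lambda - \bar\lambda)^2$ gives the total sample count up to constants, as in the final display of the proof.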
References

Balsubramani, A., Dasgupta, S., and Freund, Y. (2013). The fast convergence of incremental PCA. In Advances in Neural Information Processing Systems.

Milman, V. D. and Schechtman, G. (1986). Asymptotic theory of finite-dimensional normed spaces. Lecture Notes in Mathematics. Springer.
