
Logarithmic Sobolev constants for some finite Markov chains

From the computation of the logarithmic Sobolev constants in Section 2.4, one can see that different models need different tricks. In this chapter, we concentrate on the calculation of the logarithmic Sobolev constant for the simple random walks on the n-cycle. In Section 3.1, we focus on the even cycles and explicitly determine their logarithmic Sobolev constants. Thereafter, an application for collapsing a cycle is introduced. In Section 3.2, we implement another trick to determine the logarithmic Sobolev constant of the 5-cycle.

3.1 The simple random walk on an even cycle

For n ≥ 2, consider the simple random walk on the n-cycle Z_n = {1, 2, ..., n}. Clearly, the corresponding Markov kernel K_n is given by K_n(x, x ± 1) = 1/2, and the uniform distribution on Z_n is its unique stationary distribution. (For n = 2, we consider the case K(1, 2) = K(2, 1) = 1 and K(1, 1) = K(2, 2) = 0. By Corollary 2.7, α = λ/2 = 1.) Throughout this section, we assume that n ≥ 3.

Let λ_n and α_n be the spectral gap and the logarithmic Sobolev constant of K_n. It has been shown in Example 1.1 that λ_n = 1 − cos(2π/n), and in Corollary 2.8 and Theorem 2.5 that

α_3 = 1/(2 log 2) < λ_3/2 = 3/4,  α_4 = λ_4/2 = 1/2.
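These formulas are easy to confirm numerically. The sketch below (a check, not part of any argument in this chapter; the helper names are ad hoc) estimates the spectral gap of the n-cycle walk with plain power iteration on (I + K)/2 restricted to mean-zero functions, where all eigenvalues lie in [0, 1].

```python
import math
import random

def cycle_kernel(n):
    # Simple random walk on the n-cycle: K(x, x +/- 1) = 1/2.
    K = [[0.0] * n for _ in range(n)]
    for x in range(n):
        K[x][(x + 1) % n] += 0.5
        K[x][(x - 1) % n] += 0.5
    return K

def spectral_gap(n, iters=3000):
    # Power iteration on (I + K)/2 restricted to mean-zero functions.
    # Its largest eigenvalue there is (1 + cos(2*pi/n))/2 = mu, and the
    # spectral gap of I - K is recovered as 2*(1 - mu).
    K = cycle_kernel(n)
    rng = random.Random(1)
    v = [rng.random() for _ in range(n)]

    def step(v):
        return [0.5 * v[x] + 0.5 * sum(K[x][y] * v[y] for y in range(n))
                for x in range(n)]

    for _ in range(iters):
        m = sum(v) / n
        v = [t - m for t in v]                 # project off constants
        v = step(v)
        norm = math.sqrt(sum(t * t for t in v))
        v = [t / norm for t in v]
    mu = sum(a * b for a, b in zip(v, step(v)))  # Rayleigh quotient
    return 2 * (1 - mu)

for n in (3, 4, 6, 12):
    assert abs(spectral_gap(n) - (1 - math.cos(2 * math.pi / n))) < 1e-8
```

The small-cycle constants above can be checked the same way; note in particular that 1/(2 log 2) ≈ 0.721 is indeed strictly below λ_3/2 = 3/4.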

3.1.1 The main result

The following is the main result of this section. This is a joint work with Yuan-Chung Sheu and has been polished in [6].

Theorem 3.1. For n > 2, let K_n be the Markov kernel of the simple random walk on the n-cycle. Assume that n is even. Then the spectral gap λ_n = λ(K_n) and the logarithmic Sobolev constant α_n = α(K_n) satisfy α_n = λ_n/2 = (1/2)(1 − cos(2π/n)).

The following is a simple application of the above theorem.

Corollary 3.1. For n ≥ 3, let K_n be a Markov kernel on Z_n defined by K_n(i, i − 1) = p, K_n(i, i) = r and K_n(i, i + 1) = q for i ∈ Z_n, where p + q + r = 1. Then the spectral gap λ_n and the logarithmic Sobolev constant α_n satisfy

α_n = λ_n/2 = ((1 − r)/2)(1 − cos(2π/n)).

Proof. Let K̃_n be the Markov kernel of the simple random walk on Z_n, and let E and Ẽ be the Dirichlet forms of K_n and K̃_n. Obviously, both K_n and K̃_n have the same stationary distribution, the uniform distribution on Z_n. By Lemma 1.1, one has E(f, f) = (1 − r)Ẽ(f, f) for any function f on Z_n and then, by definition, λ_n = (1 − r)λ̃_n and α_n = (1 − r)α̃_n.

We will prove Theorem 3.1 in the next subsection. Here, we first consider the ratio E(f, f)/L(f) and, by studying the Dirichlet form, restrict the minimizer (if any) for the logarithmic Sobolev constant to a specific class of functions. For any function f = (f(1), ..., f(n)) = (x_1, ..., x_n), we have

L(f) = (1/n) Σ_{i=1}^n x_i^2 log(x_i^2/‖f‖_2^2)   (3.1)

and

E(f, f) = (1/(2n))(|x_1 − x_2|^2 + |x_2 − x_3|^2 + · · · + |x_{n−1} − x_n|^2 + |x_n − x_1|^2).   (3.2)

It is obvious that the uniformity of the stationary distribution πn of Kn implies the invariance of L(f ) under the permutation of the components of f . We now investigate the extreme value of E over all permutations on the components of f .

Consider the function

F(x) = |x_1 − x_2|^2 + |x_2 − x_3|^2 + · · · + |x_{n−1} − x_n|^2 + |x_n − x_1|^2,   (3.3)

where x = (x_1, x_2, ..., x_n) ∈ R^n. To every x = (x_1, x_2, ..., x_n) with 0 ≤ x_1 ≤ x_2 ≤ · · · ≤ x_n, there corresponds an element x̃ ∈ R^n given by the formula

x̃ = (x_1, x_3, x_5, ..., x_{2k+1}, x_{2k}, ..., x_4, x_2) if n = 2k + 1,
x̃ = (x_1, x_3, x_5, ..., x_{2k−1}, x_{2k}, ..., x_4, x_2) if n = 2k.   (3.4)

Denote by S_n the set of all permutations of {1, 2, ..., n} and write θx = (x_{θ(1)}, x_{θ(2)}, ..., x_{θ(n)}) for θ ∈ S_n and x ∈ R^n.

Proposition 3.1. For every x = (x_1, x_2, ..., x_n) with 0 ≤ x_1 ≤ x_2 ≤ · · · ≤ x_n, we have F(θx) ≥ F(x̃) for all θ ∈ S_n.
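Proposition 3.1 can be spot-checked by brute force for small n: among all arrangements of a fixed multiset around the cycle, the "zigzag" order x̃ of (3.4) minimizes F. The sketch below (illustration only; the function names are ad hoc) does this for random inputs.

```python
import itertools
import random

def F(x):
    # F(x) = sum of squared differences around the cycle, Eq. (3.3)
    n = len(x)
    return sum((x[i] - x[(i + 1) % n]) ** 2 for i in range(n))

def zigzag(x):
    # The arrangement x~ of Eq. (3.4): odd-ranked entries ascending,
    # followed by even-ranked entries descending.
    return list(x[0::2]) + list(x[1::2])[::-1]

rng = random.Random(0)
for n in range(3, 8):
    x = sorted(rng.random() for _ in range(n))
    best = min(F(p) for p in itertools.permutations(x))
    assert abs(best - F(zigzag(x))) < 1e-12
```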

Proof. We prove this by induction on n. There is nothing to prove in the case n = 2. Assume that it is also true for n = k. We consider the case n = k + 1 and fix x = (x1, x2, ..., xk+1) where 0 ≤ x1 ≤ x2 ≤ · · · ≤ xk+1.

Step 1. Set y = (x_1, x_2, ..., x_k) and consider the corresponding vector ỹ given by (3.4). For every i = 1, 2, ..., k − 2, set

ỹ_{i,i+2} = (x_1, x_3, ..., x_i, x_{k+1}, x_{i+2}, ..., x_4, x_2) if i is odd,
ỹ_{i,i+2} = (x_1, x_3, ..., x_{i+2}, x_{k+1}, x_i, ..., x_4, x_2) if i is even.   (3.5)

Thus ỹ_{i,i+2} is obtained by inserting x_{k+1} into ỹ between x_i and x_{i+2}. We also set

ỹ_{1,2} = (x_1, x_3, ..., x_4, x_2, x_{k+1})

and

ỹ_{k−1,k} = (x_1, x_3, ..., x_k, x_{k+1}, x_{k−1}, ..., x_4, x_2) if k is odd,
ỹ_{k−1,k} = (x_1, x_3, ..., x_{k−1}, x_{k+1}, x_k, ..., x_4, x_2) if k is even.   (3.6)

We claim that

F(ỹ_{1,2}) ≥ F(ỹ_{k−1,k})   (3.7)

and

F(ỹ_{i,i+2}) ≥ F(ỹ_{k−1,k}) for all i = 1, 2, ..., k − 2.   (3.8)

Note that for 1 ≤ i ≤ k − 2, a simple computation shows

F(ỹ_{i,i+2}) = F(ỹ) + (x_i − x_{k+1})^2 + (x_{k+1} − x_{i+2})^2 − (x_i − x_{i+2})^2.   (3.9)

Therefore, for 1 ≤ i ≤ k − 4, we get

F(ỹ_{i,i+2}) − F(ỹ_{i+2,i+4}) = [(x_i − x_{k+1})^2 + (x_{k+1} − x_{i+2})^2 − (x_i − x_{i+2})^2]
− [(x_{i+2} − x_{k+1})^2 + (x_{k+1} − x_{i+4})^2 − (x_{i+2} − x_{i+4})^2]
= 2(x_{k+1} − x_{i+2})(x_{i+4} − x_i) ≥ 0.   (3.10)

Besides, we also have

F(ỹ_{k−2,k}) − F(ỹ_{k−1,k}) = [(x_{k+1} − x_{k−2})^2 + (x_{k+1} − x_k)^2 − (x_{k−2} − x_k)^2]
− [(x_{k+1} − x_{k−1})^2 + (x_{k+1} − x_k)^2 − (x_k − x_{k−1})^2]
= 2(x_{k+1} − x_k)(x_{k−1} − x_{k−2}) ≥ 0   (3.11)

and

F(ỹ_{k−3,k−1}) − F(ỹ_{k−1,k}) = 2(x_{k+1} − x_{k−1})(x_k − x_{k−3}) ≥ 0.   (3.12)

Combining (3.10), (3.11) and (3.12) gives (3.8). To prove (3.7), it suffices to show that F(ỹ_{1,2}) ≥ F(ỹ_{1,3}), and this follows easily from the following computation:

F(ỹ_{1,2}) − F(ỹ_{1,3}) = [(x_1 − x_{k+1})^2 + (x_{k+1} − x_2)^2 − (x_1 − x_2)^2]
− [(x_1 − x_{k+1})^2 + (x_{k+1} − x_3)^2 − (x_1 − x_3)^2]
= 2(x_{k+1} − x_1)(x_3 − x_2) ≥ 0.

Step 2. We prove that for every θ ∈ S_{k+1},

F(θx) ≥ F(ỹ_{k−1,k}) = F(x̃).   (3.13)

Fix θ ∈ S_{k+1} and set c = θx. It loses no generality to write c = (..., x_i, x_{k+1}, x_j, ...) for some i < j, and let z = (..., x_i, x_j, ...) ∈ R^k be obtained by removing the component x_{k+1} from the vector c. Then, for 1 ≤ j ≤ k − 2, we have

F(c) − F(ỹ_{j,j+2})
= [F(z) + (x_i − x_{k+1})^2 + (x_j − x_{k+1})^2 − (x_i − x_j)^2]
− [F(ỹ) + (x_j − x_{k+1})^2 + (x_{k+1} − x_{j+2})^2 − (x_j − x_{j+2})^2]
= F(z) − F(ỹ) + 2(x_{k+1} − x_j)(x_{j+2} − x_i) ≥ 0.   (3.14)

(The last inequality applies the inductive assumption that F(z) ≥ F(ỹ).) For j = k − 1, we have

F(c) − F(ỹ_{k−1,k})
= [F(z) + (x_i − x_{k+1})^2 + (x_{k−1} − x_{k+1})^2 − (x_i − x_{k−1})^2]
− [F(ỹ) + (x_k − x_{k+1})^2 + (x_{k+1} − x_{k−1})^2 − (x_k − x_{k−1})^2]
= F(z) − F(ỹ) + 2(x_k − x_i)(x_{k+1} − x_{k−1}) ≥ 0.

For j = k, we have

F(c) − F(ỹ_{k−1,k})
= [F(z) + (x_k − x_{k+1})^2 + (x_i − x_{k+1})^2 − (x_i − x_k)^2]
− [F(ỹ) + (x_k − x_{k+1})^2 + (x_{k+1} − x_{k−1})^2 − (x_k − x_{k−1})^2]
= F(z) − F(ỹ) + 2(x_{k−1} − x_i)(x_{k+1} − x_k) ≥ 0.   (3.15)

Therefore (3.13) follows from (3.14)-(3.15) and (3.8).

The following is a consequence of Proposition 3.1 and is critical in computing the logarithmic Sobolev constant of the simple random walk on the n-cycle.

Corollary 3.2. For n ≥ 3, let α_n be the logarithmic Sobolev constant of the simple random walk on the n-cycle. Assume that there exists a positive non-constant function f such that α_n = E(f, f)/L(f). Let 0 ≤ x_1 ≤ x_2 ≤ · · · ≤ x_n be the components of f and f̃ = (x_1, x_3, ..., x_4, x_2). Then the Euler-Lagrange equation (2.3) is satisfied with α = α_n and u = f̃. Furthermore, α_n = E(f̃, f̃)/L(f̃).

3.1.2 Proof of Theorem 3.1

In this subsection, we prove Theorem 3.1. Throughout, n is even and n ≥ 4. The strategy is first to verify, by contradiction, that there is no positive non-constant function u and α < λ_n/2 satisfying the Euler-Lagrange equation (2.3). Our main result then follows from Corollary 2.4. Before starting the proof, we derive a series of lemmas using combinatorial arguments.

Define the shifting operator σ by

σ(x_1, x_2, ..., x_n) = (x_n, x_1, x_2, ..., x_{n−1}),   (3.16)

where x = (x_1, x_2, ..., x_n) ∈ R^n. Set σ^j(x) = σ(σ^{j−1}(x)) for j ≥ 2 and write σ^{−j} for the inverse of σ^j.

Lemma 3.1. Consider a vector of the form

u = (x_1, x_3, ..., x_{2k−1}, x_{2k}, ..., x_4, x_2),

where x_1 ≤ x_2 ≤ ... ≤ x_{2k}, and write σ^j(u) = ((σ^j(u))_1, (σ^j(u))_2, ..., (σ^j(u))_{2k}). Then for every 1 ≤ j ≤ k − 1, we have

(σ^j(u))_i ≤ (σ^j(u))_{2k−i+1}, for i = 1, ..., k,   (3.17)

and

(σ^{−j}(u))_i ≥ (σ^{−j}(u))_{2k−i+1}, for i = 1, ..., k.   (3.18)

Proof. Assume 1 ≤ j ≤ k − 1. Then we have

(σ^j(u))_i = x_{2(j−i+1)} if 1 ≤ i ≤ j;
(σ^j(u))_i = x_{2(i−j)−1} if j + 1 ≤ i ≤ j + k;
(σ^j(u))_i = x_{2k−2[i−(j+k+1)]} if j + k + 1 ≤ i ≤ 2k.

Case 1: 1 ≤ i ≤ j ∧ (k − j). Since i ≤ k − j, we get 2k − i + 1 ≥ k + j + 1 and (σ^j(u))_{2k−i+1} = x_{2(i+j)}, which implies

(σ^j(u))_i = x_{2(j−i+1)} ≤ x_{2(i+j)} = (σ^j(u))_{2k−i+1}.

Case 2: j ∨ (k − j) < i ≤ k. Note that (k − j) < i ≤ k implies k + 1 ≤ 2k − i + 1 ≤ k + j. Hence, we have

(σ^j(u))_i = x_{2(i−j)−1},  (σ^j(u))_{2k−i+1} = x_{2(2k−i−j)+1}.

Since 2(2k − i − j) + 1 ≥ 2(i − j) − 1, we get (σ^j(u))_i ≤ (σ^j(u))_{2k−i+1}.

Case 3: j ∧ (k − j) < i ≤ j ∨ (k − j). It is obvious that only the situation j ≠ k − j needs to be considered. On one hand, if j < k − j, then j < i ≤ k − j and 2k − i + 1 ≥ j − k + 2k + 1 = k + j + 1. This implies

(σ^j(u))_i = x_{2(i−j)−1} ≤ x_{2(i+j)} = (σ^j(u))_{2k−i+1}.

On the other hand, if k − j < j, then k − j < i ≤ j. By this fact, we have

(σ^j(u))_i = x_{2(j−i+1)} ≤ x_{2(2k−i−j)+1} = (σ^j(u))_{2k−i+1}.

Combining all above proves (3.17). The proof of (3.18) can be done by similar arguments.

Lemma 3.2. Let u = (u_1, u_2, ..., u_{2k−1}, u_{2k}) be a vector with u_i > 0 for all 1 ≤ i ≤ 2k. Assume that there exist two positive constants, c and d, such that

2u_i − (u_{i−1} + u_{i+1}) = c u_i log(d u_i^2)   (3.19)

for all i = 1, ..., 2k. (Here we write u_0 = u_{2k} and u_{2k+1} = u_1.) Then:

(a) If u_i ≤ u_{2k−i+1} for all 1 ≤ i ≤ k, then we have

u_1^2 − u_{2k}^2 + u_k^2 − u_{k+1}^2 ≥ c[(u_1^2 + · · · + u_k^2) − (u_{k+1}^2 + · · · + u_{2k}^2)].

(b) If u_i ≥ u_{2k−i+1} for all 1 ≤ i ≤ k, then we have

u_{2k}^2 − u_1^2 + u_{k+1}^2 − u_k^2 ≥ c[(u_{k+1}^2 + · · · + u_{2k}^2) − (u_1^2 + · · · + u_k^2)].

Proof. For (a), assume that u_i ≤ u_{2k−i+1} for all 1 ≤ i ≤ k. For every 1 ≤ i ≤ k, we rewrite (3.19) as

2 − (u_{i−1} + u_{i+1})/u_i = c log(d u_i^2).

Multiplying (3.19) at index i by u_{2k−i+1}, multiplying it at index 2k − i + 1 by u_i, and subtracting, a simple computation implies

(u_i u_{2k−i+2} − u_{i−1} u_{2k−i+1}) + (u_i u_{2k−i} − u_{i+1} u_{2k−i+1}) = c u_i u_{2k−i+1} log(u_i^2/u_{2k−i+1}^2).   (3.20)

Hence, by (3.20) and the elementary inequality 2st log(s/t) ≥ s^2 − t^2 for 0 < s ≤ t, we have

(u_i u_{2k−i+2} − u_{i−1} u_{2k−i+1}) + (u_i u_{2k−i} − u_{i+1} u_{2k−i+1}) ≥ c(u_i^2 − u_{2k−i+1}^2)

for all i = 1, ..., k. The desired inequality is obtained by summing up both sides of the above inequality over all 1 ≤ i ≤ k.

For (b), assume that u_i ≥ u_{2k−i+1} for all 1 ≤ i ≤ k. For every i, set v_i = u_{2k−i+1}. Then our result follows by applying (a) to the vector v = (v_1, v_2, ..., v_{2k}).

Lemma 3.3. Consider the following k × k matrices:

A =

and

The proof of (b) is the same as that of (a), except that θ_l is replaced with 2lπ/(2k + 1).

Lemma 3.4. (a) Consider the following system of inequalities:

A_j − A_{j+1} ≥ 4t(A_1 + · · · + A_j), j = 1, ..., k − 1,
A_k ≥ 2t(A_1 + · · · + A_k).   (3.21)

If t < (1/2)(1 − cos(π/(2k))), then the system (3.21) has no solution (A_1, A_2, ..., A_k) with A_1 < 0.

(b) Consider the following system of inequalities:

A_j − A_{j+1} ≥ 4t(A_1 + · · · + A_j), j = 1, ..., k − 1,
A_k ≥ 4t(A_1 + · · · + A_k).   (3.22)

If t < (1/2)(1 − cos(π/(2k + 1))), then the system (3.22) has no solution (A_1, A_2, ..., A_k) with A_1 < 0.

Proof. For (a), let f_1(t) = 2 − 4t and g_1(t) = 4t. For every 1 ≤ l ≤ k − 1, put

f_{l+1}(t) = (1 − 4t)f_l(t) − g_l(t)   (3.23)

and

g_{l+1}(t) = 4t f_l(t) + g_l(t).   (3.24)

Clearly, (3.23) and (3.24) imply

g_{l+1}(t) − g_l(t) = 4t f_l(t) = f_l(t) − g_l(t) − f_{l+1}(t).

Hence we have f_l(t) = g_{l+1}(t) + f_{l+1}(t) for 1 ≤ l ≤ k − 1. By this fact, we obtain, for 2 ≤ l ≤ k − 1,

f_{l+1}(t) = (2 − 4t)f_l(t) − (f_l(t) + g_l(t)) = (2 − 4t)f_l(t) − f_{l−1}(t).

Note that

f_1(t) = 2 − 4t,  f_2(t) = (1 − 4t)f_1(t) − g_1(t) = (2 − 4t)^2 − 2,

and, therefore,

f_l(t) = det(M_l − 4tI_l), 1 ≤ l ≤ k,   (3.25)

where I_l is the l × l identity matrix and M_l is the l × l matrix of the same form as that in Lemma 3.3(a).

Assume that t < (1/2)(1 − cos(π/(2k))) and (A_1, A_2, ..., A_k) satisfies the system of inequalities (3.21). Since t < (1/2)(1 − cos(π/(2l))) for 1 ≤ l ≤ k, Lemma 3.3(a) and (3.25) imply that f_l(t) > 0 for all l = 1, 2, ..., k.

For 1 ≤ i ≤ k − 1, we have, by (3.21),

A_{k−i} − A_{k−i+1} ≥ 4t(A_1 + · · · + A_{k−i}).

We claim that

f_j(t) A_{k−j+1} ≥ g_j(t)(A_1 + · · · + A_{k−j}), for all 1 ≤ j ≤ k.   (3.26)

Clearly (3.26) holds for j = 1. Suppose that it also holds for some i with 1 ≤ i ≤ k − 1. Since f_i(t) > 0, we get

f_i(t) A_{k−i} = f_i(t)(A_{k−i} − A_{k−i+1}) + f_i(t) A_{k−i+1}
≥ (4t f_i(t) + g_i(t))(A_1 + · · · + A_{k−i})
= g_{i+1}(t)(A_1 + · · · + A_{k−i−1}) + (4t f_i(t) + g_i(t)) A_{k−i}.

The above inequality implies that (3.26) also holds for j = i + 1 and hence is true for 1 ≤ j ≤ k. Plugging j = k into (3.26) gives f_k(t) A_1 ≥ 0. Since f_k(t) > 0, we have A_1 ≥ 0. This proves part (a).

The same line of reasoning as above applies for part (b) and the proof goes word for word except the replacement of f1(t) with 1 − 4t.

Proof of Theorem 3.1. By Corollary 2.4, it suffices to show that there is no positive non-constant function u and 0 < β < λ_n/2 satisfying (2.4). We prove this fact by contradiction. Suppose the contrary, that is, (2.4) is satisfied for some β < λ_n/2 = (1/2)(1 − cos(2π/n)) and a positive non-constant function u. Without loss of generality, we assume ‖u‖_2 = 1, that is, Σ_i u(i)^2/n = 1. By Corollary 3.2, we may assume further that u = (x_1, x_3, ..., x_{n−1}, x_n, ..., x_4, x_2), where 0 < x_1 ≤ x_2 ≤ · · · ≤ x_n and x_1 < x_n. In this case, the Euler-Lagrange equation in (2.4) reads

2x_i − (x_i^{(1)} + x_i^{(2)}) = 2βx_i log(n x_i^2), 1 ≤ i ≤ n,

where x_i^{(1)} and x_i^{(2)} are the two nearest neighbors of x_i.

Recall the shifting operator σ defined in (3.16) and σ^j = σ(σ^{j−1}) for j ≥ 2. Note that we may write n = 4k or n = 4k + 2. For j = 1, ..., k, we have

σ^j(u) = (x_{2j}, ..., x_2, x_1, ..., x_{n−2j−1}, x_{n−2j+1}, ..., x_{n−1}, x_n, ..., x_{2j+2})

and

σ^{−j}(u) = (x_{2j+1}, ..., x_{n−1}, x_n, ..., x_{n−2j+2}, x_{n−2j}, ..., x_2, x_1, ..., x_{2j−1}).

By Lemma 3.1 and Lemma 3.2(a), we get

(x_{2j}^2 − x_{2j+2}^2 + x_{n−2j−1}^2 − x_{n−2j+1}^2)
≥ 2β[(x_2^2 + x_4^2 + · · · + x_{2j}^2 + x_1^2 + x_3^2 + · · · + x_{n−2j−1}^2)
− (x_{n−2j+1}^2 + x_{n−2j+3}^2 + · · · + x_{n−1}^2 + x_{2j+2}^2 + x_{2j+4}^2 + · · · + x_n^2)].

Similarly, Lemma 3.1 and Lemma 3.2(b) imply that

(x_{2j−1}^2 − x_{2j+1}^2 + x_{n−2j}^2 − x_{n−2j+2}^2)
≥ 2β[(x_1^2 + x_3^2 + · · · + x_{2j−1}^2 + x_2^2 + x_4^2 + · · · + x_{n−2j}^2)
− (x_{2j+1}^2 + x_{2j+3}^2 + · · · + x_{n−1}^2 + x_{n−2j+2}^2 + x_{n−2j+4}^2 + · · · + x_n^2)].

Note that n − 2j − 1 ≥ 2j + 1 and n − 2j ≥ 2j + 2 for 1 ≤ j ≤ k. Summing up the above two inequalities gives

(x_{2j−1}^2 + x_{2j}^2 − x_{2j+1}^2 − x_{2j+2}^2) + (x_{n−2j−1}^2 + x_{n−2j}^2 − x_{n−2j+1}^2 − x_{n−2j+2}^2)
≥ 4β[(x_1^2 + x_2^2 + · · · + x_{2j}^2) − (x_{n−2j+1}^2 + x_{n−2j+2}^2 + · · · + x_n^2)].

Letting A_i = x_{2i−1}^2 + x_{2i}^2 − x_{n−2i+1}^2 − x_{n−2i+2}^2 for 1 ≤ i ≤ k implies, for n = 4k,

A_j − A_{j+1} ≥ 4β(A_1 + A_2 + · · · + A_j), j = 1, ..., k − 1,
A_k ≥ 2β(A_1 + A_2 + · · · + A_k),

and, for n = 4k + 2,

A_j − A_{j+1} ≥ 4β(A_1 + A_2 + · · · + A_j), j = 1, ..., k − 1,
A_k ≥ 4β(A_1 + A_2 + · · · + A_k).

Note that β < (1/2)(1 − cos(2π/n)) and A_1 = x_1^2 + x_2^2 − x_{n−1}^2 − x_n^2 ≤ x_1^2 − x_n^2 < 0. This contradicts Lemma 3.4.
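Theorem 3.1 asserts that E(f, f)/L(f) ≥ λ_n/2 for every positive non-constant f on an even cycle, the infimum being approached only by near-constant functions. Both points can be probed numerically; the sketch below (illustration only, not part of the proof) checks a random sample and a small perturbation of the constant function along a spectral-gap eigenfunction.

```python
import math
import random

def ratio(x):
    # E(f,f)/L(f) for f = (x_1, ..., x_n) on the n-cycle, Eqs. (3.1)-(3.2)
    n = len(x)
    e = sum((x[i] - x[(i + 1) % n]) ** 2 for i in range(n)) / (2 * n)
    norm2 = sum(t * t for t in x) / n
    l = sum(t * t * math.log(t * t / norm2) for t in x) / n
    return e / l

rng = random.Random(2)
for n in (4, 6, 8):
    lam = 1 - math.cos(2 * math.pi / n)
    # random positive test functions never beat the bound E/L >= lam/2 ...
    for _ in range(2000):
        x = [rng.uniform(0.1, 2.0) for _ in range(n)]
        assert ratio(x) >= lam / 2 - 1e-9
    # ... while small perturbations of the constant function along a
    # spectral-gap eigenfunction approach lam/2, so the bound is sharp
    eps = 1e-4
    x = [1 + eps * math.cos(2 * math.pi * (i + 0.5) / n) for i in range(n)]
    assert abs(ratio(x) - lam / 2) < 1e-3
```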

3.1.3 An application: collapse of cycles and product of sticks

In this section, we discuss some applications of Theorem 3.1. This is a joint work with Laurent Saloff-Coste and Wai-Wai Liu in [5]. We first consider the following two ways of collapsing even cycles.

1. Collapsing the 2n-cycle to the n-stick with loops at the ends. Fix n ≥ 2 and let K1 and K2 be Markov kernels on Zn and Z2n defined by

K1(0, 0) = K1(n − 1, n − 1) = K1(i, i + 1) = K1(i + 1, i) = 1/2, for all i = 0, ..., n − 2, and

K2(i, i + 1) = K2(i, i − 1) = 1/2, ∀i = 0, ..., 2n − 1.

Let p : Z_{2n} → Z_n be the surjective map defined by p(i) = p(2n − 1 − i) = i for i = 0, ..., n − 1. A simple computation (checking the requirement in Proposition 2.5) shows that the Markov kernel K_2 collapses to K_1 via the projection p. See Figure 3.1.

Figure 3.1: The 14-cycle collapses to the 7-stick with loops at the ends. All edges have weight 1/2.

The kernel K_2 has the eigenvalue cos(π/n) (that is, 1 minus its spectral gap) with multiplicity 2, and the two-dimensional eigenspace contains the function f(x) = cos((π/n)(x + 1/2)), which has the property f(x) = f(2n − 1 − x). Hence f projects under p to an eigenfunction of K_1, and K_1 and K_2 have the same spectral gap.
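The collapsing condition of Proposition 2.5 can be verified mechanically for this example: the K_2-probability of jumping into a fibre p^{-1}(s) must depend on the starting point x only through p(x), and must equal K_1(p(x), s). A sketch for the 14-cycle case of Figure 3.1 (the function names are ad hoc):

```python
n = 7  # collapse the 2n = 14 cycle onto the n = 7 stick

def K2(x, y):
    # simple random walk on the 2n-cycle
    return 0.5 if (y - x) % (2 * n) in (1, 2 * n - 1) else 0.0

def K1(a, b):
    # n-stick with holding probability 1/2 at both ends
    if abs(a - b) == 1:
        return 0.5
    if a == b and a in (0, n - 1):
        return 0.5
    return 0.0

def p(i):
    # p(i) = p(2n - 1 - i) = i for i = 0, ..., n - 1
    return min(i, 2 * n - 1 - i)

# lumpability: the mass K2 sends from x into the fibre of s depends on x
# only through p(x), and agrees with K1(p(x), s)
for x in range(2 * n):
    for s in range(n):
        mass = sum(K2(x, y) for y in range(2 * n) if p(y) == s)
        assert abs(mass - K1(p(x), s)) < 1e-12
```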

2. Collapsing the 2n-cycle to the n + 1-stick with reflecting barriers. Fix n ≥ 2 and let K2 be the simple random walk on Z2n. Consider a Markov kernel K1

on Zn+1 given by K1(0, 1) = K1(n, n − 1) = 1 and K1(i, i + 1) = K1(i, i − 1) = 1/2 for all 1 ≤ i ≤ n − 1. Let p : Z2n → Zn+1 be a map defined by p(i) = p(2n − i) = i for 0 ≤ i ≤ n. Then K1 is obtained by collapsing K2 through the projection p.

See Figure 3.2.

Figure 3.2: The 14-cycle collapses to an 8-stick with reflecting barriers. All edges have weight 1/2 except those marked, which have weight 1.


It is easy to check that f(x) = cos(πx/n) is an eigenvector of K_2 with corresponding eigenvalue cos(π/n). The same line of reasoning as in case 1 implies that both K_1 and K_2 have the same spectral gap. Then, by Proposition 2.5 and Theorem 3.1, we have the following proposition.

Proposition 3.3. Let n ≥ 2 and K be a Markov kernel on Z_n defined by K(0, 1) = K(n − 1, n − 2) = 1 and K(i, i + 1) = K(i, i − 1) = 1/2 for all 1 ≤ i ≤ n − 2. Then the spectral gap λ and the logarithmic Sobolev constant α are given by 2α = λ = 1 − cos(π/(n − 1)).

Proof. Note that the case n = 2 is part of the result in Corollary 2.7, and the case n > 2 is given by the discussion preceding this proposition.

3. Product of sticks. In this case, we consider an application of Proposition 3.2. Fix d ≥ 1 and let b = (b1, ..., bd) be an integer vector, where bi ≥ 2 for all

1 ≤ i ≤ d. In Zd with basis {e1, ..., ed}, consider a rectangular box

R_b = {x = (x_1, ..., x_d) ∈ Z^d : x_i ∈ {1, ..., b_i}, 1 ≤ i ≤ d}.   (3.27)

The first application deals with a Markov kernel K on R_b given by (3.28) and (3.29).

Figure 3.3: All edges have weight 1/4 except the corner loops, which have weight 1/2. The stationary measure is uniform.

By Proposition 2.6, one may generalize Proposition 3.2 as follows.

Theorem 3.2. Let d ≥ 1 be an integer and b = (b_1, ..., b_d) be an integer vector with 2 ≤ b_1 ≤ · · · ≤ b_d. Let R_b be the subset of Z^d defined in (3.27) and K be the Markov kernel on R_b given by (3.28) and (3.29). Then the spectral gap λ and the logarithmic Sobolev constant α of K satisfy

2α = λ = (1 − cos(π/b_d))/d.

3.2 The simple random walk on the 5-cycle

Referring to Theorem 3.1 and Corollary 2.7, the logarithmic Sobolev constant for the simple random walk on an even cycle is half the spectral gap, but this is not true for the simple random walk on the 3-cycle. It is not known how the spectral gap and the logarithmic Sobolev constant are related when the simple random walk is considered on an odd cycle. A numerical result for the n-cycles with n = 5, 7 and 9 suggests that the logarithmic Sobolev constant should be half the spectral gap. However, a mathematical proof for general odd n is not yet available.

The goal of this section is to establish the identity α = λ/2 for the case n = 5; the proof presented here was developed jointly by Wai-Wai Liu, Laurent Saloff-Coste and the author of this dissertation.

Theorem 3.3. Let K be the Markov kernel of the simple random walk on the 5-cycle and λ and α be the spectral gap and the logarithmic Sobolev constant of K.

Then 2α = λ = 1 − cos(2π/5).

Remark 3.1. In this section, what will be proved is a result stronger than the above theorem, namely that E(f, f) ≥ (λ/2)L(f) for all functions f, with equality if and only if f is constant.

Before proving this theorem, we consider the following application.

Corollary 3.3. Let K̃ be a Markov kernel on Z_3 given by

K̃(0, 0) = K̃(0, 1) = K̃(1, 2) = K̃(1, 0) = 1/2,  K̃(2, 1) = 1,

and let λ̃ and α̃ be the spectral gap and logarithmic Sobolev constant of K̃. Then 2α̃ = λ̃ = 1 − cos(2π/5).

Proof. Let K be the Markov kernel of the simple random walk on the 5-cycle with spectral gap λ and logarithmic Sobolev constant α. Consider the map p : Z_5 → Z_3 defined by p(i) = p(4 − i) = i for i = 0, 1, 2. It is clear that K collapses to K̃ through p. See Figure 3.4.

Figure 3.4: The 5-cycle collapses to the 3-point stick with a loop at one end. All edges have weight 1/2 except where marked otherwise.

The kernel K has an eigenfunction f satisfying f(i) = f(4 − i) and corresponding to the eigenvalue 1 − λ. It is easy to see that the function f|_{0,1,2} is also an eigenfunction of K̃. Thus λ = λ̃ = 1 − cos(2π/5), and the identity α̃ = λ̃/2 is then proved by Theorem 3.3 and Proposition 2.5.

To prove Theorem 3.3, we need the following two lemmas.

Lemma 3.5. Consider the function g_β(t) = 2t − 4βt log t for t > 0, with g_β(0) = 0. Then, for every β ≥ 0 and 0 ≤ s ≤ t,

g_β(t) − g_β(s) ≥ (t − s)(2 − 4β − 4β log((t + s)/2)).

Proof. Fix t > 0, β ≥ 0 and let h be the function on [0, t] defined by

h(s) = g_β(t) − g_β(s) − (t − s)(2 − 4β − 4β log((t + s)/2)).

Then the first derivative of h is given by

h′(s) = 4β[log(2s/(t + s)) + 1 − 2s/(t + s)],

which is negative for 0 < s < t (using log r < r − 1 for r = 2s/(t + s) < 1), so that h is strictly decreasing in [0, t]; this proves the lemma since h(t) = 0.

Lemma 3.6. For β > 0, let g_β be the function defined in Lemma 3.5, let D_β be the region defined in (3.30), and for (s, t) ∈ D_β set

F_β(s, t) = g_β(g_β(t) − s) − g_β(g_β(s) − t) − (t − s).

Then, for 0 < β ≤ (1/2)(1 − cos(2π/5)) and (s, t) ∈ D_β with s < t, one has F_β(s, t) > 0.

Remark 3.2. Note that F_β(s, t) is well-defined on D_β since one has, by Lemma 3.5,

(g_β(t) − s) − (g_β(s) − t) = g_β(t) − g_β(s) + (t − s) ≥ (t − s)(3 − 4β − 4βf_1(s, t)),

where f_1(s, t) = log((t + s)/2). This implies, by using Lemma 3.5 twice,

F_β(s, t) > [g_β(t) − g_β(s) + t − s][2 − 4β − 4βf_2(s, t)] − (t − s)
> (t − s){[3 − 4β − 4βf_1(s, t)][2 − 4β − 4βf_2(s, t)] − 1},   (3.31)

where

f_2(s, t) = log((g_β(t) + g_β(s) − (t + s))/2) = log(((t + s) − 4β(t log t + s log s))/2).

Note that the second inequality in (3.31) uses the convexity of the map u ↦ u log u for u > 0 to get

f_2(s, t) < log((t + s)/2) + log(1 − 4β log((t + s)/2)),   (3.32)

and then applies the fact that r − 4βr log r < e^{(1−2β)/(2β)} for all r > 0 and 0 < β ≤ (1/2)(1 − cos(2π/5)). A simple computation shows that

(2 − 4β)(3 − 4β) − 1 = 16β^2 − 20β + 5 ≥ 0

for 0 ≤ β ≤ (1/2)(1 − cos(2π/5)). To finish the proof, it suffices to show that

(2 − 4β)f_1(s, t) + [3 − 4β − 4βf_1(s, t)] f_2(s, t) ≤ 0.

Since (t + s)/2 < 1, it remains to prove, by using (3.32), that

h(x) = (2 − 4β)x + (3 − 4β − 4βx)[x + log(1 − 4βx)] < 0, for all x < 0.

Taking the first derivative of h, we get

h′(x) = 5 − 12β + 4β(4β − 2)/(1 − 4βx) − 4β[2x + log(1 − 4βx)]
> 5 − 12β + 4β(4β − 2) = 16β^2 − 20β + 5 ≥ 0,

where the first inequality follows from the facts that the mapping x ↦ 4β(4β − 2)/(1 − 4βx) is decreasing for x < 0 and that

2x + log(1 − 4βx) ≤ 2x(1 − 2β) < 0 for all x < 0.

Therefore, h is strictly increasing. In addition to the fact h(0) = 0, we get h(x) < 0 for x < 0.

Proof of Theorem 3.3. By Proposition 2.2, one always has 0 < α ≤ λ/2 with λ = 1 − cos(2π/5). We prove this theorem by showing that there is no nonconstant solution u of the Euler-Lagrange equation

2αu log(u/‖u‖_2) = (I − K)u.

Assume the contrary, that is, the above equation is satisfied by a nonconstant u whose entries are 0 < x_0 ≤ x_1 ≤ x_2 ≤ x_3 ≤ x_4. There is no loss of generality in assuming that ‖u‖_2 = 1, or equivalently, x_0^2 + x_1^2 + · · · + x_4^2 = 5. By Corollary 3.2, we may assume further that u = (x_0, x_2, x_4, x_3, x_1). In the above setting, the minimizing equation becomes

x_1 + x_2 = g_α(x_0), x_0 + x_3 = g_α(x_1), x_0 + x_4 = g_α(x_2), x_1 + x_4 = g_α(x_3), x_2 + x_3 = g_α(x_4),   (3.33)

where g_α(x) = 2x − 4αx log x.

Note that the assumption that u is nonconstant gives x_0 < x_4, and the normalization of u implies x_0 < 1. Since g_α is a concave function with derivative g_α′(1) = 2 − 4α > 0, we have g_α(x) ∈ (0, 2) for x ∈ (0, 1). On one hand, by this observation, the equality x_1 + x_2 = g_α(x_0) implies x_1 < 1, and then the identity x_0 + x_3 = g_α(x_1) implies x_0 + x_3 ≤ 2. On the other hand, by (3.33), one can obtain the following equation:

F_α(x_0, x_2) = g_α(g_α(x_2) − x_0) − g_α(g_α(x_0) − x_2) − (x_2 − x_0) = 0.

Since u is a solution of (3.33), it is clear that (x_0, x_2) ∈ D_α, the region defined in (3.30). Thus, by Lemma 3.6, we have x_0 = x_1 = x_2 and, by the first equality of (3.33), we get x_1 = 1. This contradicts x_0 < 1.

Since there is no nonconstant solution of the equation (2.3) with 0 < α ≤ λ/2, Proposition 2.3 implies that 2α = λ.
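The stronger statement of Remark 3.1 can be probed numerically for n = 5: a random search over positive functions on the 5-cycle finds no ratio E(f, f)/L(f) below λ/2. A sketch (illustration only, not a proof):

```python
import math
import random

def ratio5(x):
    # E(f,f)/L(f) on the 5-cycle, using Eqs. (3.1)-(3.2) with n = 5
    e = sum((x[i] - x[(i + 1) % 5]) ** 2 for i in range(5)) / 10
    norm2 = sum(t * t for t in x) / 5
    l = sum(t * t * math.log(t * t / norm2) for t in x) / 5
    return e / l

lam = 1 - math.cos(2 * math.pi / 5)
rng = random.Random(4)
for _ in range(20000):
    x = [rng.uniform(0.05, 3.0) for _ in range(5)]
    assert ratio5(x) >= lam / 2 - 1e-9
```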

3.3 Some other 3-points chains

By collapsing the 4-, 5- and 6-cycles, we have obtained in Sections 3.1.3 and 3.2 the equality α = λ/2 for the three chains on the 3-point stick described in Figure 3.5.

Figure 3.5: Three chains on the 3-point stick. All edges have weight 1/2 except when marked otherwise. In all cases α = λ/2.

The first two theorems in this section concern the variants (depending on a parameter p ∈ [0, 1)) described in Figure 3.6.

Figure 3.6: The families of Theorems 3.4 and 3.5, p ∈ [0, 1).

{1, 2, 3} defined by

Remark 3.3. Both Kp in Theorem 3.4 and Kp0 in Theorem 3.5 are reversible with respect to their stationary distributions.

To prove the above two theorems, we need the following elementary lemma.

Lemma 3.7. Consider the continuous function u : [0, ∞) → R defined by

u(s) = s log s for s > 0,  u(0) = 0.   (3.34)

The function u has the following properties:

∀ t ∈ [0, ∞), u(t) ≥ t − 1.   (3.35)

∀ s, t ∈ [0, ∞) with s ≤ t and s + t ≤ 2, u(t) − u(s) ≤ t − s.   (3.36)

∀ s, t ∈ [1, ∞) with s ≤ t, u(t) − u(s) ≥ t − s.   (3.37)

Proof. The function s 7→ s log s − s + 1 has derivative s 7→ log s on (0, ∞). Hence it attains its minimum at s = 1. As the value at s = 1 is 0, (3.35) follows.

To prove (3.36), fix s ≥ 0 and set, for t ≥ s,

g(t) = u(t) − u(s) − (t − s)u′((t + s)/2)
= t log t − s log s − (t − s)(1 + log((t + s)/2)).

Compute the derivatives

g′(t) = log(2t/(t + s)) − (t − s)/(t + s),  g″(t) = s(s − t)/(t(t + s)^2).

It follows that g is non-increasing on [s, ∞). Hence g(t) ≤ g(s) = 0 on [s, ∞), that is,

u(t) − u(s) ≤ (t − s)(1 + log((t + s)/2)).

The inequality (3.36) obviously follows when s + t ≤ 2.

Finally, (3.37) follows from the Mean Value Theorem applied to the function u since u0 ≥ 1 on [1, ∞).

Proof of Theorem 3.4. First observe that an easy computation gives λ_p = 1 − p. By Corollary 2.4, it suffices to show that for β < λ_p/2, the system (2.4) has no non-constant positive solution u = (a, b, c). Suppose the contrary. By symmetry, we can assume that a ≥ c. There is no loss of generality in assuming further the normalization

a^2 + (2 − 2p)b^2 + c^2 = 4 − 2p.   (3.38)

Then (2.4) is equivalent to (using the function u defined at (3.34))

(2β/(1 − p)) u(a) = a − b   (3.39)
4β u(b) = 2b − a − c   (3.40)
(2β/(1 − p)) u(c) = c − b.   (3.41)

We prove by considering two subcases, a > c and a = c.

Case 1: a > c. Subtract (3.41) from (3.39) to obtain

u(a) − u(c) = ((1 − p)/(2β))(a − c) > a − c,   (3.42)

since, by hypothesis, 2β < 1 − p; by (3.36), this forces a + c > 2, and then (3.38) gives b < 1.   (3.43)

Now, add (3.39) divided by a to (3.41) divided by c and subtract (3.40) divided by b to obtain (3.44), with k = 2β/(1 − p), which, by hypothesis, is less than 1. Hence h is increasing. The left-hand side of (3.44) is negative since b < 1 by (3.43). Hence h(a/b) − h(b/c) < 0 and thus a/b < b/c or, equivalently,

ac < b^2 < 1.

By (3.38) and (3.42), we have

4 − 2p = a^2 + 2(1 − p)b^2 + c^2 > a^2 + 2(1 − p)ac + c^2
= (a + c)^2 − 2pac > 4 − 2pac > 4 − 2p,

a contradiction; here (a + c)^2 > 4 follows from (3.42) and (3.36), and the last inequality uses ac < 1.

Case 2: a = c. In this case one deduces that b = 1, which contradicts the assumption that u is nonconstant.

Hence, we must have α_p = λ_p/2 = (1 − p)/2.

Proof of Theorem 3.5. Referring to the family of chains in Theorem 3.5, the facts that α_p = λ_p/2 when p = 0 and p = 1/2 are contained respectively in Theorem 3.4 and in Corollary 3.3. To prove α_p < λ_p/2 when p ≠ 0, 1/2, we use the criterion contained in Proposition 2.2. A simple computation yields

λ_p = (3 − p − √(1 + p^2(−1 + 6p − 4p^2)))/2.

By both observations, it is easy to see that

3 − 3p + 6p^2 − 4p^3 + √(1 + p^2(−1 + 6p − 4p^2)) > 0, for all p ∈ (0, 1),

which implies µ_p(ψ^3) ≠ 0 unless p = 0 or p = 1/2. By Proposition 2.2, we must have α_p < λ_p/2 for p ≠ 0, 1/2.

We end this section with the study of one of the most natural chains on a 3-point stick, where transitions are to the left with probability q = 1 − p and to the right with probability p.

Theorem 3.6. For 0 < p < 1, set q = 1 − p. Let K_p : {1, 2, 3} × {1, 2, 3} → [0, 1] be the Markov kernel defined by

K_p =
( q p 0 )
( q 0 p )
( 0 q p )

with stationary distribution

µ_p = (c_p, c_p(p/q), c_p(p/q)^2),  c_p = (1 + (p/q) + (p/q)^2)^{−1}.

Then the spectral gap λ_p and the logarithmic Sobolev constant α_p are given by

λ_p = 1 − √(pq),  α_p = (p − q)/(2(log p − log q)).

In particular, a minimizer of α_p is ψ = (p/q, 1, q/p).
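The statements of Theorem 3.6 can be verified numerically for particular values of p: stationarity of µ_p; the eigenvalues 1 and ±√(pq) (via the trace and determinant of K_p, since K_p is row-stochastic and thus has 1 as an eigenvalue); and the value of the ratio E_p(ψ, ψ)/L_{µ_p}(ψ) at the test function ψ. A sketch (illustration only):

```python
import math

def check(p):
    q = 1 - p
    K = [[q, p, 0.0],
         [q, 0.0, p],
         [0.0, q, p]]
    c = 1 / (1 + p / q + (p / q) ** 2)
    mu = [c, c * (p / q), c * (p / q) ** 2]
    # mu is stationary for K
    for j in range(3):
        assert abs(sum(mu[i] * K[i][j] for i in range(3)) - mu[j]) < 1e-12
    # 1 is an eigenvalue; trace 1 and det -pq then force the other two
    # eigenvalues to be +/- sqrt(pq), giving the gap 1 - sqrt(pq)
    tr = K[0][0] + K[1][1] + K[2][2]
    det = (K[0][0] * (K[1][1] * K[2][2] - K[1][2] * K[2][1])
           - K[0][1] * (K[1][0] * K[2][2] - K[1][2] * K[2][0])
           + K[0][2] * (K[1][0] * K[2][1] - K[1][1] * K[2][0]))
    assert abs(tr - 1.0) < 1e-12 and abs(det + p * q) < 1e-12
    # the test function psi = (p/q, 1, q/p) attains (p-q)/(2 log(p/q))
    psi = [p / q, 1.0, q / p]
    E = 0.5 * sum(mu[i] * K[i][j] * (psi[i] - psi[j]) ** 2
                  for i in range(3) for j in range(3))
    n2 = sum(mu[i] * psi[i] ** 2 for i in range(3))
    L = sum(mu[i] * psi[i] ** 2 * math.log(psi[i] ** 2 / n2) for i in range(3))
    alpha = (p - q) / (2 * math.log(p / q))
    assert abs(E / L - alpha) < 1e-9
    assert alpha <= (1 - math.sqrt(p * q)) / 2 + 1e-12  # alpha_p <= lambda_p/2

for p in (0.2, 0.3, 0.45, 0.7):
    check(p)
```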

Remark 3.4. Let p ∈ (0, 1) and let α_p and λ_p be as in Theorem 3.6. Recall from Theorem 2.3 that α_p ≤ 1/4 for p ∈ (0, 1), with equality only if p = 1/2. A simple computation shows that λ_p ≥ 1/2, with equality only if p = 1/2. Combining both bounds, we have α_p ≤ λ_p/2, with equality only if p = 1/2.

Proof of Theorem 3.6. Since K_p is reversible, the spectral gap is obtained by a direct computation of the eigenvalues of K_p. For the logarithmic Sobolev constant, we compare this chain with another 3-point chain K̃_p. By Proposition 2.4, it follows that

α_p ≥ α̃_p.   (3.46)

Next, on {0, 1}2, we consider the product chain (with weights (1/2, 1/2)) of two copies of 2-point asymmetric chain in Theorem 2.3. This product chain has transitions given by

K((0, 0), (0, 0)) = q, K((1, 1), (1, 1)) = p, K((0, 0), (0, 1)) = K((0, 0), (1, 0)) = p/2, K((1, 0), (1, 1)) = K((0, 1), (1, 1)) = p/2,

and

K((1, 1), (0, 1)) = K((1, 1), (1, 0)) = q/2, K((0, 1), (0, 0)) = K((1, 0), (0, 0)) = q/2, K((0, 1), (0, 1)) = K((1, 0), (1, 0)) = 1/2.

By Proposition 2.6 and Theorem 2.3, its logarithmic Sobolev constant is (p − q)/(2 log(p/q)). This chain projects to the 3-point space {1, 2, 3} using the map

p : {0, 1}2 → {1, 2, 3}, (x, y) 7→ 1 + |x| + |y|

and the projected chain is K̃_p. Hence, by Proposition 2.5 and (3.46), we get

α_p ≥ α̃_p ≥ (p − q)/(2(log p − log q)).   (3.47)

To show that this is in fact an equality, it suffices to find a good test function. Letting ψ = (p/q, 1, q/p) yields

α_p ≤ E_p(ψ, ψ)/L_{µ_p}(ψ) = (p − q)/(2(log p − log q)).

Thus α_p = (p − q)/(2(log p − log q)).

Remark 3.5. Fix p ∈ (0, 1) and let K and K_p be the Markov kernels in the proof of Theorem 3.6. As the proof shows, K collapses to K_p and the logarithmic Sobolev constant of K_p is the same as that of K. However, the spectral gap of K_p, which is equal to 1 − √(pq), is not the same as the spectral gap of K, which is equal to 1/2. The main reason is that the eigenfunction of K corresponding to the eigenvalue 1/2 takes different values at (0, 1) and (1, 0) if p ≠ 1/2. This makes the projection p fail to collapse the eigenfunction onto the three-point space {1, 2, 3}.

The following corollary is an observation based on the inequality (3.47) obtained in the proof of Theorem 3.6.

Corollary 3.4. Let p ∈ (0, 1) and set q = 1 − p. Consider the Markov kernels K_p and K̃_p from the proof of Theorem 3.6, and let α_p and α̃_p be their logarithmic Sobolev constants. Then

α_p = α̃_p = (p − q)/(2 log(p/q)).

In particular, ψ = (p/q, 1, q/p) is a minimizer for both constants.

Proof. By (3.47) and Theorem 3.6, it remains to show that ψ is a minimizer of α̃_p. By (2.6), the fact (3.45) derived in the proof of Theorem 3.6 implies

Ẽ_p(ψ, ψ) ≤ (c̃_p/c_p) E_p(ψ, ψ),  (c̃_p/c_p) L_{µ_p}(ψ) ≤ L_{µ̃_p}(ψ).

Since ψ is not constant, taking the ratio of the Dirichlet form to the entropy implies

α_p = α̃_p ≤ Ẽ_p(ψ, ψ)/L_{µ̃_p}(ψ) ≤ E_p(ψ, ψ)/L_{µ_p}(ψ) = α_p,

so equality holds throughout and ψ is a minimizer.

Remark 3.6. Both K_p and K̃_p in Corollary 3.4 are reversible, and the spectral gap λ̃_p of K̃_p is equal to 1/2. Let α̃_p be the logarithmic Sobolev constant of K̃_p. By Corollary 3.4 and Theorem 2.3, α̃_p ≤ λ̃_p/2, with equality only if p = 1/2.

Appendix A
