
Logarithmic Sobolev constants for some finite Markov chains

From the computation of the logarithmic Sobolev constants in Section 2.4, one can see that different models need different tricks. In this chapter, we concentrate on the calculation of the logarithmic Sobolev constant for the simple random walks on the n-cycle. In Section 3.1, we focus on the even cycles and explicitly determine their logarithmic Sobolev constants. Thereafter, an application for collapsing a cycle is introduced. In Section 3.2, we implement another trick to determine the logarithmic Sobolev constant of the 5-cycle.

3.1 The simple random walk on an even cycle

For n ≥ 2, consider the simple random walk on the n-cycle Z_n = {1, 2, ..., n}. Clearly, the corresponding Markov kernel K_n is given by K_n(x, x ± 1) = 1/2, and the uniform distribution on Z_n is its unique stationary distribution. (For n = 2, we consider the case K(1, 2) = K(2, 1) = 1 and K(1, 1) = K(2, 2) = 0. By Corollary 2.7, α = λ/2 = 1.) Throughout this section, we assume that n ≥ 3.

Let λ_n and α_n be the spectral gap and the logarithmic Sobolev constant of K_n. It has been shown in Example 1.1 that λ_n = 1 − cos(2π/n), and in Corollary 2.8 and Theorem 2.5 that

α_3 = 1/(2 log 2) < λ_3/2 = 3/4,  α_4 = λ_4/2 = 1/2.
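These formulas are easy to confirm numerically. The sketch below (a check, not part of any argument in this chapter; the helper names are ad hoc) estimates the spectral gap of the n-cycle walk with plain power iteration on (I + K)/2 restricted to mean-zero functions, where all eigenvalues lie in [0, 1].

```python
import math
import random

def cycle_kernel(n):
    # Simple random walk on the n-cycle: K(x, x +/- 1) = 1/2.
    K = [[0.0] * n for _ in range(n)]
    for x in range(n):
        K[x][(x + 1) % n] += 0.5
        K[x][(x - 1) % n] += 0.5
    return K

def spectral_gap(n, iters=3000):
    # Power iteration on (I + K)/2 restricted to mean-zero functions.
    # Its largest eigenvalue there is (1 + cos(2*pi/n))/2 = mu, and the
    # spectral gap of I - K is recovered as 2*(1 - mu).
    K = cycle_kernel(n)
    rng = random.Random(1)
    v = [rng.random() for _ in range(n)]

    def step(v):
        return [0.5 * v[x] + 0.5 * sum(K[x][y] * v[y] for y in range(n))
                for x in range(n)]

    for _ in range(iters):
        m = sum(v) / n
        v = [t - m for t in v]                 # project off constants
        v = step(v)
        norm = math.sqrt(sum(t * t for t in v))
        v = [t / norm for t in v]
    mu = sum(a * b for a, b in zip(v, step(v)))  # Rayleigh quotient
    return 2 * (1 - mu)

for n in (3, 4, 6, 12):
    assert abs(spectral_gap(n) - (1 - math.cos(2 * math.pi / n))) < 1e-8
```

The small-cycle constants above can be checked the same way; note in particular that 1/(2 log 2) ≈ 0.721 is indeed strictly below λ_3/2 = 3/4.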

3.1.1 The main result

The following is the main result of this section. This is a joint work with Yuan-Chung Sheu and has been polished in [6].

Theorem 3.1. For n > 2, let K_n be the Markov kernel of the simple random walk on the n-cycle. Assume that n is even. Then the spectral gap λ_n = λ(K_n) and the logarithmic Sobolev constant α_n = α(K_n) satisfy α_n = λ_n/2 = (1/2)(1 − cos(2π/n)).

The following is a simple application of the above theorem.

Corollary 3.1. For n ≥ 3, let K_n be a Markov kernel on Z_n defined by K_n(i, i − 1) = p, K_n(i, i) = r and K_n(i, i + 1) = q for i ∈ Z_n, where p + q + r = 1. Then the spectral gap λ_n and the logarithmic Sobolev constant α_n satisfy

α_n = λ_n/2 = ((1 − r)/2)(1 − cos(2π/n)).

Proof. Let K̃_n be the Markov kernel of the simple random walk on Z_n, and let E and Ẽ be the Dirichlet forms of K_n and K̃_n. Obviously, both K_n and K̃_n have the same stationary distribution, the uniform distribution on Z_n. By Lemma 1.1, one has E(f, f) = (1 − r)Ẽ(f, f) for any function f on Z_n and then, by definition, λ_n = (1 − r)λ̃_n and α_n = (1 − r)α̃_n.

We will prove Theorem 3.1 in the next subsection. Here, we first consider the ratio E(f, f)/L(f) and, by studying the Dirichlet form, restrict the minimizer (if any) for the logarithmic Sobolev constant to a specific class of functions. For any function f = (f(1), ..., f(n)) = (x_1, ..., x_n), we have

L(f) = (1/n) Σ_{i=1}^n x_i^2 log(x_i^2/‖f‖_2^2)   (3.1)

and

E(f, f) = (1/(2n))(|x_1 − x_2|^2 + |x_2 − x_3|^2 + · · · + |x_{n−1} − x_n|^2 + |x_n − x_1|^2).   (3.2)

It is obvious that the uniformity of the stationary distribution πn of Kn implies the invariance of L(f ) under the permutation of the components of f . We now investigate the extreme value of E over all permutations on the components of f .

Consider the function

F(x) = |x_1 − x_2|^2 + |x_2 − x_3|^2 + · · · + |x_{n−1} − x_n|^2 + |x_n − x_1|^2,   (3.3)

where x = (x_1, x_2, ..., x_n) ∈ R^n. To every x = (x_1, x_2, ..., x_n) with 0 ≤ x_1 ≤ x_2 ≤ · · · ≤ x_n, there corresponds an element x̃ ∈ R^n given by the formula

x̃ = (x_1, x_3, x_5, ..., x_{2k+1}, x_{2k}, ..., x_4, x_2) if n = 2k + 1,
x̃ = (x_1, x_3, x_5, ..., x_{2k−1}, x_{2k}, ..., x_4, x_2) if n = 2k.   (3.4)

Denote by S_n the set of all permutations of {1, 2, ..., n} and write θx = (x_{θ(1)}, x_{θ(2)}, ..., x_{θ(n)}) for θ ∈ S_n and x ∈ R^n.

Proposition 3.1. For every x = (x_1, x_2, ..., x_n) with 0 ≤ x_1 ≤ x_2 ≤ · · · ≤ x_n, we have F(θx) ≥ F(x̃) for all θ ∈ S_n.
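Proposition 3.1 can be spot-checked by brute force for small n: among all arrangements of a fixed multiset around the cycle, the "zigzag" order x̃ of (3.4) minimizes F. The sketch below (illustration only; the function names are ad hoc) does this for random inputs.

```python
import itertools
import random

def F(x):
    # F(x) = sum of squared differences around the cycle, Eq. (3.3)
    n = len(x)
    return sum((x[i] - x[(i + 1) % n]) ** 2 for i in range(n))

def zigzag(x):
    # The arrangement x~ of Eq. (3.4): odd-ranked entries ascending,
    # followed by even-ranked entries descending.
    return list(x[0::2]) + list(x[1::2])[::-1]

rng = random.Random(0)
for n in range(3, 8):
    x = sorted(rng.random() for _ in range(n))
    best = min(F(p) for p in itertools.permutations(x))
    assert abs(best - F(zigzag(x))) < 1e-12
```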

Proof. We prove this by induction on n. There is nothing to prove in the case n = 2. Assume that it is also true for n = k. We consider the case n = k + 1 and fix x = (x1, x2, ..., xk+1) where 0 ≤ x1 ≤ x2 ≤ · · · ≤ xk+1.

Step 1. Set y = (x_1, x_2, ..., x_k) and consider the corresponding vector ỹ given by (3.4). For every i = 1, 2, ..., k − 2, set

ỹ_{i,i+2} = (x_1, x_3, ..., x_i, x_{k+1}, x_{i+2}, ..., x_4, x_2) if i is odd,
ỹ_{i,i+2} = (x_1, x_3, ..., x_{i+2}, x_{k+1}, x_i, ..., x_4, x_2) if i is even.   (3.5)

Thus ỹ_{i,i+2} is obtained by inserting x_{k+1} into ỹ between x_i and x_{i+2}. We also set

ỹ_{1,2} = (x_1, x_3, ..., x_4, x_2, x_{k+1})

and

ỹ_{k−1,k} = (x_1, x_3, ..., x_k, x_{k+1}, x_{k−1}, ..., x_4, x_2) if k is odd,
ỹ_{k−1,k} = (x_1, x_3, ..., x_{k−1}, x_{k+1}, x_k, ..., x_4, x_2) if k is even.   (3.6)

We claim that

F(ỹ_{1,2}) ≥ F(ỹ_{k−1,k})   (3.7)

and

F(ỹ_{i,i+2}) ≥ F(ỹ_{k−1,k}) for all i = 1, 2, ..., k − 2.   (3.8)

Note that for 1 ≤ i ≤ k − 2, a simple computation shows

F(ỹ_{i,i+2}) = F(ỹ) + (x_i − x_{k+1})^2 + (x_{k+1} − x_{i+2})^2 − (x_i − x_{i+2})^2.   (3.9)

Therefore, for 1 ≤ i ≤ k − 4, we get

F(ỹ_{i,i+2}) − F(ỹ_{i+2,i+4}) = [(x_i − x_{k+1})^2 + (x_{k+1} − x_{i+2})^2 − (x_i − x_{i+2})^2]
− [(x_{i+2} − x_{k+1})^2 + (x_{k+1} − x_{i+4})^2 − (x_{i+2} − x_{i+4})^2]
= 2(x_{k+1} − x_{i+2})(x_{i+4} − x_i) ≥ 0.   (3.10)

Besides, we also have

F(ỹ_{k−2,k}) − F(ỹ_{k−1,k}) = [(x_{k+1} − x_{k−2})^2 + (x_{k+1} − x_k)^2 − (x_{k−2} − x_k)^2]
− [(x_{k+1} − x_{k−1})^2 + (x_{k+1} − x_k)^2 − (x_k − x_{k−1})^2]
= 2(x_{k+1} − x_k)(x_{k−1} − x_{k−2}) ≥ 0   (3.11)

and

F(ỹ_{k−3,k−1}) − F(ỹ_{k−1,k}) = 2(x_{k+1} − x_{k−1})(x_k − x_{k−3}) ≥ 0.   (3.12)

Combining (3.10), (3.11) and (3.12) gives (3.8). To prove (3.7), it suffices to show that F(ỹ_{1,2}) ≥ F(ỹ_{1,3}), and this follows easily from the following computation:

F(ỹ_{1,2}) − F(ỹ_{1,3}) = [(x_1 − x_{k+1})^2 + (x_{k+1} − x_2)^2 − (x_1 − x_2)^2]
− [(x_1 − x_{k+1})^2 + (x_{k+1} − x_3)^2 − (x_1 − x_3)^2]
= 2(x_{k+1} − x_1)(x_3 − x_2) ≥ 0.

Step 2. We prove that for every θ ∈ S_{k+1},

F(θx) ≥ F(ỹ_{k−1,k}) = F(x̃).   (3.13)

Fix θ ∈ S_{k+1} and set c = θx. It loses no generality to write c = (..., x_i, x_{k+1}, x_j, ...) for some i < j, and let z = (..., x_i, x_j, ...) ∈ R^k be obtained by removing the component x_{k+1} from the vector c. Then, for 1 ≤ j ≤ k − 2, we have

F(c) − F(ỹ_{j,j+2})
= [F(z) + (x_i − x_{k+1})^2 + (x_j − x_{k+1})^2 − (x_i − x_j)^2]
− [F(ỹ) + (x_j − x_{k+1})^2 + (x_{k+1} − x_{j+2})^2 − (x_j − x_{j+2})^2]
= F(z) − F(ỹ) + 2(x_{k+1} − x_j)(x_{j+2} − x_i) ≥ 0.   (3.14)

(The last inequality applies the inductive assumption that F(z) ≥ F(ỹ).) For j = k − 1, we have

F(c) − F(ỹ_{k−1,k})
= [F(z) + (x_i − x_{k+1})^2 + (x_{k−1} − x_{k+1})^2 − (x_i − x_{k−1})^2]
− [F(ỹ) + (x_k − x_{k+1})^2 + (x_{k+1} − x_{k−1})^2 − (x_k − x_{k−1})^2]
= F(z) − F(ỹ) + 2(x_k − x_i)(x_{k+1} − x_{k−1}) ≥ 0.

For j = k, we have

F(c) − F(ỹ_{k−1,k})
= [F(z) + (x_k − x_{k+1})^2 + (x_i − x_{k+1})^2 − (x_i − x_k)^2]
− [F(ỹ) + (x_k − x_{k+1})^2 + (x_{k+1} − x_{k−1})^2 − (x_k − x_{k−1})^2]
= F(z) − F(ỹ) + 2(x_{k−1} − x_i)(x_{k+1} − x_k) ≥ 0.   (3.15)

Therefore (3.13) follows from (3.14)-(3.15) and (3.8).

The following is a consequence of Proposition 3.1 and is critical in computing the logarithmic Sobolev constant of the simple random walk on the n-cycle.

Corollary 3.2. For n ≥ 3, let α_n be the logarithmic Sobolev constant of the simple random walk on the n-cycle. Assume that there exists a positive non-constant function f such that α_n = E(f, f)/L(f). Let 0 ≤ x_1 ≤ x_2 ≤ · · · ≤ x_n be the components of f and f̃ = (x_1, x_3, ..., x_4, x_2). Then the Euler-Lagrange equation (2.3) is satisfied with α = α_n and u = f̃. Furthermore, α_n = E(f̃, f̃)/L(f̃).

3.1.2 Proof of Theorem 3.1

In this subsection, we prove Theorem 3.1. Throughout, n is even and n ≥ 4. The strategy is first to verify, by contradiction, that there is no positive non-constant function u and α < λ_n/2 satisfying the Euler-Lagrange equation (2.3). Our main result then follows from Corollary 2.4. Before starting the proof, we derive a series of lemmas using combinatorial arguments.

Define the shifting operator σ by

σ(x_1, x_2, ..., x_n) = (x_n, x_1, x_2, ..., x_{n−1}),   (3.16)

where x = (x_1, x_2, ..., x_n) ∈ R^n. Set σ^j(x) = σ(σ^{j−1}(x)) for j ≥ 2 and write σ^{−j} for the inverse of σ^j.

Lemma 3.1. Consider a vector of the form

u = (x_1, x_3, ..., x_{2k−1}, x_{2k}, ..., x_4, x_2),

where x_1 ≤ x_2 ≤ ... ≤ x_{2k}, and write σ^j(u) = ((σ^j(u))_1, (σ^j(u))_2, ..., (σ^j(u))_{2k}). Then for every 1 ≤ j ≤ k − 1, we have

(σ^j(u))_i ≤ (σ^j(u))_{2k−i+1}, for i = 1, ..., k,   (3.17)

and

(σ^{−j}(u))_i ≥ (σ^{−j}(u))_{2k−i+1}, for i = 1, ..., k.   (3.18)

Proof. Assume 1 ≤ j ≤ k − 1. Then we have

(σ^j(u))_i = x_{2(j−i+1)} if 1 ≤ i ≤ j;
(σ^j(u))_i = x_{2(i−j)−1} if j + 1 ≤ i ≤ j + k;
(σ^j(u))_i = x_{2k−2[i−(j+k+1)]} if j + k + 1 ≤ i ≤ 2k.

Case 1: 1 ≤ i ≤ j ∧ (k − j). Since i ≤ k − j, we get 2k − i + 1 ≥ k + j + 1 and (σ^j(u))_{2k−i+1} = x_{2(i+j)}, which implies

(σ^j(u))_i = x_{2(j−i+1)} ≤ x_{2(i+j)} = (σ^j(u))_{2k−i+1}.

Case 2: j ∨ (k − j) < i ≤ k. Note that (k − j) < i ≤ k implies k + 1 ≤ 2k − i + 1 ≤ k + j. Hence, we have

(σ^j(u))_i = x_{2(i−j)−1},  (σ^j(u))_{2k−i+1} = x_{2(2k−i−j)+1}.

Since 2(2k − i − j) + 1 ≥ 2(i − j) − 1, we get (σ^j(u))_i ≤ (σ^j(u))_{2k−i+1}.

Case 3: j ∧ (k − j) < i ≤ j ∨ (k − j). It is obvious that only the situation j ≠ k − j needs to be considered. On one hand, if j < k − j, then j < i ≤ k − j and 2k − i + 1 ≥ j − k + 2k + 1 = k + j + 1. This implies

(σ^j(u))_i = x_{2(i−j)−1} ≤ x_{2(i+j)} = (σ^j(u))_{2k−i+1}.

On the other hand, if k − j < j, then k − j < i ≤ j. By this fact, we have

(σ^j(u))_i = x_{2(j−i+1)} ≤ x_{2(2k−i−j)+1} = (σ^j(u))_{2k−i+1}.

Combining all above proves (3.17). The proof of (3.18) can be done by similar arguments.

Lemma 3.2. Let u = (u_1, u_2, ..., u_{2k−1}, u_{2k}) be a vector with u_i > 0 for all 1 ≤ i ≤ 2k. Assume that there exist two positive constants, c and d, such that

2u_i − (u_{i−1} + u_{i+1}) = c u_i log(d u_i^2)   (3.19)

for all i = 1, ..., 2k. (Here we write u_0 = u_{2k} and u_{2k+1} = u_1.) Then:

(a) If u_i ≤ u_{2k−i+1} for all 1 ≤ i ≤ k, then we have

u_1^2 − u_{2k}^2 + u_k^2 − u_{k+1}^2 ≥ c[(u_1^2 + · · · + u_k^2) − (u_{k+1}^2 + · · · + u_{2k}^2)].

(b) If u_i ≥ u_{2k−i+1} for all 1 ≤ i ≤ k, then we have

u_{2k}^2 − u_1^2 + u_{k+1}^2 − u_k^2 ≥ c[(u_{k+1}^2 + · · · + u_{2k}^2) − (u_1^2 + · · · + u_k^2)].

Proof. For (a), assume that u_i ≤ u_{2k−i+1} for all 1 ≤ i ≤ k. For every 1 ≤ i ≤ k, we rewrite (3.19) as

2 − (u_{i−1} + u_{i+1})/u_i = c log(d u_i^2).

Multiplying (3.19) at index i by u_{2k−i+1}, multiplying it at index 2k − i + 1 by u_i, and subtracting, a simple computation implies

(u_i u_{2k−i+2} − u_{i−1} u_{2k−i+1}) + (u_i u_{2k−i} − u_{i+1} u_{2k−i+1}) = c u_i u_{2k−i+1} log(u_i^2/u_{2k−i+1}^2).   (3.20)

Hence, by (3.20) and the elementary inequality 2st log(s/t) ≥ s^2 − t^2 for 0 < s ≤ t, we have

(u_i u_{2k−i+2} − u_{i−1} u_{2k−i+1}) + (u_i u_{2k−i} − u_{i+1} u_{2k−i+1}) ≥ c(u_i^2 − u_{2k−i+1}^2)

for all i = 1, ..., k. The desired inequality is obtained by summing up both sides of the above inequality over all 1 ≤ i ≤ k.

For (b), assume that u_i ≥ u_{2k−i+1} for all 1 ≤ i ≤ k. For every i, set v_i = u_{2k−i+1}. Then our result follows by applying (a) to the vector v = (v_1, v_2, ..., v_{2k}).

Lemma 3.3. Consider the following k × k matrices:

A =

and

The proof of (b) is the same as that of (a), except that θ_l is replaced with 2lπ/(2k + 1).

Lemma 3.4. (a) Consider the following system of inequalities:

A_j − A_{j+1} ≥ 4t(A_1 + · · · + A_j), j = 1, ..., k − 1,
A_k ≥ 2t(A_1 + · · · + A_k).   (3.21)

If t < (1/2)(1 − cos(π/(2k))), then the system (3.21) has no solution (A_1, A_2, ..., A_k) with A_1 < 0.

(b) Consider the following system of inequalities:

A_j − A_{j+1} ≥ 4t(A_1 + · · · + A_j), j = 1, ..., k − 1,
A_k ≥ 4t(A_1 + · · · + A_k).   (3.22)

If t < (1/2)(1 − cos(π/(2k + 1))), then the system (3.22) has no solution (A_1, A_2, ..., A_k) with A_1 < 0.

Proof. For (a), let f_1(t) = 2 − 4t and g_1(t) = 4t. For every 1 ≤ l ≤ k − 1, put

f_{l+1}(t) = (1 − 4t)f_l(t) − g_l(t)   (3.23)

and

g_{l+1}(t) = 4t f_l(t) + g_l(t).   (3.24)

Clearly, (3.23) and (3.24) imply

g_{l+1}(t) − g_l(t) = 4t f_l(t) = f_l(t) − g_l(t) − f_{l+1}(t).

Hence we have f_l(t) = g_{l+1}(t) + f_{l+1}(t) for 1 ≤ l ≤ k − 1. By this fact, we obtain, for 2 ≤ l ≤ k − 1,

f_{l+1}(t) = (2 − 4t)f_l(t) − (f_l(t) + g_l(t)) = (2 − 4t)f_l(t) − f_{l−1}(t).

Note that

f_1(t) = 2 − 4t,  f_2(t) = (1 − 4t)f_1(t) − g_1(t) = (2 − 4t)^2 − 2,

and, therefore,

f_l(t) = det(M_l − 4tI_l), 1 ≤ l ≤ k,   (3.25)

where I_l is the l × l identity matrix and M_l is the l × l matrix of the same form as that in Lemma 3.3(a).

Assume that t < (1/2)(1 − cos(π/(2k))) and (A_1, A_2, ..., A_k) satisfies the system of inequalities (3.21). Since t < (1/2)(1 − cos(π/(2l))) for 1 ≤ l ≤ k, Lemma 3.3(a) and (3.25) imply that f_l(t) > 0 for all l = 1, 2, ..., k.

For 1 ≤ i ≤ k − 1, we have, by (3.21),

A_{k−i} − A_{k−i+1} ≥ 4t(A_1 + · · · + A_{k−i}).

We claim that

f_j(t) A_{k−j+1} ≥ g_j(t)(A_1 + · · · + A_{k−j}), for all 1 ≤ j ≤ k.   (3.26)

Clearly (3.26) holds for j = 1. Suppose that it also holds for some i with 1 ≤ i ≤ k − 1. Since f_i(t) > 0, we get

f_i(t) A_{k−i} = f_i(t)(A_{k−i} − A_{k−i+1}) + f_i(t) A_{k−i+1}
≥ (4t f_i(t) + g_i(t))(A_1 + · · · + A_{k−i})
= g_{i+1}(t)(A_1 + · · · + A_{k−i−1}) + (4t f_i(t) + g_i(t)) A_{k−i}.

The above inequality implies that (3.26) also holds for j = i + 1 and hence is true for 1 ≤ j ≤ k. Plugging j = k into (3.26) gives f_k(t) A_1 ≥ 0. Since f_k(t) > 0, we have A_1 ≥ 0. This proves part (a).

The same line of reasoning as above applies for part (b) and the proof goes word for word except the replacement of f1(t) with 1 − 4t.

Proof of Theorem 3.1. By Corollary 2.4, it suffices to show that there is no positive non-constant function u and 0 < β < λ_n/2 satisfying (2.4). We prove this fact by contradiction. Suppose the contrary, that is, (2.4) is satisfied for some β < λ_n/2 = (1/2)(1 − cos(2π/n)) and a positive non-constant function u. Without loss of generality, we assume ‖u‖_2 = 1, that is, Σ_i u(i)^2/n = 1. By Corollary 3.2, we may assume further that u = (x_1, x_3, ..., x_{n−1}, x_n, ..., x_4, x_2), where 0 < x_1 ≤ x_2 ≤ · · · ≤ x_n and x_1 < x_n. In this case, the Euler-Lagrange equation in (2.4) reads

2x_i − (x_i^{(1)} + x_i^{(2)}) = 2βx_i log(n x_i^2), 1 ≤ i ≤ n,

where x_i^{(1)} and x_i^{(2)} are the two nearest neighbors of x_i.

Recall the shifting operator σ defined in (3.16) and σ^j = σ(σ^{j−1}) for j ≥ 2. Note that we may write n = 4k or n = 4k + 2. For j = 1, ..., k, we have

σ^j(u) = (x_{2j}, ..., x_2, x_1, ..., x_{n−2j−1}, x_{n−2j+1}, ..., x_{n−1}, x_n, ..., x_{2j+2})

and

σ^{−j}(u) = (x_{2j+1}, ..., x_{n−1}, x_n, ..., x_{n−2j+2}, x_{n−2j}, ..., x_2, x_1, ..., x_{2j−1}).

By Lemma 3.1 and Lemma 3.2(a), we get

(x_{2j}^2 − x_{2j+2}^2 + x_{n−2j−1}^2 − x_{n−2j+1}^2)
≥ 2β[(x_2^2 + x_4^2 + · · · + x_{2j}^2 + x_1^2 + x_3^2 + · · · + x_{n−2j−1}^2)
− (x_{n−2j+1}^2 + x_{n−2j+3}^2 + · · · + x_{n−1}^2 + x_{2j+2}^2 + x_{2j+4}^2 + · · · + x_n^2)].

Similarly, Lemma 3.1 and Lemma 3.2(b) imply that

(x_{2j−1}^2 − x_{2j+1}^2 + x_{n−2j}^2 − x_{n−2j+2}^2)
≥ 2β[(x_1^2 + x_3^2 + · · · + x_{2j−1}^2 + x_2^2 + x_4^2 + · · · + x_{n−2j}^2)
− (x_{2j+1}^2 + x_{2j+3}^2 + · · · + x_{n−1}^2 + x_{n−2j+2}^2 + x_{n−2j+4}^2 + · · · + x_n^2)].

Note that n − 2j − 1 ≥ 2j + 1 and n − 2j ≥ 2j + 2 for 1 ≤ j ≤ k. Summing up the above two inequalities gives

(x_{2j−1}^2 + x_{2j}^2 − x_{2j+1}^2 − x_{2j+2}^2) + (x_{n−2j−1}^2 + x_{n−2j}^2 − x_{n−2j+1}^2 − x_{n−2j+2}^2)
≥ 4β[(x_1^2 + x_2^2 + · · · + x_{2j}^2) − (x_{n−2j+1}^2 + x_{n−2j+2}^2 + · · · + x_n^2)].

Letting A_i = x_{2i−1}^2 + x_{2i}^2 − x_{n−2i+1}^2 − x_{n−2i+2}^2 for 1 ≤ i ≤ k implies, for n = 4k,

A_j − A_{j+1} ≥ 4β(A_1 + A_2 + · · · + A_j), j = 1, ..., k − 1,
A_k ≥ 2β(A_1 + A_2 + · · · + A_k),

and, for n = 4k + 2,

A_j − A_{j+1} ≥ 4β(A_1 + A_2 + · · · + A_j), j = 1, ..., k − 1,
A_k ≥ 4β(A_1 + A_2 + · · · + A_k).

Note that β < (1/2)(1 − cos(2π/n)) and A_1 = x_1^2 + x_2^2 − x_{n−1}^2 − x_n^2 ≤ x_1^2 − x_n^2 < 0. This contradicts Lemma 3.4.
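Theorem 3.1 asserts that E(f, f)/L(f) ≥ λ_n/2 for every positive non-constant f on an even cycle, the infimum being approached only by near-constant functions. Both points can be probed numerically; the sketch below (illustration only, not part of the proof) checks a random sample and a small perturbation of the constant function along a spectral-gap eigenfunction.

```python
import math
import random

def ratio(x):
    # E(f,f)/L(f) for f = (x_1, ..., x_n) on the n-cycle, Eqs. (3.1)-(3.2)
    n = len(x)
    e = sum((x[i] - x[(i + 1) % n]) ** 2 for i in range(n)) / (2 * n)
    norm2 = sum(t * t for t in x) / n
    l = sum(t * t * math.log(t * t / norm2) for t in x) / n
    return e / l

rng = random.Random(2)
for n in (4, 6, 8):
    lam = 1 - math.cos(2 * math.pi / n)
    # random positive test functions never beat the bound E/L >= lam/2 ...
    for _ in range(2000):
        x = [rng.uniform(0.1, 2.0) for _ in range(n)]
        assert ratio(x) >= lam / 2 - 1e-9
    # ... while small perturbations of the constant function along a
    # spectral-gap eigenfunction approach lam/2, so the bound is sharp
    eps = 1e-4
    x = [1 + eps * math.cos(2 * math.pi * (i + 0.5) / n) for i in range(n)]
    assert abs(ratio(x) - lam / 2) < 1e-3
```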

3.1.3 An application: collapse of cycles and product of sticks

In this section, we discuss some applications of Theorem 3.1. This is a joint work with Laurent Saloff-Coste and Wai-Wai Liu in [5]. We first consider the following two ways of collapsing even cycles.

1. Collapsing the 2n-cycle to the n-stick with loops at the ends. Fix n ≥ 2 and let K1 and K2 be Markov kernels on Zn and Z2n defined by

K1(0, 0) = K1(n − 1, n − 1) = K1(i, i + 1) = K1(i + 1, i) = 1/2, for all i = 0, ..., n − 2, and

K2(i, i + 1) = K2(i, i − 1) = 1/2, ∀i = 0, ..., 2n − 1.

Let p : Z_{2n} → Z_n be the surjective map defined by p(i) = p(2n − 1 − i) = i for i = 0, ..., n − 1. A simple computation (checking the requirement in Proposition 2.5) shows that the Markov kernel K_2 collapses to K_1 via the projection p. See Figure 3.1.

Figure 3.1: The 14-cycle collapses to the 7-stick with loops at the ends. All edges have weight 1/2.

The kernel K_2 has the eigenvalue cos(π/n) (that is, 1 minus its spectral gap) with multiplicity 2, and the two-dimensional eigenspace contains the function f(x) = cos((π/n)(x + 1/2)), which has the property f(x) = f(2n − 1 − x). Hence f projects under p to an eigenfunction of K_1, and K_1 and K_2 have the same spectral gap.
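The collapsing condition of Proposition 2.5 can be verified mechanically for this example: the K_2-probability of jumping into a fibre p^{-1}(s) must depend on the starting point x only through p(x), and must equal K_1(p(x), s). A sketch for the 14-cycle case of Figure 3.1 (the function names are ad hoc):

```python
n = 7  # collapse the 2n = 14 cycle onto the n = 7 stick

def K2(x, y):
    # simple random walk on the 2n-cycle
    return 0.5 if (y - x) % (2 * n) in (1, 2 * n - 1) else 0.0

def K1(a, b):
    # n-stick with holding probability 1/2 at both ends
    if abs(a - b) == 1:
        return 0.5
    if a == b and a in (0, n - 1):
        return 0.5
    return 0.0

def p(i):
    # p(i) = p(2n - 1 - i) = i for i = 0, ..., n - 1
    return min(i, 2 * n - 1 - i)

# lumpability: the mass K2 sends from x into the fibre of s depends on x
# only through p(x), and agrees with K1(p(x), s)
for x in range(2 * n):
    for s in range(n):
        mass = sum(K2(x, y) for y in range(2 * n) if p(y) == s)
        assert abs(mass - K1(p(x), s)) < 1e-12
```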

2. Collapsing the 2n-cycle to the n + 1-stick with reflecting barriers. Fix n ≥ 2 and let K2 be the simple random walk on Z2n. Consider a Markov kernel K1

on Zn+1 given by K1(0, 1) = K1(n, n − 1) = 1 and K1(i, i + 1) = K1(i, i − 1) = 1/2 for all 1 ≤ i ≤ n − 1. Let p : Z2n → Zn+1 be a map defined by p(i) = p(2n − i) = i for 0 ≤ i ≤ n. Then K1 is obtained by collapsing K2 through the projection p.

See Figure 3.2.

Figure 3.2: The 14-cycle collapses to an 8-stick with reflecting barriers. All edges have weight 1/2 except those marked, which have weight 1.


It is easy to check that f(x) = cos(πx/n) is an eigenvector of K_2 with corresponding eigenvalue cos(π/n). The same line of reasoning as in case 1 implies that both K_1 and K_2 have the same spectral gap. Then, by Proposition 2.5 and Theorem 3.1, we have the following proposition.

Proposition 3.3. Let n ≥ 2 and K be a Markov kernel on Z_n defined by K(0, 1) = K(n − 1, n − 2) = 1 and K(i, i + 1) = K(i, i − 1) = 1/2 for all 1 ≤ i ≤ n − 2. Then the spectral gap λ and the logarithmic Sobolev constant α are given by 2α = λ = 1 − cos(π/(n − 1)).

Proof. Note that the case n = 2 is part of the result in Corollary 2.7, and the case n > 2 is given by the discussion preceding this proposition.

3. Product of sticks. In this case, we consider an application of Proposition 3.2. Fix d ≥ 1 and let b = (b1, ..., bd) be an integer vector, where bi ≥ 2 for all

1 ≤ i ≤ d. In Zd with basis {e1, ..., ed}, consider a rectangular box

R_b = {x = (x_1, ..., x_d) ∈ Z^d : x_i ∈ {1, ..., b_i}, 1 ≤ i ≤ d}.   (3.27)

The first application deals with a Markov kernel K on R_b given by (3.28) and (3.29).

Figure 3.3: All edges have weight 1/4 except the corner loops, which have weight 1/2. The stationary measure is uniform.

By Proposition 2.6, one may generalize Proposition 3.2 as follows.

Theorem 3.2. Let d ≥ 1 be an integer and b = (b_1, ..., b_d) be an integer vector with 2 ≤ b_1 ≤ · · · ≤ b_d. Let R_b be the subset of Z^d defined in (3.27) and K be the Markov kernel on R_b given by (3.28) and (3.29). Then the spectral gap λ and the logarithmic Sobolev constant α of K satisfy

2α = λ = (1 − cos(π/b_d))/d.

3.2 The simple random walk on the 5-cycle

Referring to Theorem 3.1 and Corollary 2.7, the logarithmic Sobolev constant for the simple random walk on an even cycle is half the spectral gap, but this is not true for the simple random walk on the 3-cycle. It is not known how the spectral gap and the logarithmic Sobolev constant are related when the simple random walk is considered on an odd cycle. A numerical result for the n-cycles with n = 5, 7 and 9 suggests that the logarithmic Sobolev constant should be half the spectral gap. However, a mathematical proof for general odd n is not yet available.

The goal of this section is to establish the identity α = λ/2 for the case n = 5; the proof presented here was developed jointly by Wai-Wai Liu, Laurent Saloff-Coste and the author of this dissertation.

Theorem 3.3. Let K be the Markov kernel of the simple random walk on the 5-cycle and λ and α be the spectral gap and the logarithmic Sobolev constant of K.

Then 2α = λ = 1 − cos(2π/5).

Remark 3.1. In this section, what will be proved is a result stronger than the above theorem, namely that E(f, f) ≥ (λ/2)L(f) for all functions f, with equality if and only if f is constant.

Before proving this theorem, we consider the following application.

Corollary 3.3. Let K̃ be a Markov kernel on Z_3 given by

K̃(0, 0) = K̃(0, 1) = K̃(1, 2) = K̃(1, 0) = 1/2,  K̃(2, 1) = 1,

and let λ̃ and α̃ be the spectral gap and logarithmic Sobolev constant of K̃. Then 2α̃ = λ̃ = 1 − cos(2π/5).

Proof. Let K be the Markov kernel of the simple random walk on the 5-cycle with spectral gap λ and logarithmic Sobolev constant α. Consider the map p : Z_5 → Z_3 defined by p(i) = p(4 − i) = i for i = 0, 1, 2. It is clear that K collapses to K̃ through p. See Figure 3.4.

Figure 3.4: The 5-cycle collapses to the 3-point stick with a loop at one end. All edges have weight 1/2 except where marked otherwise.

The kernel K has an eigenfunction f satisfying f(i) = f(4 − i) and corresponding to the eigenvalue 1 − λ. It is easy to see that the function f|_{0,1,2} is also an eigenfunction of K̃. Thus λ = λ̃ = 1 − cos(2π/5), and the identity α̃ = λ̃/2 is then proved by Theorem 3.3 and Proposition 2.5.

To prove Theorem 3.3, we need the following two lemmas.

Lemma 3.5. Consider the function g_β(t) = 2t − 4βt log t for t > 0, with g_β(0) = 0. Then, for every β ≥ 0 and 0 ≤ s ≤ t,

g_β(t) − g_β(s) ≥ (t − s)(2 − 4β − 4β log((t + s)/2)).

Proof. Fix t > 0, β ≥ 0 and let h be the function on [0, t] defined by

h(s) = g_β(t) − g_β(s) − (t − s)(2 − 4β − 4β log((t + s)/2)).

Then the first derivative of h is given by

h′(s) = 4β[log(2s/(t + s)) + 1 − 2s/(t + s)],

which is negative for 0 < s < t (using log r < r − 1 for r = 2s/(t + s) < 1), so that h is strictly decreasing in [0, t]; this proves the lemma since h(t) = 0.

Lemma 3.6. For β > 0, let g_β be the function defined in Lemma 3.5, let D_β be the region defined in (3.30), and for (s, t) ∈ D_β set

F_β(s, t) = g_β(g_β(t) − s) − g_β(g_β(s) − t) − (t − s).

Then, for 0 < β ≤ (1/2)(1 − cos(2π/5)) and (s, t) ∈ D_β with s < t, one has F_β(s, t) > 0.

Remark 3.2. Note that F_β(s, t) is well-defined on D_β since one has, by Lemma 3.5,

(g_β(t) − s) − (g_β(s) − t) = g_β(t) − g_β(s) + (t − s) ≥ (t − s)(3 − 4β − 4βf_1(s, t)),

where f_1(s, t) = log((t + s)/2). This implies, by using Lemma 3.5 twice,

F_β(s, t) > [g_β(t) − g_β(s) + t − s][2 − 4β − 4βf_2(s, t)] − (t − s)
> (t − s){[3 − 4β − 4βf_1(s, t)][2 − 4β − 4βf_2(s, t)] − 1},   (3.31)

where

f_2(s, t) = log((g_β(t) + g_β(s) − (t + s))/2) = log(((t + s) − 4β(t log t + s log s))/2).

Note that the second inequality in (3.31) uses the convexity of the map u ↦ u log u for u > 0 to get

f_2(s, t) < log((t + s)/2) + log(1 − 4β log((t + s)/2)),   (3.32)

and then applies the fact that r − 4βr log r < e^{(1−2β)/(2β)} for all r > 0 and 0 < β ≤ (1/2)(1 − cos(2π/5)). A simple computation shows that

(2 − 4β)(3 − 4β) − 1 = 16β^2 − 20β + 5 ≥ 0

for 0 ≤ β ≤ (1/2)(1 − cos(2π/5)). To finish the proof, it suffices to show that

(2 − 4β)f_1(s, t) + [3 − 4β − 4βf_1(s, t)] f_2(s, t) ≤ 0.

Since (t + s)/2 < 1, it remains to prove, by using (3.32), that

h(x) = (2 − 4β)x + (3 − 4β − 4βx)[x + log(1 − 4βx)] < 0, for all x < 0.

Taking the first derivative of h, we get

h′(x) = 5 − 12β + 4β(4β − 2)/(1 − 4βx) − 4β[2x + log(1 − 4βx)]
> 5 − 12β + 4β(4β − 2) = 16β^2 − 20β + 5 ≥ 0,

where the first inequality follows from the facts that the mapping x ↦ 4β(4β − 2)/(1 − 4βx) is decreasing for x < 0 and that

2x + log(1 − 4βx) ≤ 2x(1 − 2β) < 0 for all x < 0.

Therefore, h is strictly increasing. In addition to the fact h(0) = 0, we get h(x) < 0 for x < 0.

Proof of Theorem 3.3. By Proposition 2.2, one always has 0 < α ≤ λ/2 with λ = 1 − cos(2π/5). We prove this theorem by showing that there is no nonconstant solution u of the Euler-Lagrange equation

2αu log(u/‖u‖_2) = (I − K)u.

Assume the contrary, that is, the above equation is satisfied by a nonconstant u whose entries are 0 < x_0 ≤ x_1 ≤ x_2 ≤ x_3 ≤ x_4. There is no loss of generality in assuming that ‖u‖_2 = 1, or equivalently, x_0^2 + x_1^2 + · · · + x_4^2 = 5. By Corollary 3.2, we may assume further that u = (x_0, x_2, x_4, x_3, x_1). In the above setting, the minimizing equation becomes

x_1 + x_2 = g_α(x_0), x_0 + x_3 = g_α(x_1), x_0 + x_4 = g_α(x_2), x_1 + x_4 = g_α(x_3), x_2 + x_3 = g_α(x_4),   (3.33)

where g_α(x) = 2x − 4αx log x.

Note that the assumption that u is nonconstant gives x_0 < x_4, and the normalization of u implies x_0 < 1. Since g_α is a concave function with derivative g_α′(1) = 2 − 4α > 0, we have g_α(x) ∈ (0, 2) for x ∈ (0, 1). On one hand, by this observation, the equality x_1 + x_2 = g_α(x_0) implies x_1 < 1, and then the identity x_0 + x_3 = g_α(x_1) implies x_0 + x_3 ≤ 2. On the other hand, by (3.33), one can obtain the following equation:

F_α(x_0, x_2) = g_α(g_α(x_2) − x_0) − g_α(g_α(x_0) − x_2) − (x_2 − x_0) = 0.

Since u is a solution of (3.33), it is clear that (x_0, x_2) ∈ D_α, the region defined in (3.30). Thus, by Lemma 3.6, we have x_0 = x_1 = x_2 and, by the first equality of (3.33), we get x_1 = 1. This contradicts x_0 < 1.

Since there is no nonconstant solution of the equation (2.3) with 0 < α ≤ λ/2, Proposition 2.3 implies that 2α = λ.
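The stronger statement of Remark 3.1 can be probed numerically for n = 5: a random search over positive functions on the 5-cycle finds no ratio E(f, f)/L(f) below λ/2. A sketch (illustration only, not a proof):

```python
import math
import random

def ratio5(x):
    # E(f,f)/L(f) on the 5-cycle, using Eqs. (3.1)-(3.2) with n = 5
    e = sum((x[i] - x[(i + 1) % 5]) ** 2 for i in range(5)) / 10
    norm2 = sum(t * t for t in x) / 5
    l = sum(t * t * math.log(t * t / norm2) for t in x) / 5
    return e / l

lam = 1 - math.cos(2 * math.pi / 5)
rng = random.Random(4)
for _ in range(20000):
    x = [rng.uniform(0.05, 3.0) for _ in range(5)]
    assert ratio5(x) >= lam / 2 - 1e-9
```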

3.3 Some other 3-points chains

By collapsing the 4-, 5- and 6-cycles, we have obtained in Sections 3.1.3 and 3.2 the equality α = λ/2 for the three chains on the 3-point stick described in Figure 3.5.

Figure 3.5: Three chains on the 3-point stick. All edges have weight 1/2 except when marked otherwise. In all cases α = λ/2.

The first two theorems in this section concern the variants (depending on a parameter p ∈ [0, 1)) described in Figure 3.6.

Figure 3.6: The families of Theorems 3.4 and 3.5, p ∈ [0, 1).

{1, 2, 3} defined by

Remark 3.3. Both Kp in Theorem 3.4 and Kp0 in Theorem 3.5 are reversible with respect to their stationary distributions.

To prove the above two theorems, we need the following elementary lemma.

Lemma 3.7. Consider the continuous function u : [0, ∞) → R defined by

u(s) = s log s for s > 0,  u(0) = 0.   (3.34)

The function u has the following properties:

∀ t ∈ [0, ∞), u(t) ≥ t − 1.   (3.35)

∀ s, t ∈ [0, ∞) with s ≤ t and s + t ≤ 2, u(t) − u(s) ≤ t − s.   (3.36)

∀ s, t ∈ [1, ∞) with s ≤ t, u(t) − u(s) ≥ t − s.   (3.37)

Proof. The function s 7→ s log s − s + 1 has derivative s 7→ log s on (0, ∞). Hence it attains its minimum at s = 1. As the value at s = 1 is 0, (3.35) follows.

To prove (3.36), fix s ≥ 0 and set, for t ≥ s,

g(t) = u(t) − u(s) − (t − s)u′((t + s)/2)
= t log t − s log s − (t − s)(1 + log((t + s)/2)).

Compute the derivatives

g′(t) = log(2t/(t + s)) − (t − s)/(t + s),  g″(t) = s(s − t)/(t(t + s)^2).

It follows that g is non-increasing on [s, ∞). Hence g(t) ≤ g(s) = 0 on [s, ∞), that is,

u(t) − u(s) ≤ (t − s)(1 + log((t + s)/2)).

The inequality (3.36) obviously follows when s + t ≤ 2.

Finally, (3.37) follows from the Mean Value Theorem applied to the function u since u0 ≥ 1 on [1, ∞).

Proof of Theorem 3.4. First observe that an easy computation gives λ_p = 1 − p. By Corollary 2.4, it suffices to show that for β < λ_p/2, the system (2.4) has no non-constant positive solution u = (a, b, c). Suppose the contrary. By symmetry, we can assume that a ≥ c. There is no loss of generality in assuming further the normalization

a^2 + (2 − 2p)b^2 + c^2 = 4 − 2p.   (3.38)

Then (2.4) is equivalent to (using the function u defined at (3.34))

(2β/(1 − p)) u(a) = a − b   (3.39)
4β u(b) = 2b − a − c   (3.40)
(2β/(1 − p)) u(c) = c − b.   (3.41)

We prove by considering two subcases, a > c and a = c.

Case 1: a > c. Subtract (3.41) from (3.39) to obtain

u(a) − u(c) = ((1 − p)/(2β))(a − c) > a − c,   (3.42)

since, by hypothesis, 2β < 1 − p; by (3.36), this forces a + c > 2, and then (3.38) gives b < 1.   (3.43)

Now, add (3.39) divided by a to (3.41) divided by c and subtract (3.40) divided by b to obtain (3.44), with k = 2β/(1 − p), which, by hypothesis, is less than 1. Hence h is increasing. The left-hand side of (3.44) is negative since b < 1 by (3.43). Hence h(a/b) − h(b/c) < 0 and thus a/b < b/c or, equivalently,

ac < b^2 < 1.

By (3.38) and (3.42), we have

4 − 2p = a^2 + 2(1 − p)b^2 + c^2 > a^2 + 2(1 − p)ac + c^2
= (a + c)^2 − 2pac > 4 − 2pac > 4 − 2p,

a contradiction; here (a + c)^2 > 4 follows from (3.42) and (3.36), and the last inequality uses ac < 1.

Case 2: a = c. In this case one deduces that b = 1, which contradicts the assumption that u is nonconstant.

Hence, we must have α_p = λ_p/2 = (1 − p)/2.

Proof of Theorem 3.5. Referring to the family of chains in Theorem 3.5, the facts that α_p = λ_p/2 when p = 0 and p = 1/2 are contained respectively in Theorem 3.4 and in Corollary 3.3. To prove α_p < λ_p/2 when p ≠ 0, 1/2, we use the criterion contained in Proposition 2.2. A simple computation yields

λ_p = (3 − p − √(1 + p^2(−1 + 6p − 4p^2)))/2.

By both observations, it is easy to see that

3 − 3p + 6p^2 − 4p^3 + √(1 + p^2(−1 + 6p − 4p^2)) > 0, for all p ∈ (0, 1),

which implies µ_p(ψ^3) ≠ 0 unless p = 0 or p = 1/2. By Proposition 2.2, we must have α_p < λ_p/2 for p ≠ 0, 1/2.

We end this section with the study of one of the most natural chains on a 3-point stick, where transitions are to the left with probability q = 1 − p and to the right with probability p.

Theorem 3.6. For 0 < p < 1, set q = 1 − p. Let K_p : {1, 2, 3} × {1, 2, 3} → [0, 1] be the Markov kernel defined by

K_p =
( q p 0 )
( q 0 p )
( 0 q p )

with stationary distribution

µ_p = (c_p, c_p(p/q), c_p(p/q)^2),  c_p = (1 + (p/q) + (p/q)^2)^{−1}.

Then the spectral gap λ_p and the logarithmic Sobolev constant α_p are given by

λ_p = 1 − √(pq),  α_p = (p − q)/(2(log p − log q)).

In particular, a minimizer of α_p is ψ = (p/q, 1, q/p).
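The statements of Theorem 3.6 can be verified numerically for particular values of p: stationarity of µ_p; the eigenvalues 1 and ±√(pq) (via the trace and determinant of K_p, since K_p is row-stochastic and thus has 1 as an eigenvalue); and the value of the ratio E_p(ψ, ψ)/L_{µ_p}(ψ) at the test function ψ. A sketch (illustration only):

```python
import math

def check(p):
    q = 1 - p
    K = [[q, p, 0.0],
         [q, 0.0, p],
         [0.0, q, p]]
    c = 1 / (1 + p / q + (p / q) ** 2)
    mu = [c, c * (p / q), c * (p / q) ** 2]
    # mu is stationary for K
    for j in range(3):
        assert abs(sum(mu[i] * K[i][j] for i in range(3)) - mu[j]) < 1e-12
    # 1 is an eigenvalue; trace 1 and det -pq then force the other two
    # eigenvalues to be +/- sqrt(pq), giving the gap 1 - sqrt(pq)
    tr = K[0][0] + K[1][1] + K[2][2]
    det = (K[0][0] * (K[1][1] * K[2][2] - K[1][2] * K[2][1])
           - K[0][1] * (K[1][0] * K[2][2] - K[1][2] * K[2][0])
           + K[0][2] * (K[1][0] * K[2][1] - K[1][1] * K[2][0]))
    assert abs(tr - 1.0) < 1e-12 and abs(det + p * q) < 1e-12
    # the test function psi = (p/q, 1, q/p) attains (p-q)/(2 log(p/q))
    psi = [p / q, 1.0, q / p]
    E = 0.5 * sum(mu[i] * K[i][j] * (psi[i] - psi[j]) ** 2
                  for i in range(3) for j in range(3))
    n2 = sum(mu[i] * psi[i] ** 2 for i in range(3))
    L = sum(mu[i] * psi[i] ** 2 * math.log(psi[i] ** 2 / n2) for i in range(3))
    alpha = (p - q) / (2 * math.log(p / q))
    assert abs(E / L - alpha) < 1e-9
    assert alpha <= (1 - math.sqrt(p * q)) / 2 + 1e-12  # alpha_p <= lambda_p/2

for p in (0.2, 0.3, 0.45, 0.7):
    check(p)
```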

Remark 3.4. Let p ∈ (0, 1) and let α_p and λ_p be as in Theorem 3.6. Recall from Theorem 2.3 that α_p ≤ 1/4 for p ∈ (0, 1), with equality only if p = 1/2. A simple computation shows that λ_p ≥ 1/2, with equality only if p = 1/2. Combining both bounds, we have α_p ≤ λ_p/2, with equality only if p = 1/2.

Proof of Theorem 3.6. Since K_p is reversible, the spectral gap is obtained by a direct computation of the eigenvalues of K_p. For the logarithmic Sobolev constant, we compare this chain with another 3-point chain K̃_p. By Proposition 2.4, it follows that

α_p ≥ α̃_p.   (3.46)

Next, on {0, 1}2, we consider the product chain (with weights (1/2, 1/2)) of two copies of 2-point asymmetric chain in Theorem 2.3. This product chain has transitions given by

K((0, 0), (0, 0)) = q, K((1, 1), (1, 1)) = p, K((0, 0), (0, 1)) = K((0, 0), (1, 0)) = p/2, K((1, 0), (1, 1)) = K((0, 1), (1, 1)) = p/2,

and

K((1, 1), (0, 1)) = K((1, 1), (1, 0)) = q/2, K((0, 1), (0, 0)) = K((1, 0), (0, 0)) = q/2, K((0, 1), (0, 1)) = K((1, 0), (1, 0)) = 1/2.

By Proposition 2.6 and Theorem 2.3, its logarithmic Sobolev constant is (p − q)/(2 log(p/q)). This chain projects to the 3-point space {1, 2, 3} using the map

p : {0, 1}2 → {1, 2, 3}, (x, y) 7→ 1 + |x| + |y|

and the projected chain is K̃_p. Hence, by Proposition 2.5 and (3.46), we get

α_p ≥ α̃_p ≥ (p − q)/(2(log p − log q)).   (3.47)

To show that this is in fact an equality, it suffices to find a good test function. Letting ψ = (p/q, 1, q/p) yields

α_p ≤ E_p(ψ, ψ)/L_{µ_p}(ψ) = (p − q)/(2(log p − log q)).

Thus α_p = (p − q)/(2(log p − log q)).

Remark 3.5. Fix p ∈ (0, 1) and let K and K_p be the Markov kernels in the proof of Theorem 3.6. As the proof shows, K collapses to K_p and the logarithmic Sobolev constant of K_p is the same as that of K. However, the spectral gap of K_p, which is equal to 1 − √(pq), is not the same as the spectral gap of K, which is equal to 1/2. The main reason is that the eigenfunction of K corresponding to the eigenvalue 1/2 takes different values at (0, 1) and (1, 0) if p ≠ 1/2. This makes the projection p fail to collapse the eigenfunction onto the three-point space {1, 2, 3}.

The following corollary is an observation based on the inequality (3.47) obtained in the proof of Theorem 3.6.

Corollary 3.4. Let p ∈ (0, 1) and set q = 1 − p. Consider the Markov kernels K_p and K̃_p from the proof of Theorem 3.6, and let α_p and α̃_p be their logarithmic Sobolev constants. Then

α_p = α̃_p = (p − q)/(2 log(p/q)).

In particular, ψ = (p/q, 1, q/p) is a minimizer for both constants.

Proof. By (3.47) and Theorem 3.6, it remains to show that ψ is a minimizer of α̃_p. By (2.6), the fact (3.45) derived in the proof of Theorem 3.6 implies

Ẽ_p(ψ, ψ) ≤ (c̃_p/c_p) E_p(ψ, ψ),  (c̃_p/c_p) L_{µ_p}(ψ) ≤ L_{µ̃_p}(ψ).

Since ψ is not constant, taking the ratio of the Dirichlet form to the entropy implies

α_p = α̃_p ≤ Ẽ_p(ψ, ψ)/L_{µ̃_p}(ψ) ≤ E_p(ψ, ψ)/L_{µ_p}(ψ) = α_p,

so equality holds throughout and ψ is a minimizer.

Remark 3.6. Both K_p and K̃_p in Corollary 3.4 are reversible, and the spectral gap λ̃_p of K̃_p is equal to 1/2. Let α̃_p be the logarithmic Sobolev constant of K̃_p. By Corollary 3.4 and Theorem 2.3, α̃_p ≤ λ̃_p/2, with equality only if p = 1/2.

Appendix A
